Monday, October 10, 2016

Understanding Deep Learning as a Stack of Logistic Regression Models

So, I had an interesting realization today. I sat down to implement a multi-class classification system (the details of which shall remain classified). I was working with text data, and there was no way to map it directly to one of the target classes. So I decided to build a series of classifiers: the first one classifies at a broad level, and each later one drills down to a more specific set of classes. Essentially it was like a chain of UNIX shell pipes: you take the output of one classifier, feed it to the next, and so on. First I detect one of the broad classes, then move to more specific ones, until I reach one of the leaf nodes of the class tree.
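To make the pipe analogy concrete, here is a minimal sketch of such a chain. It is not the actual system (that one stays classified): the class names, the keyword lists, and the helpers `keyword_classifier` and `classify_cascade` are all made up for illustration, and simple keyword counting stands in for whatever real classifier would sit at each node.

```python
from typing import Callable, Dict, List

# A classifier is just a function from text to a label.
Classifier = Callable[[str], str]


def keyword_classifier(keywords: Dict[str, List[str]], default: str) -> Classifier:
    """Build a toy classifier that picks the label with the most keyword hits.

    Hypothetical stand-in for a trained model at one node of the class tree.
    """
    def classify(text: str) -> str:
        words = text.lower().split()
        scores = {
            label: sum(words.count(k) for k in kws)
            for label, kws in keywords.items()
        }
        best = max(scores, key=scores.get)
        # Fall back to the default label when no keyword matched at all.
        return best if scores[best] > 0 else default
    return classify


# One classifier per internal node of the class tree (illustrative classes).
# Any label that has no entry here is treated as a leaf.
CASCADE: Dict[str, Classifier] = {
    "root": keyword_classifier(
        {"sports": ["match", "goal", "wicket"], "tech": ["code", "server", "bug"]},
        default="tech",
    ),
    "sports": keyword_classifier(
        {"football": ["goal", "penalty"], "cricket": ["wicket", "innings"]},
        default="football",
    ),
    "tech": keyword_classifier(
        {"programming": ["code", "bug"], "ops": ["server", "deploy"]},
        default="programming",
    ),
}


def classify_cascade(text: str, cascade: Dict[str, Classifier] = CASCADE,
                     start: str = "root") -> List[str]:
    """Feed the text through one classifier after another, broad to specific.

    Returns the path of labels from the broadest class down to the leaf.
    """
    path: List[str] = []
    node = start
    while node in cascade:          # stop once we reach a leaf label
        node = cascade[node](text)  # output of one stage selects the next stage
        path.append(node)
    return path


if __name__ == "__main__":
    print(classify_cascade("late goal wins the match"))
    print(classify_cascade("fixed a bug in the code"))
```

The only real design choice here is that each stage's output picks which classifier runs next, so a mistake at a broad level cannot be undone further down the chain.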

After I was done, I realized that the deep, layered neural networks in vogue these days essentially do the same thing for you automatically. For example, a deep convolutional network for face recognition detects edges in its early layers, then moves on to contours and curves, and then to progressively more complex features in the later layers. It all makes sense now :-D
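The "stack of logistic regressions" picture can be written down in a few lines. The sketch below is only an illustration under some loud assumptions: the weights are hand-picked rather than learned, and the toy task (XOR) is my choice, picked because a single logistic regression cannot solve it but two stacked layers can. A real deep network would learn such intermediate features from data instead.

```python
import math
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logistic_layer(x: Sequence[float], weights: List[List[float]],
                   biases: List[float]) -> List[float]:
    """Each weight row plus its bias is one logistic regression unit."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x: Sequence[float], layers: List[Layer]) -> List[float]:
    """Pipe the outputs of one layer into the next, like the classifier chain."""
    out = list(x)
    for weights, biases in layers:
        out = logistic_layer(out, weights, biases)
    return out


# Hand-picked (not learned) weights, assumed purely for illustration.
XOR_LAYERS: List[Layer] = [
    # Hidden layer: unit 1 acts like OR, unit 2 acts like NAND.
    ([[20.0, 20.0], [-20.0, -20.0]], [-10.0, 30.0]),
    # Output layer: acts like AND over the two hidden features.
    ([[20.0, 20.0]], [-30.0]),
]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            p = forward([a, b], XOR_LAYERS)[0]
            print(a, b, round(p, 3))
```

The hidden layer plays the role of the broad, intermediate classifiers, and the output layer combines their results into the final decision.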

Moral of the story: if you have a ton of data, just give it to a deep neural network and it will do most of the feature engineering for you. If you don't have enough data, you need to do the feature engineering by hand and build a stack of classifiers, like I had to do.
