There are amazing introductions, courses and blog posts on Deep Learning. I will name some of them in the resources sections, but this is a different kind of introduction: a weird introduction. But why weird? Maybe because it won’t follow the “normal” structure of a Deep Learning post, where you start with the math, then go into the papers, the implementation and then to applications. I think telling a story can be much more helpful than just throwing information and formulas everywhere. So let’s begin.
Deep Learning Timeline
Deep Learning (DL) is such an important field for Data Science, AI, technology and our lives right now, and it deserves all of the attention it is getting. Please don’t say that deep learning is just adding a layer to a neural net and that’s it, magic! I’m hoping that after reading this you’ll have a different perspective on what DL is.
I created this timeline based on several papers and other resources with the purpose of showing that Deep Learning is much more than just Neural Networks. There have been real theoretical advances, as well as software and hardware improvements, that were necessary for us to get to this day.
Breakthroughs of Deep Learning and Representation Learning
Let’s start by defining the word learning. In the context of Machine Learning, the word “learning” describes an automatic search process for better representations of the data you are analyzing and studying (please keep this in mind: it is not about making a computer learn the way a person does). And what is a representation? It’s a way to look at data.
This is very important to keep in mind: deep learning is representation learning. It uses different kinds of neural networks and optimizes the parameters and hyperparameters of the net to get (learn) the best representation for our data. This wouldn’t be possible without the amazing breakthroughs that led us to the current state of Deep Learning. Here I name some of them:
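To make "a way to look at data" concrete, here is a minimal sketch using made-up toy data: two classes of points placed on concentric rings. The ring radii, the helper names and the threshold test are all assumptions chosen for illustration. Here the better representation (polar coordinates) is picked by hand; the point of deep learning is that this search is done automatically.

```python
import math

def ring(radius, n):
    """Return n evenly spaced (x, y) points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Toy data (invented for this example): class 0 on a small ring, class 1 on a large one.
inner = ring(1.0, 8)
outer = ring(3.0, 8)

def to_polar(point):
    """A second representation of the same point: (radius, angle)."""
    x, y = point
    return (math.hypot(x, y), math.atan2(y, x))

def separable_by_threshold(values_a, values_b):
    """True if a single cut on this one feature splits the two classes."""
    return max(values_a) < min(values_b) or max(values_b) < min(values_a)

# Looking at the raw x coordinate, the classes overlap.
x_separable = separable_by_threshold([p[0] for p in inner],
                                     [p[0] for p in outer])

# Looking at the radius, one threshold separates them.
r_separable = separable_by_threshold([to_polar(p)[0] for p in inner],
                                     [to_polar(p)[0] for p in outer])

print("separable using x:", x_separable)
print("separable using r:", r_separable)
```

The data did not change between the two checks, only the way of looking at it.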
Concept # 1: Back Propagation.
Concept # 2: Better initialization of the parameters of the nets. Something to remember: The initialization strategy should be selected according to the activation function used (see next concept).
Concept # 3: Better activation functions. This means better ways of approximating functions, leading to a faster training process.
Concept # 4: Dropout. Better ways of preventing overfitting and more.
Concept # 5: Convolutional Neural Nets (CNNs).
Concept # 6: Residual Nets (ResNets).
Concept # 7: Region Based CNNs. Used for object detection and more.
Concept # 8: Recurrent Neural Networks (RNNs) and LSTMs.
BTW: Liao and Poggio (2016) showed that a ResNet whose layers share weights is equivalent to a particular kind of shallow RNN, arXiv:1604.03640v1.
Concept # 9: Generative Adversarial Networks (GANs).
Concept # 10: Capsule Networks.
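As a small illustration of the remark in Concept # 2 that the initialization strategy should match the activation function, here is a sketch of two widely used schemes: He initialization (commonly paired with ReLU) and Glorot/Xavier initialization (commonly paired with tanh or sigmoid). The function name init_weights and its arguments are hypothetical, and the pairing shown is a common convention rather than the only valid choice.

```python
import math
import random

def init_weights(fan_in, fan_out, activation, rng):
    """Return a fan_in x fan_out weight matrix drawn from a zero-mean normal.

    The standard deviation depends on the activation that follows the layer.
    """
    if activation == "relu":
        # He normal: std = sqrt(2 / fan_in)
        std = math.sqrt(2.0 / fan_in)
    elif activation in ("tanh", "sigmoid"):
        # Glorot (Xavier) normal: std = sqrt(2 / (fan_in + fan_out))
        std = math.sqrt(2.0 / (fan_in + fan_out))
    else:
        raise ValueError(f"no rule defined here for activation {activation!r}")
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
w_relu = init_weights(256, 128, "relu", rng)
w_tanh = init_weights(256, 128, "tanh", rng)
print(len(w_relu), "x", len(w_relu[0]))
```

The idea behind both rules is to keep the scale of the signal roughly stable from layer to layer, so that it neither vanishes nor blows up as the net gets deeper.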
Getting stuff done with Deep Learning
One of the most important moments for this field was the creation and open sourcing of TensorFlow.
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.
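The data flow graph idea can be sketched in a few lines of plain Python. This is not TensorFlow's API: the Node class and the constant, add and multiply helpers are hypothetical names invented for this example. It only shows the structure described above, where nodes hold operations and the edges between them carry arrays of numbers.

```python
class Node:
    """One operation in the graph. Its inputs are the incoming edges."""

    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = tuple(inputs)

    def evaluate(self):
        # Pull values along the incoming edges, then apply this node's operation.
        args = [node.evaluate() for node in self.inputs]
        return self.op(*args)

def constant(values):
    """A node with no inputs that simply emits a fixed array."""
    return Node(lambda: list(values))

def add(a, b):
    """Element-wise addition node."""
    return Node(lambda x, y: [p + q for p, q in zip(x, y)], (a, b))

def multiply(a, b):
    """Element-wise multiplication node."""
    return Node(lambda x, y: [p * q for p, q in zip(x, y)], (a, b))

# Build the graph first; nothing is computed yet.
a = constant([1, 2, 3])
b = constant([10, 20, 30])
c = constant([2, 2, 2])
result_node = multiply(add(a, b), c)

# Only now does data flow through the graph.
result = result_node.evaluate()
print(result)  # [22, 44, 66]
```

Separating "describe the computation" from "run it" is what lets a library analyze, optimize and distribute the graph before any numbers move.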
Tensors, defined mathematically, are simply arrays of numbers, or functions, that transform according to certain rules under a change of coordinates.
But in the scope of Machine Learning and Deep Learning a tensor is a generalization of vectors and matrices to potentially higher dimensions. Internally, TensorFlow represents tensors as n-dimensional arrays of base data types.
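Here is a minimal sketch of "a generalization of vectors and matrices to higher dimensions", using nested Python lists as a stand-in for n-dimensional arrays. The shape and rank helpers are hypothetical names written for this example, and they assume regular (non-ragged) nesting.

```python
def shape(tensor):
    """Return the size along each dimension, assuming regular nesting."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0] if tensor else None
    return tuple(dims)

def rank(tensor):
    """Number of dimensions: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    return len(shape(tensor))

scalar = 3.0
vector = [1.0, 2.0, 3.0]
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
# A rank-3 tensor, for example 2 images of 3 rows by 4 columns.
cube = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "rank", rank(t), "shape", shape(t))
```

A scalar, a vector and a matrix are just the rank 0, 1 and 2 cases of the same object.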
Thinking about the future of Deep Learning (for programming or building applications), I really think GUIs and AutoML are the near future of getting things done with Deep Learning. I love coding, but I think the amount of code we will be writing in the coming years will decrease.
We spend many hours worldwide programming the same stuff over and over again, so I think these two things, GUIs and AutoML, will help Data Scientists become more productive and solve more problems.