What is Deep Learning? What are its prospects, applications, and criticisms? Should you dive into it? What should you know about Deep Learning before you take it up?
This post focuses on answering all these questions.
What is Deep Learning?
Artificial Neural Networks (ANNs) are a class of Machine Learning models that are making giant leaps in the field of Artificial Intelligence. ANNs attempt to “learn” by loosely replicating the way neurons work in our brain: they use the connections among artificial neurons to “teach” themselves from past data. ANNs arrange these neurons in “hidden layers” that sit between the input and the output. The connection between any two neurons is strong or weak depending on the weight assigned to it, and a neuron fires only when its weighted input crosses a set threshold. The weights are learned from the large volume of data fed into the ANN at the input layer, which serves as the “training data”.
The ANN learns from this data and uses what it has learned to make predictions about future data. In traditional neural networks, more than two hidden layers are rarely used, because additional layers were thought to yield diminishing returns.
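To make the mechanics above concrete, here is a minimal sketch of a forward pass through a single hidden layer. The layer sizes, weights, and input values are invented for illustration; a real network would learn its weights from training data.

```python
import numpy as np

# Illustrative only: 3 inputs, one hidden layer of 4 neurons, 1 output.
# In a real ANN these weights would be learned from the training data.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # connection strengths: input -> hidden
W_output = rng.normal(size=(4, 1))   # connection strengths: hidden -> output

def sigmoid(z):
    # Squashes each weighted sum into (0, 1), acting as a soft threshold
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])            # one example with 3 features
hidden = sigmoid(x @ W_hidden)            # activations of the hidden layer
prediction = sigmoid(hidden @ W_output)   # network output
print(prediction)
```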
Deep Learning is a branch of Machine Learning built on neural networks with a large number of hidden layers. Broadly, the more hidden layers, the richer the representations the network can learn and, in practice, the more accurate its predictions. In Deep Learning, stacking many layers does not lead to diminishing returns, because the ‘higher levels of abstraction’ learned by the deeper layers more than compensate, enabling astoundingly accurate predictions.
Beyond this, not much is known about the inner workings of Artificial Neural Networks or Deep Learning, much like not much is known about the workings of the brain. What exactly the intermediate layers compute is unclear; these are the “hidden layers”. They act like a “black box”: the outcome can be seen, but exactly how it is produced is a mystery.
Why the Hype?
Application of Deep Learning has resulted in favourable outcomes for complex problems with large datasets.
“A very well-known Psychologist and Computer Scientist – Geoffrey Hinton – and his students managed to beat the status quo prediction systems on five well-known datasets: Reuters, TIMIT, MNIST, CIFAR and ImageNet. This covers speech, text and image classification – and these are quite mature datasets, so a win on any of these gets some attention. A win on all of them gets a lot of attention.” – Jack Rae, Quora Top Writer, 2014.
Google’s AlphaGo (an AI built on Deep Learning) beat Lee Sedol, the world champion at Go. Go is a game of profound complexity and of intuition – a concept that is obscure for computers. As per DeepMind’s website, Go is a googol (10¹⁰⁰) times more complex than Chess!
Many high-profile people and companies (such as Google and Facebook) have been working on it.
What are the Characteristics of Deep Learning?
1. Distributed Representations
The idea behind distributed representation is that observed information is a composition of many underlying factors at multiple levels. In Deep Learning, the neural network processes these factors at different hidden layers.
For example, if the information is ‘Black Toyota’, the factors can be the car’s colour, model name, shape, etc. At each hidden layer, the neural network processes some of these factors. But, in reality, no one knows exactly how the network decomposes information into factors.
Because the network learns these factored representations on its own, Deep Learning is sometimes an excellent choice for unsupervised learning.
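To give the ‘Black Toyota’ example a concrete shape, here is a hedged sketch of a distributed representation: each entity is one dense vector, and factors such as colour or model are spread across many dimensions rather than stored in any single one. All vectors and values below are invented for illustration.

```python
import numpy as np

# Hypothetical learned representations: each entity is a dense vector.
# No single dimension means "colour" or "model"; factors are distributed.
embeddings = {
    "black toyota": np.array([0.8, -0.3, 0.5, 0.1]),
    "white toyota": np.array([-0.7, -0.2, 0.6, 0.2]),
    "black honda":  np.array([0.9, -0.4, -0.5, 0.3]),
}

def similarity(a, b):
    # Cosine similarity: vectors sharing factors point in similar directions
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(similarity(embeddings["black toyota"], embeddings["white toyota"]))  # shared model
print(similarity(embeddings["black toyota"], embeddings["black honda"]))   # shared colour
```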
2. Automated Feature Engineering
A Feature is any piece of information that is potentially useful for prediction, e.g. voice clips for speech recognition, text for NLP, images for computer vision, etc.
Feature Engineering refers to the set of tasks that shape raw data into features in a format suitable for Machine Learning. The process requires one to create features, use them, check for possible improvements, and repeat the cycle. It also requires good knowledge of the domain being analysed.
Why is Deep Learning better than Feature Engineering?
Feature Engineering has worked brilliantly so far. But like many methods and technologies, it struggles when faced with Big Data: manually designing features for the huge volumes of data produced today does not scale. The bigger the data, the greater the cost of Feature Engineering, making it an expensive and time-consuming process. Hence, manual Feature Engineering may not be beneficial for Big Data.
Deep Learning, on the other hand, doesn’t require manual Feature Engineering; it performs automated feature engineering, translating the data into intermediate representations on its own and removing the need for hand-designed features. So, you don’t have to spend hours or days designing and converting features.
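As a minimal sketch of automated feature engineering in practice (assuming the Keras library, which this post doesn’t prescribe): raw pixels go in directly, and the convolutional layers learn their own intermediate features instead of being handed hand-designed ones.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Raw 28x28 grayscale images go in directly; no hand-crafted features.
# The convolutional layers learn edge- and shape-like detectors themselves.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, activation="relu"),  # learned low-level features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),  # learned higher-level features
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```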
In the present era of data explosion, manual Feature Engineering struggles to keep pace. Thus, the case for Deep Learning is stronger than ever.
3. Higher Levels of Abstraction
Deep Learning learns from abstractions at different levels. The concept of ‘abstraction’ has no precise definition here; one way to understand it is as the intuition or knowledge the network gains at each hidden layer.
The greater the number of hidden layers, the higher the levels of abstraction. This same property makes Deep Learning a rather opaque model.
There are no proper statistical estimates or validity checks for what each layer learns. The concept is loosely derived from the functioning of the brain, and for the same reason it produces varying outcomes, sometimes outstanding (such as Google’s speech recognition) and sometimes incomprehensible.
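One hedged way to ‘see’ these levels is to read out the activations of intermediate layers directly. The sketch below (again assuming Keras, with an untrained toy model) exposes two hidden levels for inspection; what the activations mean is exactly the part no one fully understands.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# A tiny untrained model, just to show the mechanics of probing a layer.
inputs = keras.Input(shape=(28, 28, 1))
h1 = layers.Conv2D(8, 3, activation="relu", name="level_1")(inputs)
h2 = layers.Conv2D(16, 3, activation="relu", name="level_2")(
    layers.MaxPooling2D()(h1))
outputs = layers.Dense(10, activation="softmax")(layers.Flatten()(h2))
model = keras.Model(inputs, outputs)

# A second model that exposes the two hidden levels of abstraction.
probe = keras.Model(inputs, [h1, h2])
level_1, level_2 = probe(np.random.rand(1, 28, 28, 1).astype("float32"))
print(level_1.shape, level_2.shape)  # feature maps at two abstraction levels
```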
4. Big Data
Deep Learning requires huge amounts of data to feed its huge network; otherwise, it suffers from over-fitting. It is also well suited to unsupervised learning. In the present age, there is an explosion of unlabelled data, and Deep Learning is skilled at working with it: unlike many other tools, it needs only a small subset of labelled data to learn from. Thus, Deep Learning is an excellent choice for unsupervised learning on Big Data.
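A hedged sketch of the ‘small labelled subset’ idea (assuming Keras; all shapes, sizes, and the placeholder data are invented): pretrain an autoencoder on plentiful unlabelled data, then reuse its encoder for a supervised task.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Step 1: learn a representation from unlabelled data with an autoencoder.
inputs = keras.Input(shape=(784,))
code = layers.Dense(32, activation="relu")(inputs)        # learned representation
recon = layers.Dense(784, activation="sigmoid")(code)
autoencoder = keras.Model(inputs, recon)
autoencoder.compile(optimizer="adam", loss="mse")

unlabelled = np.random.rand(1000, 784).astype("float32")  # placeholder data
autoencoder.fit(unlabelled, unlabelled, epochs=1, verbose=0)

# Step 2: reuse the pretrained encoder; fine-tune on a small labelled subset.
encoder = keras.Model(inputs, code)
classifier = keras.Sequential([encoder, layers.Dense(10, activation="softmax")])
classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

labelled_x = unlabelled[:100]                  # only 100 labelled examples
labelled_y = np.random.randint(0, 10, 100)     # placeholder labels
classifier.fit(labelled_x, labelled_y, epochs=1, verbose=0)
```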
What are the Applications of Deep Learning?
Deep Learning emulates the human brain’s perception: it processes images, speech, and text in a fashion similar to the brain. Thus, it is a good choice for datasets of images, sounds, and text. Its application has led to outstanding results in the fields of Computer Vision, Automated Speech Recognition, Natural Language Processing, Drug Discovery, Image Classification and Toxicology. It also powers state-of-the-art speech recognition systems like Microsoft’s Cortana, Google Brain, and Apple’s Siri.
Google’s DeepMind beat the world champion at Go through the application of Deep Learning in its program AlphaGo. DeepMind also built a neural network that learns to play video games in a fashion similar to humans.
Its application in drug discovery is still in its early stages but is very promising.
What are the problems with Deep Learning?
• Over-fitting
When trained on small datasets, Deep Learning suffers from over-fitting. Despite automated feature engineering, over-fitting is a troubling phenomenon. Regularisation (L1, L2, and dropout) helps generalise the model and thus reduces over-fitting.
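To make those remedies concrete, here is a minimal Keras sketch (the rates, sizes, and penalty strengths are illustrative) that combines L1/L2 weight penalties with dropout:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# L1/L2 penalties shrink weights; dropout randomly silences units during
# training. Both discourage the network from memorising a small dataset.
model = keras.Sequential([
    keras.Input(shape=(100,)),
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),  # drop half the units at each training step
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l1(1e-5)),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```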
• Computation Time
As there are many hyper-parameters to tune (number of hidden layers, number of hidden units, etc.), Deep Learning takes a long time to train.
Deep Neural Networks are uniformly structured: at each layer of the network, the artificial neurons perform pretty much the same computations, which makes the workload massively parallel. This need for parallel computation is met by GPUs (Graphics Processing Units). The traditional approach of running everything on CPUs, with their limited parallelism, is economically infeasible. For a sense of the gap, the best CPUs offer about 50 GB/s of memory bandwidth, while the best GPUs offer about 750 GB/s.
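As a hedged sketch of how that parallelism is used in practice (assuming PyTorch, which this post doesn’t prescribe): the same matrix arithmetic that dominates a layer’s forward pass is simply moved to the GPU when one is available.

```python
import torch

# Use a GPU if one is present; fall back to the CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A layer's forward pass is dominated by large matrix multiplications,
# exactly the uniform, massively parallel workload GPUs are built for.
x = torch.randn(4096, 4096, device=device)
w = torch.randn(4096, 4096, device=device)
y = x @ w
print(y.device)
```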
• No strong theoretical base
Neural networks have no strong theoretical base. They are loosely modelled on the neurons of the human brain, and the replication is far from exact. In fact, no one knows precisely how knowledge is organised within a neural network, and the concept of abstraction remains obscure.
There are no proper statistical estimates for this type of learning: it is difficult to estimate the standard error of a prediction or the goodness-of-fit of the model.
Therefore, even though the results are very promising, it is something of a black box, and a lot of work is yet to be done in the field to truly understand it.
• Unsupervised learning
Even though Deep Learning can produce outstanding results, they can sometimes be arbitrary. There are no standard tests to check the validity of this kind of unsupervised learning. Therefore, Deep Learning is not always a hundred percent reliable and accurate.
So, should you consider diving into Deep Learning?
Deep Learning is still at an early stage. Despite being a working model that is widely employed, how it actually works is still not fully understood. It is also a time-consuming model with a large number of layers and parameters, so for large amounts of data it consumes considerable resources. However, Big Data has a positive impact on Deep Learning, reducing over-fitting.
With hardware performance improving every day, applying Deep Learning has become more and more feasible. With large players (such as Google and Microsoft) showing increasing interest in the field, and cutting-edge technologies (such as Quantum Computing) around the corner, Deep Learning has a very promising future. It has already caught the world’s attention with excellent results on some of the world’s toughest problems. Its popularity goes only uphill from here.
[Figure: Google Trends for the term ‘Deep Learning’]
Best of all, Deep Learning doesn’t require great programming expertise: excellent libraries for Deep Learning are available in both Python and R.
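As a taste of how low the barrier is, here is a complete (if toy) Keras example in Python that trains a small network end to end on placeholder data; comparable libraries exist for R.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder data: 500 examples, 20 features, binary labels.
x = np.random.rand(500, 20).astype("float32")
y = np.random.randint(0, 2, 500)

# A complete, minimal training run in a handful of lines.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=3, verbose=0)
print(model.evaluate(x, y, verbose=0))  # [loss, accuracy] on the toy data
```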