Deep learning is a family of machine-learning techniques based on artificial neural networks, AI systems that loosely model the way neurons interact in the brain. Neural networks have many (“deep”) layers of simulated interconnected neurons, hence the term “deep learning.” Whereas earlier neural networks had only three to five layers and dozens of neurons, deep learning networks can have ten or more layers, with simulated neurons numbering in the millions.
There are several types of machine learning: supervised learning, unsupervised learning, and reinforcement learning, with each best suited to certain use cases. Most current practical examples of AI are applications of supervised learning. In supervised learning, often used when labeled data are available and the preferred output variables are known, training data are used to help a system learn the relationship of given inputs to a given output—for example, to recognize objects in an image or to transcribe human speech.
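As a minimal sketch of supervised learning (the data, names, and task here are illustrative, not from the source): a single-neuron perceptron is given labeled 2-D points and learns the relationship between inputs and the known output labels by correcting itself on each mistake.

```python
# Supervised learning sketch: a perceptron learns to separate labeled points.
# All data and names are illustrative.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Learn weights w and bias b so that sign(w.x + b) matches the labels."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1
            if pred != y:  # update weights only on mistakes
                w[0] += lr * y * x[0]
                w[1] += lr * y * x[1]
                b += lr * y
    return w, b

# Labeled training data: points above the line y = x are +1, below are -1.
samples = [(0.0, 1.0), (1.0, 2.0), (1.0, 0.0), (2.0, 1.0)]
labels = [1, 1, -1, -1]
w, b = train_perceptron(samples, labels)

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1
```

The labels are what make this "supervised": the training loop compares each prediction against a known correct answer and adjusts accordingly.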
Unsupervised learning is a set of techniques used without labeled training data—for example, to detect clusters or patterns, such as images of buildings that have similar architectural styles, in a set of existing data.
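By contrast, a minimal sketch of unsupervised learning (again with illustrative data, not from the source): k-means clustering receives unlabeled points and finds the groups on its own, by alternating between assigning points to the nearest cluster center and moving each center to the mean of its points.

```python
# Unsupervised learning sketch: k-means clustering of unlabeled 1-D points.
# No labels are given; the grouping emerges from the data alone.

def kmeans_1d(points, centers, iters=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    for _ in range(iters):
        clusters = {c: [] for c in range(len(centers))}
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: abs(p - centers[c]))
            clusters[nearest].append(p)
        # Recompute each center; keep the old one if its cluster is empty.
        centers = [sum(m) / len(m) if m else centers[c]
                   for c, m in clusters.items()]
    return sorted(centers)

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]   # two obvious groups
centers = kmeans_1d(points, centers=[0.0, 5.0])
```

The algorithm is never told which point belongs to which group; the two cluster centers settle near the natural concentrations in the data.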
In reinforcement learning, systems are trained by receiving virtual “rewards” or “punishments,” often through a scoring system, essentially learning by trial and error. Through ongoing work, these techniques are evolving.
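The trial-and-error idea can be sketched with tabular Q-learning on a toy environment (the corridor, rewards, and parameters below are illustrative assumptions, not from the source): the agent wanders randomly, receives a virtual reward only on reaching the goal, and gradually scores each state-action pair.

```python
# Reinforcement learning sketch: Q-learning on a 4-state corridor.
# The agent earns +1 only on reaching the rightmost state and learns
# by trial and error which action each state favors.
import random

N_STATES, GOAL = 4, 3
ACTIONS = [-1, +1]                      # move left, move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
alpha, gamma = 0.5, 0.9                 # learning rate, discount factor
for _ in range(200):                    # episodes of pure trial and error
    s = 0
    while s != GOAL:
        a = random.choice(ACTIONS)      # explore randomly
        s2 = min(max(s + a, 0), GOAL)
        reward = 1.0 if s2 == GOAL else 0.0   # the virtual "reward"
        # Move Q(s, a) toward reward + discounted best future value.
        best_next = max(Q[(s2, b)] for b in ACTIONS) if s2 != GOAL else 0.0
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s2

# The greedy policy read off the learned scores points right in every state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

No one tells the agent that "right" is correct; the scoring system alone, propagated backward from the reward, produces that policy.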
Much of deep learning’s success comes from how little manual design goes into a neural network. The coder does not tell it which high-level features to look for; the network discovers the most useful patterns in the raw data itself. We let it search through all of its training data and find those patterns that lead to the highest accuracy in solving the problem.
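This point can be illustrated with a single trainable neuron (a deliberately simplified stand-in for a deep network; the data and names are made up for illustration): the same generic gradient-descent loop, given raw inputs and no hand-designed features, discovers on its own which input actually predicts the label.

```python
# Feature discovery sketch: a logistic neuron trained by gradient descent.
# Only the second input carries signal; the first is uninformative.
# The training loop is generic; nothing tells it which input matters.
import math

def train_neuron(samples, labels, epochs=500, lr=0.5):
    """Single logistic neuron trained to reduce prediction error."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1])))
            err = p - y                   # gradient of the log loss
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
    return w

# The label simply copies the second input; the first input is noise.
samples = [(1, 0), (1, 1), (0, 0), (0, 1)]
labels = [0, 1, 0, 1]
w = train_neuron(samples, labels)
# After training, the weight on the informative input dominates.
```

The learned weights encode what was discovered from data: a large weight on the predictive input, a comparatively small one on the irrelevant input.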
What you gain is accuracy. Deep learning is especially good at classifying raw, unstructured data such as photos, videos, and audio files. It has also made great strides in processing and generating written text, performing machine translation, and learning to play games at a professional level. Anyone with programming experience knows how hard it is to perform any of those tasks with classical, rule-based approaches.
The concept of deep learning has been around since the 1950s. Take a brief look at how it evolved from concept to reality and the key people who made it happen.
It is too early to write a full history of deep learning, and some of the details are contested, but we can already trace an admittedly incomplete outline of its origins and identify some of the pioneers. They include Warren McCulloch and Walter Pitts, who as early as 1943 proposed an artificial neuron, a computational model of the “nerve net” in the brain. In the late 1950s, Bernard Widrow and Ted Hoff at Stanford University developed a neural-network application that reduced noise in phone lines.
Around the same time, Frank Rosenblatt, an American psychologist, introduced the idea of a device called the Perceptron, which mimicked the neural structure of the brain and showed an ability to learn. MIT’s Marvin Minsky and Seymour Papert then put a damper on this research in their 1969 book Perceptrons by showing mathematically that the Perceptron could perform only very basic tasks. Their book also discussed the difficulty of training multilayer neural networks.
In 1986, Geoffrey Hinton at the University of Toronto, along with colleagues David Rumelhart and Ronald Williams, solved this training problem with the publication of a now-famous back-propagation training algorithm, although some practitioners point to a Finnish mathematician, Seppo Linnainmaa, as having invented back-propagation as early as the 1960s. Yann LeCun at New York University pioneered the use of neural networks on image-recognition tasks, and his 1998 paper defined the concept of convolutional neural networks, which mimic the human visual cortex.

In parallel, John Hopfield popularized the “Hopfield” network, the first recurrent neural network. This was subsequently expanded upon by Sepp Hochreiter and Jürgen Schmidhuber in 1997 with the introduction of long short-term memory (LSTM), which greatly improved the efficiency and practicality of recurrent neural networks. In 2012, Hinton and two of his students highlighted the power of deep learning when they obtained significant results in the well-known ImageNet competition, based on a dataset collated by Fei-Fei Li and others. At the same time, Jeffrey Dean and Andrew Ng were doing breakthrough work on large-scale image recognition at Google Brain.
Neural networks, the software structure that underlies deep learning, proved to be very good at generating human-like descriptions of digital imagery. However, they weren’t always correct: they would occasionally fail, perhaps mistaking a man in an image for a woman.