
A brief history of ANNs
Inspired by the working principles of biological neurons, Warren McCulloch and Walter Pitts proposed the first artificial neuron in 1943 as a computational model of nervous activity. This simplified model of a biological neuron, known as an artificial neuron (AN), has one or more binary (on/off) inputs and a single binary output.
An AN activates its output when more than a certain number of its inputs are active. The following figure shows a few small networks of ANs that perform various logical operations; in these examples, we assume that a neuron is activated only when at least two of its inputs are active:

ANNs performing simple logical computations
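To make the figure concrete, here is a minimal Python sketch of such a threshold neuron; the function name mcp_neuron and the wiring shown are illustrative assumptions, using the threshold of 2 stated above:

```python
# A McCulloch-Pitts style neuron: binary inputs, fires only when at
# least `threshold` of them are active (threshold 2, as assumed above).
def mcp_neuron(inputs, threshold=2):
    return int(sum(inputs) >= threshold)

# With two inputs and a threshold of 2, the neuron computes logical AND:
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND ->", mcp_neuron([a, b]))

# Wiring each input to the neuron twice turns the same unit into logical OR:
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "OR  ->", mcp_neuron([a, a, b, b]))
```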
The example may look trivial, but even with such a simplified model it is possible to build networks of ANs, and these networks can be combined to compute arbitrarily complex logical expressions. This work inspired researchers such as John von Neumann and Marvin Minsky, and in 1957 Frank Rosenblatt built on it to propose another model, the perceptron.
The perceptron is one of the simplest ANN architectures devised in the last 60 years. It is based on a slightly different AN called a Linear Threshold Unit (LTU): the inputs and the output are now numbers instead of binary on/off values, and each input connection is associated with a weight. The LTU computes a weighted sum of its inputs, applies a step function (an early form of activation function) to that sum, and outputs the result:

The left-side figure represents an LTU and the right-side figure shows a perceptron
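As a rough sketch of the LTU computation just described; the weights and inputs here are hand-picked for illustration, not taken from the figure:

```python
import numpy as np

def step(z):
    # Heaviside step function: 1 if z >= 0, else 0
    return 1 if z >= 0 else 0

def ltu(x, w, b=0.0):
    # Weighted sum of numeric inputs, followed by the step function
    return step(np.dot(w, x) + b)

# Illustrative numeric inputs and hand-picked (not learned) weights:
x = np.array([0.5, -1.2, 2.0])
w = np.array([0.7, 0.3, -0.4])
print(ltu(x, w, b=0.1))  # 0, because the weighted sum plus bias is negative
```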
One of the downsides of the perceptron is that its decision boundary is linear, so it is incapable of learning complex patterns and fails even on some seemingly simple problems such as Exclusive OR (XOR). Later on, many of these limitations were overcome by stacking multiple perceptrons into a multilayer perceptron (MLP), as illustrated in the sketch below.
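To see why stacking helps, here is a hedged sketch in which two layers of LTUs with hand-crafted (not learned) weights compute XOR, something no single LTU can represent:

```python
import numpy as np

def step(z):
    # Element-wise Heaviside step function
    return (z >= 0).astype(int)

def ltu_layer(X, w, b):
    # Apply one LTU (weights w, bias b) to every row of X
    return step(X @ w + b)

# All four binary input combinations
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked weights: the hidden layer computes OR and NAND,
# and the output LTU ANDs them together, which yields XOR.
h_or   = ltu_layer(X, np.array([1.0, 1.0]), -0.5)
h_nand = ltu_layer(X, np.array([-1.0, -1.0]), 1.5)
hidden = np.stack([h_or, h_nand], axis=1)
xor    = ltu_layer(hidden, np.array([1.0, 1.0]), -1.5)
print(xor)  # [0 1 1 0]
```

A trained MLP would arrive at a comparable arrangement by learning the weights rather than having them chosen by hand.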