Go Machine Learning Projects
上QQ阅读APP看书,第一时间看更新

What is a model?

All models are wrong; but some are useful.

Now it would be very remarkable if any system existing in the real world could be  exactly  represented by any simple model. However, cunningly chosen parsimonious models often do provide remarkably useful approximations. For example, the law  relating pressure , volume  and temperature  of an "ideal" gas via a constant R is not exactly true for any real gas, but it frequently provides a useful approximation and furthermore its structure is informative since it springs from a physical view of the behavior of gas molecules.

For such a model there is no need to ask the question "Is the model true?". If "truth" is to be the "whole truth" the answer must be "No". The only question of interest is "Is the model illuminating and useful?".
- George Box (1978)

Model train are a fairly common hobby, despite being lampooned by the likes of The Big Bang Theory. A model train is not a real train. For one, the sizes are different. Model trains do not work exactly the same way a real train does. There are gradations of model trains, each being more similar to actual trains than the previous. 

A model is in that sense a representation of reality. What do we represent it with? By and large, numbers. A model is a bunch of numbers that describes reality, and a bit more.

Every time I try to explain what a model is I inevitably get responses along the lines of "You can't just reduce us to a bunch of numbers!". So what do I mean "numbers"?

Consider the following right angle triangle:

How do we describe all right angle triangles? We might say something like this:

This says that the sum of all angles in a right angle adds up to 180 degrees, and there exists an angle that is 90 degrees. This is sufficient to describe all right angle triangles in Cartesian space.

But is the description the triangle itself? It is not. This issue has plagued philosophers ever since the days of Aristotle.  We shall not enter into a philosophical discussion for such a discussion will only serve to prolong the agony of this chapter. 

So for our purposes in this chapter, we'll say that a model is the values that describe reality, and the algorithm that produces those values. These values are typically numbers, though they may be of other types as well.