
Bayesian networks
A Bayesian network is a probabilistic model represented by a directed acyclic graph G = {V, E}, where the vertices are random variables Xi and the edges encode conditional dependencies among them. The following diagram shows an example of a simple Bayesian network with four variables:

The variable x4 depends on x3, which in turn depends on x1 and x2. To describe the network, we need the marginal probabilities P(x1) and P(x2) and the conditional probabilities P(x3|x1,x2) and P(x4|x3). In fact, using the chain rule, we can derive the full joint probability as:

P(x1, x2, x3, x4) = P(x4|x3) P(x3|x1, x2) P(x1) P(x2)

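This factorization can be checked with a small sketch. The structure matches the four-variable example above, while every numeric probability value is an illustrative assumption, not taken from the text:

```python
# Joint probability of the example network via the chain-rule factorization
# P(x1, x2, x3, x4) = P(x4|x3) P(x3|x1, x2) P(x1) P(x2).
# All variables are binary; every probability value below is an
# illustrative assumption.

p_x1 = {0: 0.7, 1: 0.3}                      # marginal P(x1)
p_x2 = {0: 0.4, 1: 0.6}                      # marginal P(x2)
p_x3 = {                                     # conditional P(x3 | x1, x2)
    (0, 0): {0: 0.9, 1: 0.1},
    (0, 1): {0: 0.6, 1: 0.4},
    (1, 0): {0: 0.3, 1: 0.7},
    (1, 1): {0: 0.1, 1: 0.9},
}
p_x4 = {0: {0: 0.8, 1: 0.2},                 # conditional P(x4 | x3)
        1: {0: 0.25, 1: 0.75}}

def joint(x1, x2, x3, x4):
    """Chain-rule factorization of the full joint probability."""
    return p_x1[x1] * p_x2[x2] * p_x3[(x1, x2)][x3] * p_x4[x3][x4]

# A valid joint distribution must sum to 1 over all 16 configurations
total = sum(joint(a, b, c, d)
            for a in (0, 1) for b in (0, 1)
            for c in (0, 1) for d in (0, 1))
print(total)
```

Because each table is a proper (conditional) distribution, the sixteen products sum to 1, confirming that the four local distributions fully specify the joint.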
The previous expression shows an important concept: as the graph is directed and acyclic, each variable is conditionally independent of its non-successors given its predecessors. To formalize this concept, we can define the function Predecessors(xi), which returns the set of nodes that directly influence xi; for example, Predecessors(x3) = {x1, x2} (we are using lowercase letters, but we are referring to the random variables, not samples). Using this function, it's possible to write a general expression for the full joint probability of a Bayesian network with N nodes:

P(x1, x2, ..., xN) = ∏(i=1 to N) P(xi|Predecessors(xi))

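The general product can be sketched in a few lines of Python. The graph structure is the same four-node example, the CPT values are purely illustrative, and each node's probability is looked up conditioned on the values of its predecessors:

```python
# General chain-rule factorization for a Bayesian network:
# P(x1, ..., xN) = prod over i of P(xi | Predecessors(xi)).
# The structure mirrors the four-node example; all CPT values are
# illustrative assumptions. Each CPT maps a tuple of predecessor
# values to a {value: probability} table.

predecessors = {            # x1, x2 -> x3 -> x4
    'x1': (), 'x2': (),
    'x3': ('x1', 'x2'),
    'x4': ('x3',),
}
cpts = {
    'x1': {(): {0: 0.7, 1: 0.3}},
    'x2': {(): {0: 0.4, 1: 0.6}},
    'x3': {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.6, 1: 0.4},
           (1, 0): {0: 0.3, 1: 0.7}, (1, 1): {0: 0.1, 1: 0.9}},
    'x4': {(0,): {0: 0.8, 1: 0.2}, (1,): {0: 0.25, 1: 0.75}},
}

def joint_probability(assignment):
    """P(assignment) as the product of each node's CPT entry,
    conditioned on the values of its predecessors."""
    p = 1.0
    for node, parents in predecessors.items():
        parent_values = tuple(assignment[q] for q in parents)
        p *= cpts[node][parent_values][assignment[node]]
    return p

print(joint_probability({'x1': 1, 'x2': 0, 'x3': 1, 'x4': 0}))
```

The same function works for any directed acyclic structure: only the `predecessors` dictionary and the corresponding CPTs change, which is exactly what the general expression above states.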
The general procedure for building a Bayesian network should always start from the first causes, adding their effects one by one until the last nodes are inserted into the graph. If this rule is not respected, the resulting graph can contain useless relations that increase the complexity of the model. For example, if x4 is caused indirectly by both x1 and x2, adding the edges x1 → x4 and x2 → x4 might seem a good modeling choice; however, we know that the final influence on x4 is determined only by the value of x3, whose probability is conditioned on x1 and x2, hence these spurious edges can be removed. I suggest reading Introduction to Statistical Decision Theory, Pratt J., Raiffa H., Schlaifer R., The MIT Press, to learn many best practices that should be employed in this procedure.
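The claim that the edges x1 → x4 and x2 → x4 would be spurious can also be verified numerically: with illustrative CPT values (assumed, not from the text), P(x4 | x3, x1, x2) computed from the full joint always collapses to P(x4 | x3):

```python
from itertools import product

# Sketch (illustrative CPT values): once P(x3|x1,x2) and P(x4|x3) are
# specified, the conditional P(x4 | x3, x1, x2) derived from the full
# joint equals P(x4 | x3), so the edges x1 -> x4 and x2 -> x4 would
# add nothing to the model.

p_x1 = {0: 0.7, 1: 0.3}
p_x2 = {0: 0.4, 1: 0.6}
p_x3 = {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.6, 1: 0.4},
        (1, 0): {0: 0.3, 1: 0.7}, (1, 1): {0: 0.1, 1: 0.9}}
p_x4 = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}

def joint(x1, x2, x3, x4):
    return p_x1[x1] * p_x2[x2] * p_x3[(x1, x2)][x3] * p_x4[x3][x4]

for x1, x2, x3 in product((0, 1), repeat=3):
    # P(x4=1 | x3, x1, x2) from the joint, by normalization
    num = joint(x1, x2, x3, 1)
    den = joint(x1, x2, x3, 0) + joint(x1, x2, x3, 1)
    assert abs(num / den - p_x4[x3][1]) < 1e-12

print("P(x4|x3,x1,x2) == P(x4|x3) for every configuration")
```

Once x3 is known, x1 and x2 carry no further information about x4, which is precisely why the direct edges would be redundant.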