Implementing feedforward networks with TensorFlow
Feedforward networks can be implemented easily in TensorFlow by defining placeholders for the inputs and targets, variables for the layer weights, computing the hidden-layer activations, and using them to calculate the predictions. Let's take an example of classification with a feedforward network:
X = tf.placeholder("float", shape=[None, x_size])              # input features
y = tf.placeholder("float", shape=[None, y_size])              # target labels
weights_1 = initialize_weights((x_size, hidden_size), stddev)  # input-to-hidden weights
weights_2 = initialize_weights((hidden_size, y_size), stddev)  # hidden-to-output weights
hidden = tf.nn.sigmoid(tf.matmul(X, weights_1))                # hidden-layer activations
yhat = tf.matmul(hidden, weights_2)                            # predicted logits
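The initialize_weights helper is not defined in the preceding snippet; a minimal sketch, assuming it simply wraps a normally distributed trainable variable, could look like this:
def initialize_weights(shape, stddev):
    # Assumed helper: a trainable weight matrix drawn from a normal distribution
    return tf.Variable(tf.random_normal(shape, stddev=stddev))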
Once the predicted value tensor yhat has been defined, we calculate the cost function:
cost = tf.reduce_mean(tf.nn.OPERATION_NAME(labels=<actual value>, logits=<predicted value>))
updates_sgd = tf.train.GradientDescentOptimizer(sgd_step).minimize(cost)
Here, OPERATION_NAME could be one of the following (short sketches illustrating these losses follow the list):
- tf.nn.sigmoid_cross_entropy_with_logits: Calculates sigmoid cross entropy on the incoming logits and labels:
sigmoid_cross_entropy_with_logits(
_sentinel=None,
labels=None,
logits=None,
name=None
)
_sentinel: Used to prevent positional parameters. Internal, do not use.
labels: A tensor of the same type and shape as logits.
logits: A tensor of type float32 or float64.
The formula implemented is (with x = logits and z = labels): max(x, 0) - x * z + log(1 + exp(-abs(x))).
- tf.nn.softmax: Performs the softmax activation on the incoming tensor. This only normalizes the tensor so that the probabilities in each row add up to one; it is not a loss function and cannot be used directly as the cost in a classification.
softmax = exp(logits) / reduce_sum(exp(logits), dim)
logits: A non-empty tensor. Must be one of the following types: half, float32, or float64.
dim: The dimension softmax will be performed on. The default is -1, which indicates the last dimension.
name: A name for the operation (optional).
- tf.nn.log_softmax: Calculates the log of the softmax function, which helps avoid numerical underflow during normalization. Like softmax, this is only a normalization function, not a loss:
log_softmax(
logits,
dim=-1,
name=None
)
logits: A non-empty tensor. Must be one of the following types: half, float32, or float64.
dim: The dimension softmax will be performed on. The default is -1, which indicates the last dimension.
name: A name for the operation (optional).
- tf.nn.softmax_cross_entropy_with_logits
softmax_cross_entropy_with_logits(
_sentinel=None,
labels=None,
logits=None,
dim=-1,
name=None
)
_sentinel: Used to prevent positional parameters. For internal use only.
labels: Each row labels[i] must be a valid probability distribution.
logits: Unscaled log probabilities.
dim: The class dimension. Defaults to -1, which is the last dimension.
name: A name for the operation (optional).
The preceding operation computes softmax cross entropy between logits and labels. While the classes are mutually exclusive, their probabilities need not be; all that is required is that each row of labels is a valid probability distribution. For exclusive labels (where one and only one class is true at a time), use sparse_softmax_cross_entropy_with_logits.
- tf.nn.sparse_softmax_cross_entropy_with_logits
sparse_softmax_cross_entropy_with_logits(
_sentinel=None,
labels=None,
logits=None,
name=None
)
labels: A tensor of shape [d_0, d_1, ..., d_(r-1)] (where r is the rank of labels and result) and dtype int32 or int64. Each entry in labels must be an index in [0, num_classes). Other values will raise an exception when this operation is run on the CPU, and will return NaN for the corresponding loss and gradient rows on the GPU.
logits: Unscaled log probabilities of shape [d_0, d_1, ..., d_(r-1), num_classes] and dtype float32 or float64.
The preceding operation computes sparse softmax cross entropy between logits and labels; the classes are considered mutually exclusive. Soft classes are not allowed, and the labels vector must provide a single specific index for the true class for each row of logits.
- tf.nn.weighted_cross_entropy_with_logits
weighted_cross_entropy_with_logits(
targets,
logits,
pos_weight,
name=None
)
targets: A tensor of the same type and shape as logits.
logits: A tensor of type float32 or float64.
pos_weight: A coefficient to use on the positive examples.
This is similar to sigmoid_cross_entropy_with_logits(), except that pos_weight allows a trade-off between recall and precision by up- or down-weighting the cost of a positive error relative to a negative error.
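To make the differences between these operations concrete, the following is a small illustrative sketch using the TensorFlow 1.x API shown above; the tensors, values, and variable names are assumptions chosen only for demonstration:
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.3]])   # unscaled scores, shape [2, 3]
onehot = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])   # one-hot labels: valid probability distributions
sparse = tf.constant([0, 1])              # the same labels as class indices

# Softmax cross entropy expects one probability distribution per row of labels
softmax_ce = tf.nn.softmax_cross_entropy_with_logits(labels=onehot, logits=logits)
# The sparse variant expects integer class indices instead of one-hot rows
sparse_ce = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=sparse, logits=logits)
# Sigmoid cross entropy scores every logit independently (multi-label setting)
sigmoid_ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=onehot, logits=logits)
# The weighted variant up-weights the positive term by pos_weight
weighted_ce = tf.nn.weighted_cross_entropy_with_logits(targets=onehot, logits=logits, pos_weight=2.0)

with tf.Session() as sess:
    print(sess.run(softmax_ce))   # shape [2]: one loss per row; matches sparse_ce here
    print(sess.run(sparse_ce))    # shape [2]
    print(sess.run(sigmoid_ce))   # shape [2, 3]: one loss per logit
    print(sess.run(weighted_ce))  # shape [2, 3]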
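Putting the pieces together, here is a minimal end-to-end sketch of the classification example, assuming softmax_cross_entropy_with_logits as the cost; the sizes, hyperparameters, and randomly generated toy data are assumptions for illustration only:
import numpy as np
import tensorflow as tf

# Illustrative sizes and hyperparameters (assumptions, not from the text)
x_size, hidden_size, y_size = 4, 16, 3
stddev, sgd_step, epochs = 0.1, 0.01, 100

def initialize_weights(shape, stddev):
    # Assumed helper: trainable weights drawn from a normal distribution
    return tf.Variable(tf.random_normal(shape, stddev=stddev))

X = tf.placeholder("float", shape=[None, x_size])
y = tf.placeholder("float", shape=[None, y_size])
weights_1 = initialize_weights((x_size, hidden_size), stddev)
weights_2 = initialize_weights((hidden_size, y_size), stddev)
hidden = tf.nn.sigmoid(tf.matmul(X, weights_1))
yhat = tf.matmul(hidden, weights_2)          # predicted logits

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=yhat))
updates_sgd = tf.train.GradientDescentOptimizer(sgd_step).minimize(cost)
predict = tf.argmax(yhat, axis=1)            # predicted class index per example

# Hypothetical toy data: 100 random inputs with random one-hot targets
train_X = np.random.rand(100, x_size).astype(np.float32)
train_y = np.eye(y_size)[np.random.randint(0, y_size, 100)].astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(epochs):
        _, c = sess.run([updates_sgd, cost], feed_dict={X: train_X, y: train_y})
    print("final cost:", c)
    print("sample predictions:", sess.run(predict, feed_dict={X: train_X[:5]}))
In practice, the feed_dict would be driven by real mini-batches of training data rather than the synthetic arrays above.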