# CRF

The CRF module implements a linear-chain CRF layer for learning to predict tag sequences. This variant of the CRF is factored into unary potentials for every element in the sequence and binary potentials for every transition between output tags.
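Concretely, this factorization can be sketched in standard linear-chain CRF notation (the symbols U, T, and n below are notation for this sketch, not names used in the module): U(x_t, y_t) is the unary potential of tag y_t at step t of an n-step sequence, and T(y_{t-1}, y_t) is the binary potential of the transition between consecutive tags.

```latex
% Linear-chain CRF: per-step unary potentials plus per-transition binary
% potentials, normalized over all possible tag sequences y'.
\mathrm{score}(x, y) = \sum_{t=1}^{n} U(x_t, y_t) + \sum_{t=2}^{n} T(y_{t-1}, y_t),
\qquad
p(y \mid x) = \frac{\exp\bigl(\mathrm{score}(x, y)\bigr)}{\sum_{y'} \exp\bigl(\mathrm{score}(x, y')\bigr)}
```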
### Usage

Below is an example of the API, which learns a CRF for some random data. The linear layer in the example can be replaced by any neural network.


```python
import numpy as np
import tensorflow as tf

# Data settings.
num_examples = 10
num_words = 20
num_features = 100
num_tags = 5

# Random features.
x = np.random.rand(num_examples, num_words, num_features).astype(np.float32)

# Random tag indices representing the gold sequence.
y = np.random.randint(num_tags, size=[num_examples, num_words]).astype(np.int32)

# All sequences in this example have the same length, but they can be variable in a real model.
sequence_lengths = np.full(num_examples, num_words - 1, dtype=np.int32)

# Train and evaluate the model.
with tf.Graph().as_default():
  with tf.Session() as session:
    # Add the data to the TensorFlow graph.
    x_t = tf.constant(x)
    y_t = tf.constant(y)
    sequence_lengths_t = tf.constant(sequence_lengths)

    # Compute unary scores from a linear layer.
    weights = tf.get_variable("weights", [num_features, num_tags])
    matricized_x_t = tf.reshape(x_t, [-1, num_features])
    matricized_unary_scores = tf.matmul(matricized_x_t, weights)
    unary_scores = tf.reshape(matricized_unary_scores,
                              [num_examples, num_words, num_tags])

    # Compute the log-likelihood of the gold sequences and keep the transition
    # params for inference at test time.
    log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
        unary_scores, y_t, sequence_lengths_t)

    # Compute the viterbi sequence and score.
    viterbi_sequence, viterbi_score = tf.contrib.crf.crf_decode(
        unary_scores, transition_params, sequence_lengths_t)

    # Add a training op to tune the parameters.
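    # The loss is the mean negative log-likelihood over the batch; minimizing it
    # maximizes the probability of the gold tag sequences under the model.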
    loss = tf.reduce_mean(-log_likelihood)
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    session.run(tf.global_variables_initializer())

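    # Build a boolean mask of shape [num_examples, num_words] that is True at
    # positions before each example's sequence length, so padded steps are
    # ignored when computing accuracy below.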
    mask = (np.expand_dims(np.arange(num_words), axis=0) <
            np.expand_dims(sequence_lengths, axis=1))
    total_labels = np.sum(sequence_lengths)

    # Train for a fixed number of iterations.
    for i in range(1000):
      tf_viterbi_sequence, _ = session.run([viterbi_sequence, train_op])
      if i % 100 == 0:
        correct_labels = np.sum((y == tf_viterbi_sequence) * mask)
        accuracy = 100.0 * correct_labels / float(total_labels)
        print("Accuracy: %.2f%%" % accuracy)
```
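The example above gives every sequence the same length, but, as the comment in the snippet notes, lengths can vary in a real model. Below is a minimal sketch of that setup, assuming `x` and `y` are padded arrays with a time dimension of `max_words` (the names `max_words` and the random lengths are illustrative, not part of the API). The TensorFlow portion of the example stays the same, since `crf_log_likelihood` and `crf_decode` are already passed the per-example lengths.

```python
import numpy as np

num_examples, max_words = 10, 20

# Hypothetical per-example lengths; positions at or beyond an example's length
# are padding and should not count toward the loss or the accuracy.
sequence_lengths = np.random.randint(low=5, high=max_words + 1,
                                     size=num_examples).astype(np.int32)

# Same masking trick as in the example: True exactly at the valid positions.
mask = (np.arange(max_words)[None, :] < sequence_lengths[:, None])

# Number of real (non-padding) labels, used to normalize the accuracy.
total_labels = np.sum(sequence_lengths)
```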