
Introduction to Recurrent Networks in TensorFlow


Posted by oliyiyi on 2016-7-26 12:00:29


A straightforward, introductory overview of implementing Recurrent Neural Networks in TensorFlow.

By Danijar Hafner, Independent Machine Learning Researcher.

Recurrent networks like LSTM and GRU are powerful sequence models. I will explain how to create recurrent networks in TensorFlow and use them for sequence classification and sequence labelling tasks. If you are not familiar with recurrent networks, I suggest you take a look at Christopher Olah’s great post first. On the TensorFlow part, I also expect some basic knowledge. The official tutorials are a good place to start.

Defining the Network

To use recurrent networks in TensorFlow, we first need to define the network architecture, consisting of one or more layers, the cell type, and possibly dropout between the layers.

from tensorflow.models.rnn import rnn_cell

num_hidden = 200
num_layers = 3
dropout = tf.placeholder(tf.float32)

network = rnn_cell.GRUCell(num_hidden)  # Or LSTMCell(num_hidden)
network = rnn_cell.DropoutWrapper(network, output_keep_prob=dropout)
network = rnn_cell.MultiRNNCell([network] * num_layers)

Unrolling in Time


We can now unroll this network in time using the rnn operation. It takes a list with one input tensor per timestep and returns the hidden states and output activations for each timestep.

from tensorflow.models.rnn import rnn

max_length = 100
# Batch size times time steps times data width.
data = tf.placeholder(tf.float32, [None, max_length, 28])
outputs, states = rnn.rnn(network, unpack_sequence(data), dtype=tf.float32)
output = pack_sequence(outputs)
state = pack_sequence(states)

As its interface, TensorFlow uses Python lists with one tensor for each timestep. Thus we make use of tf.pack() and tf.unpack() to split our data tensors into lists of frames and to merge the results back into a single tensor.

def unpack_sequence(tensor):
    """Split the single tensor of a sequence into a list of frames."""
    return tf.unpack(tf.transpose(tensor, perm=[1, 0, 2]))

def pack_sequence(sequence):
    """Combine a list of the frames into a single tensor of the sequence."""
    return tf.transpose(tf.pack(sequence), perm=[1, 0, 2])
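To make the shapes concrete, unpacking the data placeholder defined above yields a list of max_length frames, and packing exactly reverses the operation:

frames = unpack_sequence(data)  # A list of 100 tensors, each of shape [None, 28].
merged = pack_sequence(frames)  # A single tensor of shape [None, 100, 28] again.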

As of version 0.8.0, TensorFlow provides rnn.dynamic_rnn as an alternative to rnn.rnn that does not actually unroll the compute graph but uses a loop operation inside the graph. The interface is the same, except that you no longer need unpack_sequence() and pack_sequence(), since it operates on single tensors directly. In the following sections, I will mention the modifications you need to make in order to use dynamic_rnn.
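With dynamic_rnn, the unrolling step above reduces to a single call that operates on the [batch, time, features] tensor directly (this mirrors the call shown later in the classification section):

output, state = rnn.dynamic_rnn(network, data, dtype=tf.float32)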

Sequence Classification


For classification, you might only care about the output activation at the last timestep, which is just outputs[-1]. The code below adds a softmax classifier on top of that and defines the cross-entropy error function. For now we assume sequences to be of equal length, but I will cover variable-length sequences in another post.

in_size = num_hidden
# For classification, target holds one label distribution per sequence,
# so its shape is batch size times number of classes.
out_size = int(target.get_shape()[1])
weight = tf.Variable(tf.truncated_normal([in_size, out_size], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[out_size]))
prediction = tf.nn.softmax(tf.matmul(outputs[-1], weight) + bias)
cross_entropy = -tf.reduce_sum(target * tf.log(prediction))
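The post does not show the training step itself. As a minimal sketch, any of TensorFlow's optimizers can minimize this error; the choice of RMSProp and the 0.003 learning rate here are illustrative assumptions, not something the article prescribes:

# Illustrative training op; optimizer and learning rate are arbitrary choices.
optimizer = tf.train.RMSPropOptimizer(0.003)
optimize = optimizer.minimize(cross_entropy)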

When using dynamic_rnn, this is how to get the last output of the recurrent network. We can't use outputs[-1] because, unlike Python lists, TensorFlow doesn't support negative indexing yet. Here is the complete gist for sequence classification.

output, _ = rnn.dynamic_rnn(network, data, dtype=tf.float32)
output = tf.transpose(output, [1, 0, 2])
last = tf.gather(output, int(output.get_shape()[0]) - 1)
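Putting this together with the classifier above, the gathered tensor simply takes the place of outputs[-1]:

# `last` replaces outputs[-1] from the unrolled version.
prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)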

Sequence Labelling


For sequence labelling, we want a prediction at every timestep. However, we share the weights of the softmax layer across all timesteps. This way, we have one softmax layer on top of an unrolled recurrent network, as desired.

in_size = num_hidden
out_size = int(target.get_shape()[2])
weight = tf.Variable(tf.truncated_normal([in_size, out_size], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[out_size]))
predictions = [tf.nn.softmax(tf.matmul(x, weight) + bias) for x in outputs]
prediction = pack_sequence(predictions)

If you want to use dynamic_rnn instead, you cannot apply the same weights and biases to all timesteps in a Python list comprehension. Instead, we must flatten the outputs of all timesteps, so that to the weight matrix the timesteps look just like additional examples in the training batch. Afterwards, we reshape back to the desired shape.

max_length = int(self.target.get_shape()[1])
num_classes = int(self.target.get_shape()[2])
weight, bias = self._weight_and_bias(self._num_hidden, num_classes)
# Flatten so that the same weights apply to the outputs of all timesteps.
output = tf.reshape(output, [-1, self._num_hidden])
prediction = tf.nn.softmax(tf.matmul(output, weight) + bias)
prediction = tf.reshape(prediction, [-1, max_length, num_classes])
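This snippet comes from a class and calls a helper self._weight_and_bias() that the excerpt does not show. A minimal sketch of it, assuming the same initialization scheme as the earlier examples in this post, could look like this:

@staticmethod
def _weight_and_bias(in_size, out_size):
    # Hypothetical helper; mirrors the initializations used above.
    weight = tf.truncated_normal([in_size, out_size], stddev=0.1)
    bias = tf.constant(0.1, shape=[out_size])
    return tf.Variable(weight), tf.Variable(bias)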

Since this is a classification task as well, we keep using cross entropy as our error function. Here we have a prediction and target for every timestep. We thus compute the cross entropy for every timestep first and then average. Here is the complete gist for sequence labelling.

# Sum over the class dimension to get the cross entropy of each timestep,
# then average over both batch and time.
cross_entropy = -tf.reduce_sum(
    target * tf.log(prediction), reduction_indices=[2])
cross_entropy = tf.reduce_mean(cross_entropy)
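Finally, remember that the dropout placeholder defined at the beginning must be fed at run time. The following sketch of a training step and a test-time prediction is an illustration only: it assumes target is a placeholder, an optimize op built from this cross entropy as sketched earlier, and arrays train_data, train_target, and test_data matching the placeholder shapes.

sess = tf.Session()
sess.run(tf.initialize_all_variables())
# Keep only half of the activations during training.
sess.run(optimize, {data: train_data, target: train_target, dropout: 0.5})
# Disable dropout at test time by keeping all activations.
test_prediction = sess.run(prediction, {data: test_data, dropout: 1.0})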

That’s all. We learned how to construct recurrent networks in TensorFlow and use them for sequence learning tasks. Please ask any questions below if you couldn’t follow.

Bio: Danijar Hafner is a Python and C++ developer from Berlin interested in machine intelligence research. He recently released a neural networks library, but he likes creating new things in general.


