
Introduction to Recurrent Networks in TensorFlow


Posted by oliyiyi on 2016-7-26 12:00:29


A straightforward, introductory overview of implementing Recurrent Neural Networks in TensorFlow.

By Danijar Hafner, Independent Machine Learning Researcher.

Recurrent networks like LSTM and GRU are powerful sequence models. I will explain how to create recurrent networks in TensorFlow and use them for sequence classification and sequence labelling tasks. If you are not familiar with recurrent networks, I suggest you take a look at Christopher Olah’s great post first. On the TensorFlow part, I also expect some basic knowledge. The official tutorials are a good place to start.

Defining the Network

To use recurrent networks in TensorFlow, we first need to define the network architecture, consisting of one or more layers, the cell type, and possibly dropout between the layers.

from tensorflow.models.rnn import rnn_cell

num_hidden = 200
num_layers = 3
dropout = tf.placeholder(tf.float32)

network = rnn_cell.GRUCell(num_hidden)  # Or LSTMCell(num_hidden)
network = rnn_cell.DropoutWrapper(network, output_keep_prob=dropout)
network = rnn_cell.MultiRNNCell([network] * num_layers)

Unrolling in Time


We can now unroll this network in time using the rnn operation. It takes a list with one input tensor per timestep and returns the hidden states and output activations for each timestep.

from tensorflow.models.rnn import rnn

max_length = 100
# Batch size times time steps times data width.
data = tf.placeholder(tf.float32, [None, max_length, 28])
outputs, states = rnn.rnn(network, unpack_sequence(data), dtype=tf.float32)
output = pack_sequence(outputs)
state = pack_sequence(states)

As its interface, TensorFlow uses Python lists with one tensor for each timestep. Thus we make use of tf.pack() and tf.unpack() to split our data tensors into lists of frames and to merge the results back into a single tensor.

def unpack_sequence(tensor):
    """Split the single tensor of a sequence into a list of frames."""
    return tf.unpack(tf.transpose(tensor, perm=[1, 0, 2]))

def pack_sequence(sequence):
    """Combine a list of the frames into a single tensor of the sequence."""
    return tf.transpose(tf.pack(sequence), perm=[1, 0, 2])
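To make the shapes concrete, unpacking the data placeholder defined above yields a list of max_length frames, and packing exactly reverses the operation:

frames = unpack_sequence(data)  # A list of 100 tensors, each of shape [None, 28].
merged = pack_sequence(frames)  # A single tensor of shape [None, 100, 28] again.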

As of version 0.8.0, TensorFlow provides rnn.dynamic_rnn as an alternative to rnn.rnn that does not actually unroll the compute graph but uses a loop operation inside the graph. The interface is the same, except that you no longer need unpack_sequence() and pack_sequence(), since it operates on single tensors directly. In the following sections, I will mention the modifications you need to make in order to use dynamic_rnn.
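With dynamic_rnn, the unrolling step above reduces to a single call that operates on the [batch, time, features] tensor directly (this mirrors the call shown later in the classification section):

output, state = rnn.dynamic_rnn(network, data, dtype=tf.float32)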

Sequence Classification


For classification, you might only care about the output activation at the last timestep, which is just outputs[-1]. The code below adds a softmax classifier on top of that and defines the cross-entropy error function. For now we assume sequences to be of equal length, but I will cover variable-length sequences in another post.

in_size = num_hidden
# For classification, target holds one label distribution per sequence,
# so its shape is batch size times number of classes.
out_size = int(target.get_shape()[1])
weight = tf.Variable(tf.truncated_normal([in_size, out_size], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[out_size]))
prediction = tf.nn.softmax(tf.matmul(outputs[-1], weight) + bias)
cross_entropy = -tf.reduce_sum(target * tf.log(prediction))
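The post does not show the training step itself. As a minimal sketch, any of TensorFlow's optimizers can minimize this error; the choice of RMSProp and the 0.003 learning rate here are illustrative assumptions, not something the article prescribes:

# Illustrative training op; optimizer and learning rate are arbitrary choices.
optimizer = tf.train.RMSPropOptimizer(0.003)
optimize = optimizer.minimize(cross_entropy)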

When using dynamic_rnn, this is how to get the last output of the recurrent network. We can't use outputs[-1] because, unlike Python lists, TensorFlow doesn't support negative indexing yet. Here is the complete gist for sequence classification.

output, _ = rnn.dynamic_rnn(network, data, dtype=tf.float32)
output = tf.transpose(output, [1, 0, 2])
last = tf.gather(output, int(output.get_shape()[0]) - 1)
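Putting this together with the classifier above, the gathered tensor simply takes the place of outputs[-1]:

# `last` replaces outputs[-1] from the unrolled version.
prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)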

Sequence Labelling


For sequence labelling, we want a prediction at every timestep. However, we share the weights of the softmax layer across all timesteps. This way, we have one softmax layer on top of an unrolled recurrent network, as desired.

in_size = num_hidden
out_size = int(target.get_shape()[2])
weight = tf.Variable(tf.truncated_normal([in_size, out_size], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[out_size]))
predictions = [tf.nn.softmax(tf.matmul(x, weight) + bias) for x in outputs]
prediction = pack_sequence(predictions)

If you want to use dynamic_rnn instead, you cannot apply the same weights and biases to all timesteps in a Python list comprehension. Instead, we must flatten the outputs of all timesteps, so that to the weight matrix the timesteps look just like additional examples in the training batch. Afterwards, we reshape back to the desired shape.

max_length = int(self.target.get_shape()[1])
num_classes = int(self.target.get_shape()[2])
weight, bias = self._weight_and_bias(self._num_hidden, num_classes)
# Flatten so that the same weights apply to the outputs of all timesteps.
output = tf.reshape(output, [-1, self._num_hidden])
prediction = tf.nn.softmax(tf.matmul(output, weight) + bias)
prediction = tf.reshape(prediction, [-1, max_length, num_classes])
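This snippet comes from a class and calls a helper self._weight_and_bias() that the excerpt does not show. A minimal sketch of it, assuming the same initialization scheme as the earlier examples in this post, could look like this:

@staticmethod
def _weight_and_bias(in_size, out_size):
    # Hypothetical helper; mirrors the initializations used above.
    weight = tf.truncated_normal([in_size, out_size], stddev=0.1)
    bias = tf.constant(0.1, shape=[out_size])
    return tf.Variable(weight), tf.Variable(bias)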

Since this is a classification task as well, we keep using cross entropy as our error function. Here we have a prediction and target for every timestep. We thus compute the cross entropy for every timestep first and then average. Here is the complete gist for sequence labelling.

# Sum over the class dimension to get the cross entropy of each timestep,
# then average over both batch and time.
cross_entropy = -tf.reduce_sum(
    target * tf.log(prediction), reduction_indices=[2])
cross_entropy = tf.reduce_mean(cross_entropy)
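Finally, remember that the dropout placeholder defined at the beginning must be fed at run time. The following sketch of a training step and a test-time prediction is an illustration only: it assumes target is a placeholder, an optimize op built from this cross entropy as sketched earlier, and arrays train_data, train_target, and test_data matching the placeholder shapes.

sess = tf.Session()
sess.run(tf.initialize_all_variables())
# Keep only half of the activations during training.
sess.run(optimize, {data: train_data, target: train_target, dropout: 0.5})
# Disable dropout at test time by keeping all activations.
test_prediction = sess.run(prediction, {data: test_data, dropout: 1.0})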

That’s all. We learned how to construct recurrent networks in TensorFlow and use them for sequence learning tasks. Please ask any questions below if you couldn’t follow.

Bio: Danijar Hafner is a Python and C++ developer from Berlin interested in machine intelligence research. He recently released a neural networks library, but he likes creating new things in general.


