Original poster: Reader's


[GitHub] R Deep Learning Cookbook


Reader's posted on 2017-8-13 02:10:48


R Deep Learning Cookbook

This is the code repository for R Deep Learning Cookbook, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.

About the Book

Deep learning is a branch of machine learning that has attracted a great deal of attention. Its favorable results in applications with huge and complex data are remarkable. At the same time, the R programming language is very popular among data miners and statisticians.

Instructions and Navigation

All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.

The code will look like the following:

[default]
exten => s,1,Dial(Zap/1|30)
exten => s,2,Voicemail(u100)
exten => s,102,Voicemail(b100)
exten => i,1,Voicemail(s0)

A lot of inquisitiveness, perseverance, and passion is required to build a strong background in data science. The scope of deep learning is quite broad; thus, the following background is required to use this cookbook effectively:

Basics of machine learning and data analysis
Proficiency in R programming
Basics of Python and Docker

Lastly, you need to appreciate deep learning algorithms and know how they solve complex problems in multiple domains.

Related Products

Suggestions and Feedback

Click here if you have any feedback or suggestions.

https://github.com/PacktPublishing/R-Deep-Learning-Cookbook


Reply 1

Reader's posted on 2017-8-13 02:19:39

########### COLLABORATIVE FILTERING WITH RBM
# Replace the placeholder string with the directory that holds movies.dat and ratings.dat
setwd("Set the working directory with movies.dat and ratings.dat files")
## Read MovieLens data
txt <- readLines("movies.dat", encoding = "latin1")
txt_split <- lapply(strsplit(txt, "::"), function(x) as.data.frame(t(x), stringsAsFactors=FALSE))
movies_df <- do.call(rbind, txt_split)
names(movies_df) <- c("MovieID", "Title", "Genres")
movies_df$MovieID <- as.numeric(movies_df$MovieID)
movies_df$id_order <- 1:nrow(movies_df)
ratings_df <- read.table("ratings.dat", sep=":",header=FALSE,stringsAsFactors = F)
ratings_df <- ratings_df[,c(1,3,5,7)]
colnames(ratings_df) <- c("UserID","MovieID","Rating","Timestamp")
# Merge user ratings and movies
merged_df <- merge(movies_df, ratings_df, by="MovieID",all=FALSE)
# Remove unnecessary columns
merged_df[,c("Timestamp","Title","Genres")] <- NULL
# create % rating
merged_df$rating_per <- merged_df$Rating/5
# Generate a matrix of ratings
num_of_users <- 1000
num_of_movies <- length(unique(movies_df$MovieID))
trX <- matrix(0,nrow=num_of_users,ncol=num_of_movies)
for(i in 1:num_of_users){
merged_df_user <- merged_df[merged_df$UserID %in% i,]
trX[i,merged_df_user$id_order] <- merged_df_user$rating_per
}
# Import tensorflow libraries
# Sys.setenv(TENSORFLOW_PYTHON="C:/PROGRA~1/Python35/python.exe")
# Sys.setenv(TENSORFLOW_PYTHON_VERSION = 3)
library(tensorflow)
library(ggplot2)  # needed for the genre bar charts further down
np <- import("numpy")
# Create TensorFlow session
# Reset the graph
tf$reset_default_graph()
# Starting session as interactive session
sess <- tf$InteractiveSession()
# Model Parameters
num_hidden = 20
num_input = nrow(movies_df)
vb <- tf$placeholder(tf$float32, shape = shape(num_input)) #Number of unique movies
hb <- tf$placeholder(tf$float32, shape = shape(num_hidden)) #Number of features we're going to learn
W <- tf$placeholder(tf$float32, shape = shape(num_input, num_hidden))
#Phase 1: Input Processing
v0 = tf$placeholder(tf$float32,shape= shape(NULL, num_input))
prob_h0= tf$nn$sigmoid(tf$matmul(v0, W) + hb)
h0 = tf$nn$relu(tf$sign(prob_h0 - tf$random_uniform(tf$shape(prob_h0))))
#Phase 2: Reconstruction
prob_v1 = tf$nn$sigmoid(tf$matmul(h0, tf$transpose(W)) + vb)
v1 = tf$nn$relu(tf$sign(prob_v1 - tf$random_uniform(tf$shape(prob_v1))))
h1 = tf$nn$sigmoid(tf$matmul(v1, W) + hb)
# RBM Parameters and functions
#Learning rate
alpha = 1.0
#Create the gradients
w_pos_grad = tf$matmul(tf$transpose(v0), h0)
w_neg_grad = tf$matmul(tf$transpose(v1), h1)
#Calculate the Contrastive Divergence to maximize
CD = (w_pos_grad - w_neg_grad) / tf$to_float(tf$shape(v0)[1])
#Create methods to update the weights and biases
update_w = W + alpha * CD
# axis = 0L averages over the batch, so each unit gets its own bias update
update_vb = vb + alpha * tf$reduce_mean(v0 - v1, axis = 0L)
update_hb = hb + alpha * tf$reduce_mean(h0 - h1, axis = 0L)
# Mean squared reconstruction error
err = v0 - v1
err_sum = tf$reduce_mean(err * err)
# Initialise variables (current and previous)
cur_w = tf$Variable(tf$zeros(shape = shape(num_input, num_hidden), dtype=tf$float32))
cur_vb = tf$Variable(tf$zeros(shape = shape(num_input), dtype=tf$float32))
cur_hb = tf$Variable(tf$zeros(shape = shape(num_hidden), dtype=tf$float32))
prv_w = tf$Variable(tf$random_normal(shape=shape(num_input, num_hidden), stddev=0.01, dtype=tf$float32))
prv_vb = tf$Variable(tf$zeros(shape = shape(num_input), dtype=tf$float32))
prv_hb = tf$Variable(tf$zeros(shape = shape(num_hidden), dtype=tf$float32))
# Start tensorflow session
sess$run(tf$global_variables_initializer())
output <- sess$run(list(update_w, update_vb, update_hb), feed_dict = dict(v0=trX,
W = prv_w$eval(),
vb = prv_vb$eval(),
hb = prv_hb$eval()))
prv_w <- output[[1]]
prv_vb <- output[[2]]
prv_hb <- output[[3]]
sess$run(err_sum, feed_dict=dict(v0=trX, W= prv_w, vb= prv_vb, hb= prv_hb))
# Train RBM
epochs= 500
errors <- list()
weights <- list()
for(ep in 1:epochs){
  for(i in seq(0,(dim(trX)[1]-100),100)){
    batchX <- trX[(i+1):(i+100),]
    output <- sess$run(list(update_w, update_vb, update_hb),
                       feed_dict = dict(v0=batchX, W = prv_w, vb = prv_vb, hb = prv_hb))
    prv_w <- output[[1]]
    prv_vb <- output[[2]]
    prv_hb <- output[[3]]
    # With 1000 users, i %% 1000 == 0 holds only for the first batch of each epoch
    if(i%%1000 == 0){
      errors <- c(errors,sess$run(err_sum, feed_dict=dict(v0=batchX, W= prv_w, vb= prv_vb, hb= prv_hb)))
      # Store the weight matrix as one list element; c() would flatten it into scalars
      weights[[length(weights) + 1]] <- output[[1]]
      cat(i , " : ")
    }
  }
  cat("epoch :", ep, " : reconstruction error : ", errors[length(errors)][[1]],"\n")
}
# Plot reconstruction error
error_vec <- unlist(errors)
plot(error_vec,xlab="# of batches",ylab="mean squared reconstruction error",main="RBM-Reconstruction MSE plot")
# Recommendation
# Selecting the input user (kept as a 1 x num_input matrix so it can be fed to v0 later)
inputUser = as.matrix(t(trX[75,]))
# Ratings of the movies this user has already watched, named by movie order
watched <- trX[75,]
names(watched) <- movies_df$id_order
watched <- watched[watched>0]
# Plot the top genre movies
top_rated_movies <- movies_df[as.numeric(names(watched)[order(watched,decreasing = TRUE)]),]$Title
top_rated_genres <- movies_df[as.numeric(names(watched)[order(watched,decreasing = TRUE)]),]$Genres
top_rated_genres <- as.data.frame(top_rated_genres,stringsAsFactors=F)
top_rated_genres$count <- 1
top_rated_genres <- aggregate(count~top_rated_genres,FUN=sum,data=top_rated_genres)
top_rated_genres <- top_rated_genres[with(top_rated_genres, order(-count)), ]
top_rated_genres$top_rated_genres <- factor(top_rated_genres$top_rated_genres, levels = top_rated_genres$top_rated_genres)
ggplot(top_rated_genres[top_rated_genres$count>1,],aes(x=top_rated_genres,y=count))+
geom_bar(stat="identity")+
theme_bw()+
theme(axis.text.x = element_text(angle = 90, hjust = 1))+
labs(x="Genres",y="count",title="Top Rated Genres")+
theme(plot.title = element_text(hjust = 0.5))
#Feeding in the user and reconstructing the input
hh0 = tf$nn$sigmoid(tf$matmul(v0, W) + hb)
vv1 = tf$nn$sigmoid(tf$matmul(hh0, tf$transpose(W)) + vb)
feed = sess$run(hh0, feed_dict=dict( v0= inputUser, W= prv_w, hb= prv_hb))
rec = sess$run(vv1, feed_dict=dict( hh0= feed, W= prv_w, vb= prv_vb))
names(rec) <- movies_df$id_order
# Select all recommended movies
top_recom_movies <- movies_df[as.numeric(names(rec)[order(rec,decreasing = TRUE)]),]$Title[1:10]
top_recom_genres <- movies_df[as.numeric(names(rec)[order(rec,decreasing = TRUE)]),]$Genres
top_recom_genres <- as.data.frame(top_recom_genres,stringsAsFactors=F)
top_recom_genres$count <- 1
top_recom_genres <- aggregate(count~top_recom_genres,FUN=sum,data=top_recom_genres)
top_recom_genres <- top_recom_genres[with(top_recom_genres, order(-count)), ]
top_recom_genres$top_recom_genres <- factor(top_recom_genres$top_recom_genres, levels = top_recom_genres$top_recom_genres)
ggplot(top_recom_genres[top_recom_genres$count>20,],aes(x=top_recom_genres,y=count))+
geom_bar(stat="identity")+
theme_bw()+
theme(axis.text.x = element_text(angle = 90, hjust = 1))+
labs(x="Genres",y="count",title="Top Recommended Genres")+
theme(plot.title = element_text(hjust = 0.5))
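
For readers who want to see what one pass of the update graph above computes without starting a TensorFlow session, here is a minimal base R sketch of a single CD-1 step. The helper names (`sigmoid`, `sample_bernoulli`, `cd1_step`) and the toy 6 x 5 rating matrix are made up for illustration and are not part of the book's repository. The bias updates are averaged per unit with `colMeans`, which is an assumption of this sketch.

```r
# One CD-1 update for a tiny RBM in base R (no TensorFlow needed).
sigmoid <- function(z) 1 / (1 + exp(-z))

# Draw 0/1 samples from a matrix of Bernoulli probabilities.
# Equivalent in effect to relu(sign(p - uniform)) in the script above.
sample_bernoulli <- function(p) {
  matrix(as.numeric(runif(length(p)) < p), nrow = nrow(p), ncol = ncol(p))
}

cd1_step <- function(v0, W, vb, hb, alpha = 1.0) {
  n <- nrow(v0)
  # Phase 1: hidden probabilities and a binary sample
  prob_h0 <- sigmoid(sweep(v0 %*% W, 2, hb, "+"))
  h0 <- sample_bernoulli(prob_h0)
  # Phase 2: reconstruct the visible layer, then recompute hidden probabilities
  prob_v1 <- sigmoid(sweep(h0 %*% t(W), 2, vb, "+"))
  v1 <- sample_bernoulli(prob_v1)
  h1 <- sigmoid(sweep(v1 %*% W, 2, hb, "+"))
  # Contrastive divergence: positive minus negative associations, per sample
  CD <- (t(v0) %*% h0 - t(v1) %*% h1) / n
  list(W   = W + alpha * CD,
       vb  = vb + alpha * colMeans(v0 - v1),  # assumption: per-unit average
       hb  = hb + alpha * colMeans(h0 - h1),
       mse = mean((v0 - v1)^2))
}

set.seed(1)
toy_v0 <- matrix(runif(6 * 5), nrow = 6)      # 6 hypothetical users, 5 movies
toy_W  <- matrix(rnorm(5 * 3, sd = 0.01), 5)  # 3 hidden units
step <- cd1_step(toy_v0, toy_W, vb = rep(0, 5), hb = rep(0, 3))
str(step)
```

Calling `cd1_step` repeatedly on mini-batches and feeding each result back in as the next `W`, `vb`, and `hb` mirrors what the training loop does with `prv_w`, `prv_vb`, and `prv_hb`.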


Reply 2

Reader's posted on 2017-8-13 02:20:41

###### DEEP BELIEF NETWORKS
# Import tensorflow libraries
# Sys.setenv(TENSORFLOW_PYTHON="C:/PROGRA~1/Python35/python.exe")
# Sys.setenv(TENSORFLOW_PYTHON_VERSION = 3)
library(tensorflow)
np <- import("numpy")
# Create TensorFlow session
# Reset the graph
tf$reset_default_graph()
# Starting session as interactive session
sess <- tf$InteractiveSession()
# Input data (MNIST)
mnist <- tf$examples$tutorials$mnist$input_data$read_data_sets("MNIST-data/",one_hot=TRUE)
trainX <- mnist$train$images
trainY <- mnist$train$labels
testX <- mnist$test$images
testY <- mnist$test$labels
# Creating DBN
RBM_hidden_sizes = c(900, 500 , 300 )
# Function to initialize RBM
RBM <- function(input_data,
num_input,
num_output,
epochs = 5,
alpha = 0.1,
batchsize=100){
# Placeholder variables
vb <- tf$placeholder(tf$float32, shape = shape(num_input))
hb <- tf$placeholder(tf$float32, shape = shape(num_output))
W <- tf$placeholder(tf$float32, shape = shape(num_input, num_output))
# Phase 1 : Forward Pass
X = tf$placeholder(tf$float32, shape=shape(NULL, num_input))
prob_h0= tf$nn$sigmoid(tf$matmul(X, W) + hb) #probabilities of the hidden units
h0 = tf$nn$relu(tf$sign(prob_h0 - tf$random_uniform(tf$shape(prob_h0)))) #sample_h_given_X
# Phase 2 : Backward Pass
prob_v1 = tf$nn$sigmoid(tf$matmul(h0, tf$transpose(W)) + vb)
v1 = tf$nn$relu(tf$sign(prob_v1 - tf$random_uniform(tf$shape(prob_v1))))
h1 = tf$nn$sigmoid(tf$matmul(v1, W) + hb)
# Calculate gradients
w_pos_grad = tf$matmul(tf$transpose(X), h0)
w_neg_grad = tf$matmul(tf$transpose(v1), h1)
CD = (w_pos_grad - w_neg_grad) / tf$to_float(tf$shape(X)[0])
update_w = W + alpha * CD
# axis = 0L averages over the batch, so each unit gets its own bias update
update_vb = vb + alpha * tf$reduce_mean(X - v1, axis = 0L)
update_hb = hb + alpha * tf$reduce_mean(h0 - h1, axis = 0L)
# Objective function
err = tf$reduce_mean(tf$square(X - v1))
# Initialise variables
cur_w = tf$Variable(tf$zeros(shape = shape(num_input, num_output), dtype=tf$float32))
cur_vb = tf$Variable(tf$zeros(shape = shape(num_input), dtype=tf$float32))
cur_hb = tf$Variable(tf$zeros(shape = shape(num_output), dtype=tf$float32))
prv_w = tf$Variable(tf$random_normal(shape=shape(num_input, num_output), stddev=0.01, dtype=tf$float32))
prv_vb = tf$Variable(tf$zeros(shape = shape(num_input), dtype=tf$float32))
prv_hb = tf$Variable(tf$zeros(shape = shape(num_output), dtype=tf$float32))
# Start tensorflow session
sess$run(tf$global_variables_initializer())
output <- sess$run(list(update_w, update_vb, update_hb), feed_dict = dict(X=input_data,
W = prv_w$eval(),
vb = prv_vb$eval(),
hb = prv_hb$eval()))
prv_w <- output[[1]]
prv_vb <- output[[2]]
prv_hb <- output[[3]]
sess$run(err, feed_dict=dict(X= input_data, W= prv_w, vb= prv_vb, hb= prv_hb))
errors <- list()
weights <- list()
u=1
for(ep in 1:epochs){
for(i in seq(0,(dim(input_data)[1]-batchsize),batchsize)){
batchX <- input_data[(i+1):(i+batchsize),]
output <- sess$run(list(update_w, update_vb, update_hb), feed_dict = dict(X=batchX,
W = prv_w,
vb = prv_vb,
hb = prv_hb))
prv_w <- output[[1]]
prv_vb <- output[[2]]
prv_hb <- output[[3]]
if(i%%10000 == 0){
errors[[u]] <- sess$run(err, feed_dict=dict(X= batchX, W= prv_w, vb= prv_vb, hb= prv_hb))
weights[[u]] <- output[[1]]
u=u+1
cat(i , " : ")
}
}
cat("epoch :", ep, " : reconstruction error : ", errors[length(errors)][[1]],"\n")
}
w <- prv_w
vb <- prv_vb
hb <- prv_hb
# Get the output
input_X = tf$constant(input_data)
ph_w = tf$constant(w)
ph_hb = tf$constant(hb)
out = tf$nn$sigmoid(tf$matmul(input_X, ph_w) + ph_hb)
sess$run(tf$global_variables_initializer())
return(list(output_data = sess$run(out),
error_list=errors,
weight_list=weights,
weight_final=w,
bias_final=hb))
}
#Since we are training, set input as training data
inpX = trainX
#Size of inputs is the number of inputs in the training set
num_input = ncol(inpX)
#Train RBM
RBM_output <- list()
for(i in 1:length(RBM_hidden_sizes)){
size <- RBM_hidden_sizes[i]
# Train the RBM
RBM_output[[i]] <- RBM(input_data=inpX,
num_input=num_input,
num_output=size,
epochs = 5,
alpha = 0.1,
batchsize=100)
# Update the input data
inpX <- RBM_output[[i]]$output_data
# Update the input_size
num_input = size
cat("completed size :", size,"\n")
}
# Plot reconstruction error
error_df <- data.frame("error"=c(unlist(RBM_output[[1]]$error_list),unlist(RBM_output[[2]]$error_list),unlist(RBM_output[[3]]$error_list)),
"batches"=c(rep(seq(1:length(unlist(RBM_output[[1]]$error_list))),times=3)),
"hidden_layer"=c(rep(c(1,2,3),each=length(unlist(RBM_output[[1]]$error_list)))),
stringsAsFactors = FALSE)
plot(error ~ batches,
xlab = "# of batches",
ylab = "Reconstruction Error",
pch = c(1, 7, 16)[hidden_layer],
main = "Stacked RBM-Reconstruction MSE plot",
data = error_df)
legend('topright',
c("H1_900","H2_500","H3_300"),
pch = c(1, 7, 16))
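
The stacking loop above passes each RBM's hidden activations to the next RBM as its input. The base R sketch below traces only that data flow, to show how the matrix shape changes from layer to layer. The random weights stand in for trained ones, and the four fake "images" and the `shapes` list are hypothetical names introduced for this illustration.

```r
# Trace how data shapes change through a stack of layers (784 -> 900 -> 500 -> 300).
sigmoid <- function(z) 1 / (1 + exp(-z))

layer_sizes <- c(900, 500, 300)         # same sizes as RBM_hidden_sizes above
set.seed(42)
x <- matrix(runif(4 * 784), nrow = 4)   # 4 fake inputs with 784 pixels each

shapes <- list()
for (size in layer_sizes) {
  # Placeholder weights and biases; a trained RBM would supply these instead
  W  <- matrix(rnorm(ncol(x) * size, sd = 0.01), nrow = ncol(x), ncol = size)
  hb <- rep(0, size)
  # The hidden activations of this layer become the input of the next one
  x <- sigmoid(sweep(x %*% W, 2, hb, "+"))
  shapes[[length(shapes) + 1]] <- dim(x)
}
print(shapes)
```

Each layer only ever sees the output of the layer before it, which is why the script can train the three RBMs one after another instead of jointly.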


Reply 3

Reader's posted on 2017-8-13 02:23:39

Sys.setenv(TENSORFLOW_PYTHON="C:/PROGRA~3/ANACON~1/python.exe")
Sys.setenv(TENSORFLOW_PYTHON_VERSION = 3)
library(tensorflow)
require(imager)
require(caret)
# Load mnist dataset from tensorflow library
datasets <- tf$contrib$learn$datasets
mnist <- datasets$mnist$read_data_sets("MNIST-data", one_hot = TRUE)
# Function to plot MNIST dataset
plot_mnist<-function(imageD, pixel.y=16){
require(imager)
actImage<-matrix(imageD, ncol=pixel.y, byrow=FALSE)
img.col.mat <- imappend(list(as.cimg(actImage)), "c")
plot(img.col.mat, axes=F)
}
# Reduce Image Size
reduceImage<-function(actds, n.pixel.x=16, n.pixel.y=16){
actImage<-matrix(actds, ncol=28, byrow=FALSE)
img.col.mat <- imappend(list(as.cimg(actImage)),"c")
thmb <- resize(img.col.mat, n.pixel.x, n.pixel.y)
outputImage<-matrix(thmb[,,1,1], nrow = 1, byrow = F)
return(outputImage)
}
# Convert train and test images to 16 x 16 pixels
trainData<-t(apply(mnist$train$images, 1, FUN=reduceImage))
validData<-t(apply(mnist$test$images, 1, FUN=reduceImage))
labels <- mnist$train$labels
labels_valid <- mnist$test$labels
rm(mnist)
# Reset the graph and set up an interactive session
tf$reset_default_graph()
sess<-tf$InteractiveSession()
# Define Model parameter
n_input<-16
step_size<-16
n.hidden<-64
n.class<-10
# Define training parameter
lr<-0.01
batch<-500
iteration = 100
# Set up a most basic RNN
rnn<-function(x, weights, bias){
  # Unstack input into step_size tensors along axis 1
  x = tf$unstack(x, step_size, 1)
  # Define a most basic RNN
  rnn_cell = tf$contrib$rnn$BasicRNNCell(n.hidden)
  # create a recurrent neural network
  cell_output = tf$contrib$rnn$static_rnn(rnn_cell, x, dtype=tf$float32)
  # Linear activation on the output of the last time step
  last_vec=tail(cell_output[[1]], n=1)[[1]]
  return(tf$matmul(last_vec, weights) + bias)
}
# Function to evaluate mean accuracy
eval_acc<-function(yhat, y){
# Count correct solution
correct_Count = tf$equal(tf$argmax(yhat,1L), tf$argmax(y,1L))
# Mean accuracy
mean_accuracy = tf$reduce_mean(tf$cast(correct_Count, tf$float32))
return(mean_accuracy)
}
with(tf$name_scope('input'), {
# Define placeholder for input data
x = tf$placeholder(tf$float32, shape=shape(NULL, step_size, n_input), name='x')
y <- tf$placeholder(tf$float32, shape(NULL, n.class), name='y')
# Define Weights and bias
weights <- tf$Variable(tf$random_normal(shape(n.hidden, n.class)))
bias <- tf$Variable(tf$random_normal(shape(n.class)))
})
# Evaluate rnn cell output
yhat = rnn(x, weights, bias)
# Define loss and optimizer
cost = tf$reduce_mean(tf$nn$softmax_cross_entropy_with_logits(logits=yhat, labels=y))
optimizer = tf$train$AdamOptimizer(learning_rate=lr)$minimize(cost)
# Run optimization
sess$run(tf$global_variables_initializer())
# Training loop: sample a random mini-batch at each iteration
for(i in 1:iteration){
  spls <- sample(1:dim(trainData)[1],batch)
  sample_data<-trainData[spls,]
  sample_y<-labels[spls,]
  # Reshape each sample into a sequence of 16 steps with 16 elements each
  sample_data=tf$reshape(sample_data, shape(batch, step_size, n_input))
  out<-optimizer$run(feed_dict = dict(x=sample_data$eval(), y=sample_y))
  if (i %% 1 == 0){
    cat("iteration - ", i, "Training Loss - ", cost$eval(feed_dict = dict(x=sample_data$eval(), y=sample_y)), "\n")
  }
}
# Calculate accuracy on the MNIST test images
accuracy<-eval_acc(yhat, y)
valid_data=tf$reshape(validData, shape(-1, step_size, n_input))
valid_x <- valid_data$eval()
accuracy$eval(feed_dict = dict(x = valid_x, y = labels_valid))
# Predicted class index (0-9) for each test image; kept separate so the yhat tensor is not overwritten
pred<-sess$run(tf$argmax(yhat, 1L), feed_dict = dict(x = valid_x))
image(t(matrix(validData[20,], ncol = 16, nrow = 16, byrow = T)), col = gray((0:32)/32))
image(t(matrix(trainData[20,], ncol = 16, nrow = 16, byrow = T)), col = gray((0:32)/32))
cost$eval(feed_dict=dict(x=valid_x, y=labels_valid))
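
The script reshapes each 256-pixel row into 16 time steps of 16 values with `tf$reshape`, which fills in row-major order, whereas base R's `array()` fills column-major. The sketch below shows one way to get the same layout in base R. The function name `to_sequences` and the toy matrix are hypothetical and only meant to make the index arithmetic visible.

```r
# Row-major reshape of a (batch x 256) matrix into a (batch x 16 x 16) array in base R.
to_sequences <- function(m, step_size = 16, n_input = 16) {
  stopifnot(ncol(m) == step_size * n_input)
  # t(m) lays each sample out contiguously; array() then fills the fastest index
  # (n_input) first, and aperm() reorders to (batch, step, input).
  aperm(array(t(m), dim = c(n_input, step_size, nrow(m))), c(3, 2, 1))
}

# Toy batch of 2 "images" whose pixel values are simply their positions
m <- matrix(1:(2 * 256), nrow = 2, byrow = TRUE)
out <- to_sequences(m)

dim(out)       # batch, steps, inputs
out[1, 1, ]    # first time step of the first sample
out[1, 2, 1]   # first value of the second time step
```

With this layout, `out[b, s, ]` holds pixels `(s - 1) * 16 + 1` through `s * 16` of sample `b`, so each time step corresponds to one 16-value slice of the flattened image.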