Linear Regression
#Import Library
#Import other necessary libraries like pandas, numpy...
from sklearn import linear_model
#Load Train and Test datasets
#Identify feature and response variable(s);
#values must be numeric numpy arrays
x_train = input_variables_values_training_datasets
y_train = target_variables_values_training_datasets
x_test = input_variables_values_test_datasets
#Create linear regression object
linear = linear_model.LinearRegression()
#Train the model using the training sets and check score
linear.fit(x_train, y_train)
linear.score(x_train, y_train)
#Equation coefficient and intercept
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)
#Predict Output
predicted = linear.predict(x_test)
#Load Train and Test datasets
#Identify feature and response variable(s) and
#values must be numeric and numpy arrays
x_train <- input_variables_values_training_datasets
y_train <- target_variables_values_training_datasets
x_test <- input_variables_values_test_datasets
x <- cbind(x_train,y_train)
#Train the model using the training sets and check score
linear <- lm(y_train ~ ., data = x)
summary(linear)
#Predict Output
predicted <- predict(linear, x_test)
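A minimal runnable version of the Python snippet in this section, with synthetic data standing in for the placeholder variables (the data-generating lines are illustrative assumptions, not part of the original):

import numpy as np
from sklearn import linear_model

#Synthetic data: y = 3x + 4 plus noise (illustrative only)
rng = np.random.RandomState(0)
x_train = rng.rand(100, 1)
y_train = 3 * x_train.ravel() + 4 + 0.1 * rng.randn(100)
x_test = rng.rand(10, 1)

linear = linear_model.LinearRegression()
linear.fit(x_train, y_train)
print('R^2 on training data:', linear.score(x_train, y_train))
print('Coefficient:', linear.coef_)     #should be close to 3
print('Intercept:', linear.intercept_)  #should be close to 4
predicted = linear.predict(x_test)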
Logistic Regression
#Import Library
from sklearn.linear_model import LogisticRegression
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create logistic regression object
model = LogisticRegression()
#Train the model using the training sets
#and check score
model.fit(X, y)
model.score(X, y)
#Equation coefficient and intercept
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)
#Predict Output
predicted = model.predict(x_test)
x <- cbind(x_train, y_train)
#Train the model using the training sets and check score
logistic <- glm(y_train ~ ., data = x, family = 'binomial')
summary(logistic)
#Predict Output
predicted <- predict(logistic, x_test)
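A self-contained sketch of the same Python flow on synthetic binary-labelled data (the X/y construction is an assumption for illustration):

import numpy as np
from sklearn.linear_model import LogisticRegression

#Synthetic binary classification data (illustrative only)
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
x_test = rng.randn(5, 2)

model = LogisticRegression()
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)
predicted = model.predict(x_test)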
Decision Tree
#Import Library
#Import other necessary libraries like pandas, numpy...
from sklearn import tree
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create tree object
model = tree.DecisionTreeClassifier(criterion='gini')
#for classification; the criterion can be set to
#gini or entropy (information gain), by default it is gini
#model = tree.DecisionTreeRegressor() for regression
#Train the model using the training sets and check score
model.fit(X, y)
model.score(X, y)
#Predict Output
predicted = model.predict(x_test)
#Import Library
library(rpart)
x <- cbind(x_train, y_train)
#grow tree
fit <- rpart(y_train ~ ., data = x, method = "class")
summary(fit)
#Predict Output
predicted <- predict(fit, x_test)
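A runnable sketch of the decision-tree snippet; the built-in iris dataset is used as an assumed stand-in for the placeholder X and y:

from sklearn import tree
from sklearn.datasets import load_iris

#iris stands in for the placeholder training data (illustrative choice)
X, y = load_iris(return_X_y=True)

model = tree.DecisionTreeClassifier(criterion='gini')  #or criterion='entropy'
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
predicted = model.predict(X[:5])  #predict on a few rows in lieu of a test set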
Support Vector Machine (SVM)
#Import Library
from sklearn import svm
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create SVM classification object
model = svm.SVC()
#there are various options associated with it;
#this is a simple setup for classification
#Train the model using the training sets and check score
model.fit(X, y)
model.score(X, y)
#Predict Output
predicted = model.predict(x_test)
#Import Library
library(e1071)
x <- cbind(x_train, y_train)
#Fitting model
fit <- svm(y_train ~ ., data = x)
summary(fit)
#Predict Output
predicted <- predict(fit, x_test)
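A self-contained sketch of the SVM snippet, again assuming iris as the stand-in data:

from sklearn import svm
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  #illustrative stand-in data

model = svm.SVC(kernel='rbf')  #the default kernel; many other options exist
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
predicted = model.predict(X[:5])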
Naive Bayes
#Import Library
from sklearn.naive_bayes import GaussianNB
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create Gaussian Naive Bayes object
model = GaussianNB()
#there are other distributions for multinomial classes,
#like Bernoulli Naive Bayes
#Train the model using the training sets and check score
model.fit(X, y)
#Predict Output
predicted = model.predict(x_test)
#Import Library
library(e1071)
x <- cbind(x_train, y_train)
#Fitting model
fit <- naiveBayes(y_train ~ ., data = x)
summary(fit)
#Predict Output
predicted <- predict(fit, x_test)
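A minimal runnable version, with iris assumed as the example data:

from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  #illustrative stand-in data

model = GaussianNB()
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
predicted = model.predict(X[:5])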
k-Nearest Neighbors (kNN)
#Import Library
from sklearn.neighbors import KNeighborsClassifier
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create KNeighbors classifier object
model = KNeighborsClassifier(n_neighbors=6)
#default value for n_neighbors is 5
#Train the model using the training sets and check score
model.fit(X, y)
#Predict Output
predicted = model.predict(x_test)
#Import Library
library(class)
#knn() from the class package takes the training and test
#matrices plus a factor of training labels directly
#(there is no formula interface and no separate fit step)
predicted <- knn(train = x_train, test = x_test, cl = y_train, k = 5)
summary(predicted)
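A runnable Python counterpart, assuming iris as the data:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  #illustrative stand-in data

model = KNeighborsClassifier(n_neighbors=6)  #default n_neighbors is 5
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
predicted = model.predict(X[:5])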
K-Means (Hard Clustering)
#Import Library
from sklearn.cluster import KMeans
#Assume you have X (attributes) for the training set
#and x_test (attributes) for the test set
#Create KMeans object
k_means = KMeans(n_clusters=3, random_state=0)
#Train the model using the training sets and check score
k_means.fit(X)
#Predict Output
predicted = k_means.predict(x_test)
#Import Library
library(cluster)
fit <- kmeans(X, 3)
#3 cluster solution
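A self-contained sketch with synthetic clustered points (the three group centres are illustrative assumptions):

import numpy as np
from sklearn.cluster import KMeans

#Synthetic data with three loose groups (illustrative only)
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2) + c for c in ([0, 0], [5, 5], [0, 5])])

k_means = KMeans(n_clusters=3, random_state=0, n_init=10)
k_means.fit(X)
print('Cluster centers:\n', k_means.cluster_centers_)
predicted = k_means.predict(X[:5])  #cluster labels for a few points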
Random Forest
#Import Library
from sklearn.ensemble import RandomForestClassifier
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create Random Forest object
model = RandomForestClassifier()
#Train the model using the training sets and check score
model.fit(X, y)
#Predict Output
predicted = model.predict(x_test)
#Import Library
library(randomForest)
x <- cbind(x_train, y_train)
#Fitting model
fit <- randomForest(y_train ~ ., data = x, ntree = 500)
summary(fit)
#Predict Output
predicted <- predict(fit, x_test)
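A runnable sketch of the random-forest snippet, with iris assumed as the data:

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  #illustrative stand-in data

model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
predicted = model.predict(X[:5])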
Dimensionality Reduction
#Import Library
from sklearn import decomposition
#Assume you have training and test data sets as
#train and test
#Create PCA object
pca = decomposition.PCA(n_components=k)
#default value of n_components is min(n_samples, n_features)
#For Factor analysis
#fa = decomposition.FactorAnalysis()
#Reduce the dimension of the training dataset using PCA
train_reduced = pca.fit_transform(train)
#Reduce the dimension of the test dataset
test_reduced = pca.transform(test)
#Import Library
library(stats)
pca <- princomp(train, cor = TRUE)
train_reduced <- predict(pca,train)
test_reduced <- predict(pca,test)
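A runnable PCA sketch; using iris and keeping two components are illustrative choices:

from sklearn import decomposition
from sklearn.datasets import load_iris

train, _ = load_iris(return_X_y=True)  #stand-in for the train set

pca = decomposition.PCA(n_components=2)  #keep the top 2 components
train_reduced = pca.fit_transform(train)
print('Explained variance ratio:', pca.explained_variance_ratio_)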
GBDT (Gradient Boosting Decision Tree)
#Import Library
from sklearn.ensemble import GradientBoostingClassifier
#Assume you have X (predictors) and y (target) for the
#training set, and x_test (predictors) for the test set
#Create Gradient Boosting Classifier object
model = GradientBoostingClassifier(n_estimators=100,
    learning_rate=1.0, max_depth=1, random_state=0)
#Train the model using the training sets and check score
model.fit(X, y)
#Predict Output
predicted = model.predict(x_test)
#Import Library
library(caret)
x <- cbind(x_train, y_train)
#Fitting model
fitControl <- trainControl(method = "repeatedcv", number = 4, repeats = 4)
fit <- train(y_train ~ ., data = x, method = "gbm",
             trControl = fitControl, verbose = FALSE)
#Predict Output
predicted <- predict(fit, x_test, type = "prob")[,2]
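A runnable version of the Python snippet, with iris assumed as the data:

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)  #illustrative stand-in data

model = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0,
                                   max_depth=1, random_state=0)
model.fit(X, y)
print('Training accuracy:', model.score(X, y))
predicted = model.predict(X[:5])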