Original poster: ReneeBK

[Case Study] Gradient Boosted Tree Classification using Python

Posted by ReneeBK on 2015-11-16 06:59:06

"""
A simple example demonstrating a Gradient Boosted Trees Classification/Regression Pipeline.
Note: GBTClassifier only supports binary classification currently.
Run with:
    bin/spark-submit examples/src/main/python/ml/gradient_boosted_trees.py
"""
from __future__ import print_function

import sys

from pyspark import SparkContext
from pyspark.ml.classification import GBTClassifier
from pyspark.ml.feature import StringIndexer
from pyspark.ml.regression import GBTRegressor
from pyspark.mllib.evaluation import BinaryClassificationMetrics, RegressionMetrics
from pyspark.sql import SQLContext


def testClassification(train, test):
    # Train a gradient-boosted trees classifier.
    gbt = GBTClassifier(maxIter=30, maxDepth=4, labelCol="indexedLabel")
    model = gbt.fit(train)
    # The mllib metrics classes expect an RDD of (prediction, label) pairs.
    # DataFrame.map was removed in Spark 2.0, so go through .rdd explicitly.
    predictionAndLabels = model.transform(test).select("prediction", "indexedLabel") \
        .rdd.map(lambda x: (x.prediction, x.indexedLabel))
    metrics = BinaryClassificationMetrics(predictionAndLabels)
    print("AUC %.3f" % metrics.areaUnderROC)


def testRegression(train, test):
    # Train a gradient-boosted trees regressor on the same indexed labels.
    gbt = GBTRegressor(maxIter=30, maxDepth=4, labelCol="indexedLabel")
    model = gbt.fit(train)
    predictionAndLabels = model.transform(test).select("prediction", "indexedLabel") \
        .rdd.map(lambda x: (x.prediction, x.indexedLabel))
    metrics = RegressionMetrics(predictionAndLabels)
    print("rmse %.3f" % metrics.rootMeanSquaredError)
    print("r2 %.3f" % metrics.r2)
    print("mae %.3f" % metrics.meanAbsoluteError)


if __name__ == "__main__":
    # The script takes no arguments; reject any that are passed.
    if len(sys.argv) > 1:
        print("Usage: gradient_boosted_trees", file=sys.stderr)
        sys.exit(1)
    sc = SparkContext(appName="PythonGBTExample")
    sqlContext = SQLContext(sc)

    # Load the data stored in LIBSVM format as a DataFrame.
    df = sqlContext.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")

    # Map labels into an indexed column of labels in [0, numLabels)
    stringIndexer = StringIndexer(inputCol="label", outputCol="indexedLabel")
    si_model = stringIndexer.fit(df)
    td = si_model.transform(df)

    # Random 70/30 split into training and test sets.
    train, test = td.randomSplit([0.7, 0.3])
    testClassification(train, test)
    testRegression(train, test)

    sc.stop()
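
For readers who want to see what the three numbers printed by testRegression mean, here is a minimal plain-Python sketch that does not need Spark. It assumes the usual definitions: RMSE and MAE over (prediction, label) pairs, and R² as 1 minus the ratio of residual to total sum of squares. The function name `regression_metrics` and the sample pairs are made up for illustration; this is not Spark's implementation.

```python
import math


def regression_metrics(pairs):
    """Return (rmse, mae, r2) for a list of (prediction, label) pairs."""
    n = len(pairs)
    mean_label = sum(label for _, label in pairs) / n
    # Residual sum of squares: squared error of each prediction.
    ss_err = sum((label - pred) ** 2 for pred, label in pairs)
    # Total sum of squares: spread of the labels around their mean.
    ss_tot = sum((label - mean_label) ** 2 for _, label in pairs)
    rmse = math.sqrt(ss_err / n)
    mae = sum(abs(label - pred) for pred, label in pairs) / n
    r2 = 1.0 - ss_err / ss_tot
    return rmse, mae, r2


# Hypothetical (prediction, label) pairs with 0/1 indexed labels.
pairs = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)]
rmse, mae, r2 = regression_metrics(pairs)
print("rmse %.3f" % rmse)
print("r2 %.3f" % r2)
print("mae %.3f" % mae)
```

Note that R² can come out negative, as it does for these sample pairs: that just means the predictions fit worse than always predicting the mean label.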
