Original poster: Lisrelchen

[Case Study] Decision Tree Classification using Java

package org.apache.spark.examples.ml;

// Note: this listing targets the Spark 1.x Java API (SQLContext, DataFrame).
// On Spark 2.0 and later the Java equivalents are SparkSession and Dataset<Row>.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.classification.DecisionTreeClassifier;
import org.apache.spark.ml.classification.DecisionTreeClassificationModel;
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
import org.apache.spark.ml.feature.IndexToString;
import org.apache.spark.ml.feature.StringIndexer;
import org.apache.spark.ml.feature.StringIndexerModel;
import org.apache.spark.ml.feature.VectorIndexer;
import org.apache.spark.ml.feature.VectorIndexerModel;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class JavaDecisionTreeClassificationExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("JavaDecisionTreeClassificationExample");
    JavaSparkContext jsc = new JavaSparkContext(conf);
    SQLContext sqlContext = new SQLContext(jsc);

    // Load the data stored in LIBSVM format as a DataFrame.
    DataFrame data = sqlContext.read().format("libsvm").load("data/mllib/sample_libsvm_data.txt");

    // Index labels, adding metadata to the label column.
    // Fit on the whole dataset so that every label is included in the index.
    StringIndexerModel labelIndexer = new StringIndexer()
      .setInputCol("label")
      .setOutputCol("indexedLabel")
      .fit(data);

    // Automatically identify categorical features, and index them.
    VectorIndexerModel featureIndexer = new VectorIndexer()
      .setInputCol("features")
      .setOutputCol("indexedFeatures")
      .setMaxCategories(4) // features with > 4 distinct values are treated as continuous
      .fit(data);

    // Split the data into training and test sets (30% held out for testing).
    DataFrame[] splits = data.randomSplit(new double[]{0.7, 0.3});
    DataFrame trainingData = splits[0];
    DataFrame testData = splits[1];

    // Configure a DecisionTree classifier; it is trained when the pipeline is fit.
    DecisionTreeClassifier dt = new DecisionTreeClassifier()
      .setLabelCol("indexedLabel")
      .setFeaturesCol("indexedFeatures");

    // Convert indexed labels back to original labels.
    IndexToString labelConverter = new IndexToString()
      .setInputCol("prediction")
      .setOutputCol("predictedLabel")
      .setLabels(labelIndexer.labels());

    // Chain the indexers, the tree, and the label converter in a Pipeline.
    Pipeline pipeline = new Pipeline()
      .setStages(new PipelineStage[]{labelIndexer, featureIndexer, dt, labelConverter});

    // Train the model. This also runs the indexers.
    PipelineModel model = pipeline.fit(trainingData);

    // Make predictions.
    DataFrame predictions = model.transform(testData);

    // Select example rows to display.
    predictions.select("predictedLabel", "label", "features").show(5);

    // Select (prediction, true label) and compute the test error.
    // In this Spark 1.x API the multiclass "precision" metric is the overall
    // fraction of correct predictions, i.e. accuracy.
    MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("indexedLabel")
      .setPredictionCol("prediction")
      .setMetricName("precision");
    double accuracy = evaluator.evaluate(predictions);
    System.out.println("Test Error = " + (1.0 - accuracy));

    // Stage index 2 is the fitted decision tree (see the setStages order above).
    DecisionTreeClassificationModel treeModel =
      (DecisionTreeClassificationModel) (model.stages()[2]);
    System.out.println("Learned classification tree model:\n" + treeModel.toDebugString());

    // Release cluster resources once the job is done.
    jsc.stop();
  }
}
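
The listing leaves the tree's split criterion at its default, which for Spark's DecisionTreeClassifier is Gini impurity. For anyone wondering what an impurity-based split actually does, here is a rough sketch in plain Java with no Spark dependency. The class name GiniSplitSketch, its method names, and the toy data are all made up for illustration. It also tries every midpoint between sorted values, whereas Spark evaluates a limited set of binned candidate thresholds, so treat it as the idea rather than Spark's implementation.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Spark.
final class GiniSplitSketch {

    /** Gini impurity of a node from its per-class counts: 1 - sum(p_k^2). */
    static double gini(int[] classCounts) {
        int total = Arrays.stream(classCounts).sum();
        if (total == 0) {
            return 0.0; // an empty node contributes no impurity
        }
        double sumOfSquares = 0.0;
        for (int count : classCounts) {
            double p = (double) count / total;
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    /** Size-weighted Gini impurity of the two children of "feature <= threshold" (labels are 0 or 1). */
    static double splitImpurity(double[] feature, int[] labels, double threshold) {
        int[] left = new int[2];
        int[] right = new int[2];
        for (int i = 0; i < feature.length; i++) {
            if (feature[i] <= threshold) {
                left[labels[i]]++;
            } else {
                right[labels[i]]++;
            }
        }
        int nLeft = left[0] + left[1];
        int nRight = right[0] + right[1];
        return (nLeft * gini(left) + nRight * gini(right)) / (nLeft + nRight);
    }

    /** Tries the midpoint between each pair of adjacent distinct values; returns the lowest-impurity threshold. */
    static double bestThreshold(double[] feature, int[] labels) {
        double[] sorted = feature.clone();
        Arrays.sort(sorted);
        double best = Double.NaN;
        double bestImpurity = Double.POSITIVE_INFINITY;
        for (int i = 0; i + 1 < sorted.length; i++) {
            if (sorted[i] == sorted[i + 1]) {
                continue; // no threshold fits between equal values
            }
            double candidate = (sorted[i] + sorted[i + 1]) / 2.0;
            double impurity = splitImpurity(feature, labels, candidate);
            if (impurity < bestImpurity) {
                bestImpurity = impurity;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data: small feature values belong to class 0, large ones to class 1.
        double[] feature = {1.0, 2.0, 3.0, 10.0, 11.0, 12.0};
        int[] labels = {0, 0, 0, 1, 1, 1};
        System.out.println("best threshold = " + bestThreshold(feature, labels));
    }
}
```

On the toy data the chosen threshold falls between 3.0 and 10.0, since that split leaves both children pure. A real tree repeats this search over every feature and then recurses into each child.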
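
The StringIndexer / IndexToString pair in the listing can be confusing at first: labels are mapped to numeric indices before training and mapped back afterwards for display. Below is a minimal plain-Java sketch of that round trip, assuming the documented StringIndexer behaviour of giving the most frequent label index 0. The class LabelIndexSketch and its methods are hypothetical, and the alphabetical tie-break is an assumption of this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Spark.
final class LabelIndexSketch {

    /** "Fit": distinct labels ordered by descending frequency (ties alphabetical, an assumption here). */
    static List<String> fitLabels(List<String> rawLabels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : rawLabels) {
            counts.merge(label, 1, Integer::sum);
        }
        List<String> ordered = new ArrayList<>(counts.keySet());
        ordered.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return ordered;
    }

    /** Forward mapping: raw label to its position in the fitted list. */
    static double toIndex(List<String> fittedLabels, String rawLabel) {
        int index = fittedLabels.indexOf(rawLabel);
        if (index < 0) {
            // A label that was absent at fit time cannot be indexed.
            throw new IllegalArgumentException("Unseen label: " + rawLabel);
        }
        return index;
    }

    /** Reverse mapping: predicted index back to the original label string. */
    static String toLabel(List<String> fittedLabels, double index) {
        return fittedLabels.get((int) index);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1.0", "0.0", "1.0", "1.0", "0.0");
        List<String> fitted = fitLabels(raw);
        System.out.println("fitted labels = " + fitted);
        System.out.println("index of 0.0  = " + toIndex(fitted, "0.0"));
        System.out.println("label for 0   = " + toLabel(fitted, 0.0));
    }
}
```

This is also why the listing fits the label indexer on the whole dataset rather than on the training split: a label that only shows up in the test rows would otherwise have no index.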

