Original poster: ReneeBK

Spark MLContext Programming Guide

Original post
ReneeBK posted on 2017-5-20 11:25:27


http://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide#spark-mlcontext-programming-guide

Create MLContext

All primary classes that a user interacts with are located in the org.apache.sysml.api.mlcontext package. For convenience, we can also import the members of ScriptFactory, which shortens the syntax for creating Script objects. An MLContext object is created by passing its constructor a reference to the SparkSession (spark) or the SparkContext (sc). If this succeeds, you should see a "Welcome to Apache SystemML!" message.

// Core MLContext API classes
import org.apache.sysml.api.mlcontext._
// Script factory methods such as dml and dmlFromFile
import org.apache.sysml.api.mlcontext.ScriptFactory._
// spark is the SparkSession that spark-shell provides
val ml = new MLContext(spark)



Reply 1
ReneeBK posted on 2017-5-20 11:26:04
DataFrame Example

For demonstration purposes, we'll use Spark to create a DataFrame called df with 10,000 rows and 100 columns of random doubles between 0 and 1.

import org.apache.spark.sql._
import org.apache.spark.sql.types.{StructType,StructField,DoubleType}
import scala.util.Random
val numRows = 10000
val numCols = 100
// One Row of numCols random doubles per row index
val data = sc.parallelize(0 to numRows-1).map { _ => Row.fromSeq(Seq.fill(numCols)(Random.nextDouble)) }
// Nullable double columns named C0 through C99
val schema = StructType((0 to numCols-1).map { i => StructField("C" + i, DoubleType, true) } )
val df = spark.createDataFrame(data, schema)
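To check the shape of this data without starting a Spark job, the same construction can be sketched on plain local collections. This is only a minimal sketch: the sizes are shrunk so the output is readable, and the names sampleRows, sampleCols, rng, columnNames and localRows, as well as the fixed seed, are assumptions made for illustration and are not part of the original example.

```scala
import scala.util.Random

// Small stand-in sizes so the result is easy to inspect (the post uses 10000 x 100).
val sampleRows = 5
val sampleCols = 3

// Fixed seed, assumed here only for reproducibility; the post uses the global Random.
val rng = new Random(42L)

// Same naming scheme as the schema in the post: C0, C1, ...
val columnNames = (0 until sampleCols).map(i => "C" + i)

// One Seq of random doubles per row, mirroring Row.fromSeq(Seq.fill(numCols)(...)).
val localRows = Seq.fill(sampleRows)(Seq.fill(sampleCols)(rng.nextDouble()))

// Print a header line followed by one comma-separated line per row.
println(columnNames.mkString(","))
localRows.foreach(row => println(row.mkString(",")))
```

Every value should fall in the half-open range [0, 1), since that is what nextDouble returns.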

Reply 2
ReneeBK posted on 2017-5-20 11:27:14
RDD Example

Let's look at an example that passes input matrices as RDDs in CSV format. We'll create two 2x2 matrices and feed them into a DML script. The script sums each matrix and builds a message saying which sum is greater. We then output both sums and the message.

For fun, we'll write the script String to a file and then use ScriptFactory's dmlFromFile method to create the script object from that file. We'll also specify the inputs using a Map, although we could instead have chained two in method calls to specify the same inputs.

// Each String is one CSV row of a 2x2 matrix
val rdd1 = sc.parallelize(Array("1.0,2.0", "3.0,4.0"))
val rdd2 = sc.parallelize(Array("5.0,6.0", "7.0,8.0"))
val sums = """
s1 = sum(m1);
s2 = sum(m2);
if (s1 > s2) {
  message = "s1 is greater"
} else if (s2 > s1) {
  message = "s2 is greater"
} else {
  message = "s1 and s2 are equal"
}
"""
// Write the DML source to a file, then build the script from that file
scala.tools.nsc.io.File("sums.dml").writeAll(sums)
val sumScript = dmlFromFile("sums.dml").in(Map("m1"-> rdd1, "m2"-> rdd2)).out("s1", "s2", "message")
val sumResults = ml.execute(sumScript)
val s1 = sumResults.getDouble("s1")
val s2 = sumResults.getDouble("s2")
val message = sumResults.getString("message")
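To know what values to expect back from SystemML, the logic of sums.dml can be mirrored in plain Scala on the same four CSV strings. This is a sketch for cross-checking only, not how SystemML evaluates the script; parseCsv and the names ending in Local are hypothetical helpers introduced here.

```scala
// Parse CSV lines into rows of doubles (same text format as rdd1 and rdd2).
def parseCsv(lines: Seq[String]): Seq[Seq[Double]] =
  lines.map(_.split(",").toSeq.map(_.trim.toDouble))

val m1Local = parseCsv(Seq("1.0,2.0", "3.0,4.0"))
val m2Local = parseCsv(Seq("5.0,6.0", "7.0,8.0"))

// Plain-Scala equivalent of sum(m1) and sum(m2) in the DML script.
val s1Local = m1Local.flatten.sum
val s2Local = m2Local.flatten.sum

// Same three-way branching as sums.dml.
val messageLocal =
  if (s1Local > s2Local) "s1 is greater"
  else if (s2Local > s1Local) "s2 is greater"
  else "s1 and s2 are equal"

println(s"s1 = $s1Local, s2 = $s2Local, message = $messageLocal")
```

With these inputs the sums are 10.0 and 26.0, so the message should be "s2 is greater". As for the chained alternative the text mentions, it would presumably read dmlFromFile("sums.dml").in("m1", rdd1).in("m2", rdd2).out("s1", "s2", "message").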

Reply 3
ReneeBK posted on 2017-5-20 11:28:18
Matrix Output

Let's look at an example of reading a matrix out of SystemML. We'll write a DML script that creates a 2x2 matrix m and sets the variable n to the sum of the cells in that matrix.

We create a script object from String s and register m and n as its outputs. After executing the script, the results contain Matrix m and Double n. The output n has a value of 110.0 (11 + 22 + 33 + 44).

We retrieve Matrix m and Double n together as a Tuple of values x and y. We then convert Matrix m to an RDD of IJV values, an RDD of CSV values, a DataFrame, and a two-dimensional Double Array, and display the values in each of these data structures.

val s =
"""
m = matrix("11 22 33 44", rows=2, cols=2)
n = sum(m)
"""
val scr = dml(s).out("m", "n")
val res = ml.execute(scr)
// x is the Matrix m, y is the Double n
val (x, y) = res.getTuple[Matrix, Double]("m", "n")
// IJV: one (row, column, value) entry per cell
x.toRDDStringIJV.collect.foreach(println)
// CSV: one comma-separated line per matrix row
x.toRDDStringCSV.collect.foreach(println)
x.toDF.collect.foreach(println)
x.to2DDoubleArray
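For readers unfamiliar with the IJV and CSV layouts, here is a plain-Scala sketch that builds both from the same four cell values and recomputes the sum. It assumes IJV means one "row column value" entry per cell with 1-based indices and that the matrix is filled row by row; the exact strings SystemML prints may differ. The names cells, mLocal, nLocal, ijv and csv are illustrative only.

```scala
// Row-major 2x2 matrix, assuming matrix("11 22 33 44", rows=2, cols=2) fills row by row.
val cells = "11 22 33 44".split(" ").map(_.toDouble)
val mLocal: Array[Array[Double]] = cells.grouped(2).toArray

// Plain-Scala equivalent of n = sum(m).
val nLocal = mLocal.flatten.sum

// IJV: one "row col value" entry per cell, with 1-based indices (assumed layout).
val ijv = for {
  (row, i) <- mLocal.toSeq.zipWithIndex
  (value, j) <- row.toSeq.zipWithIndex
} yield s"${i + 1} ${j + 1} $value"

// CSV: one comma-separated line per matrix row.
val csv = mLocal.toSeq.map(_.mkString(","))

ijv.foreach(println)
csv.foreach(println)
println(nLocal)
```

IJV is the more compact choice for sparse matrices, because cells that hold zero can simply be left out, while CSV keeps every cell of every row.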
