
Mike de Waard: Machine Learning for Developers using Scala


#1 (original post)

ReneeBK posted on 2016-4-21 09:42:49


Attachment: Machine Learning for Developers Mike de Waard.pdf (14.29 MB, costs 1 forum coin)


#2

ReneeBK posted on 2016-4-21 09:43:48

import java.io.File

object KNNExample {

  def main(args: Array[String]): Unit = {
    val basePath = "/.../KNN_Example_1.csv"
    val testData = getDataFromCSV(new File(basePath))
  }

  //Read the CSV file, skip the header row and return
  //the data points together with their class labels
  def getDataFromCSV(file: File): (Array[Array[Double]], Array[Int]) = {
    val source = scala.io.Source.fromFile(file)
    val data = source
      .getLines()
      .drop(1)
      .map(x => getDataFromString(x))
      .toArray
    source.close()
    val dataPoints = data.map(x => x._1)
    val classifierArray = data.map(x => x._2)
    (dataPoints, classifierArray)
  }

  def getDataFromString(dataString: String): (Array[Double], Int) = {
    //Split the comma separated value string into an array of strings
    val dataArray: Array[String] = dataString.split(',')
    //Extract the values from the strings
    val xCoordinate: Double = dataArray(0).toDouble
    val yCoordinate: Double = dataArray(1).toDouble
    val classifier: Int = dataArray(2).toInt
    //And return the result in a format that can later
    //easily be used to feed to Smile
    (Array(xCoordinate, yCoordinate), classifier)
  }
}
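
The parsing step above assumes every row has exactly three well-formed fields. As a rough sketch of how malformed rows could be skipped instead of throwing, the following plain-Scala example returns an Option per row. The names CsvRowSketch and parseRow and the sample rows are hypothetical and not part of the original code.

```scala
import scala.util.Try

object CsvRowSketch {
  // Parses "x,y,label" into (Array(x, y), label).
  // Returns None when the row has the wrong number of fields or a field is not numeric.
  def parseRow(row: String): Option[(Array[Double], Int)] =
    row.split(',').map(_.trim) match {
      case Array(x, y, label) =>
        Try((Array(x.toDouble, y.toDouble), label.toInt)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // Made-up rows; the second one is deliberately malformed.
    val rows = List("5.3,4.3,1", "not,a,number")
    rows.foreach(r => println(r + " -> " + parseRow(r).map(_._2)))
  }
}
```

With such a helper, the lines of a file could be passed through flatMap so that only the rows that parse cleanly are kept.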


#3

ReneeBK posted on 2016-4-21 09:44:41

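import java.awt.{Color, Dimension}
import java.io.File
import scala.swing.{MainFrame, SimpleSwingApplication}
//Package name assumed from the Smile 1.x layout
import smile.plot.ScatterPlot

object KNNExample extends SimpleSwingApplication {

  def top = new MainFrame {
    title = "KNN Example"
    val basePath = "/.../KNN_Example_1.csv"
    val testData = getDataFromCSV(new File(basePath))
    //Plot the data points with '@' as the marker and one colour per class
    val plot = ScatterPlot.plot(testData._1,
      testData._2,
      '@',
      Array(Color.red, Color.blue)
    )
    peer.setContentPane(plot)
    size = new Dimension(400, 400)
  }
  ...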


#4

ReneeBK posted on 2016-4-21 09:45:30

import java.io.File
//Package names assumed from the Smile 1.x layout
import smile.classification.KNN
import smile.validation.CrossValidation

def main(args: Array[String]): Unit = {
  val basePath = "/.../KNN_Example_1.csv"
  val testData = getDataFromCSV(new File(basePath))
  //Define the amount of rounds, in our case 2 and
  //initialise the cross validation
  val validationRounds = 2
  val cv = new CrossValidation(testData._2.length, validationRounds)
  //Look up the data points and classifiers that belong to the
  //training and testing indices of each validation round
  val trainingDPSets = cv.train
    .map(indexList => indexList.map(index => testData._1(index)))
  val trainingClassifierSets = cv.train
    .map(indexList => indexList.map(index => testData._2(index)))
  val testingDPSets = cv.test
    .map(indexList => indexList.map(index => testData._1(index)))
  val testingClassifierSets = cv.test
    .map(indexList => indexList.map(index => testData._2(index)))
  val validationRoundRecords = trainingDPSets
    .zipWithIndex
    .map(x => (x._1,
      trainingClassifierSets(x._2),
      testingDPSets(x._2),
      testingClassifierSets(x._2)))
  validationRoundRecords.foreach { record =>
    //Train a 3-nearest-neighbours model on the training set of this round
    val knn = KNN.learn(record._1, record._2, 3)
    //And for each test data point make a prediction with the model
    val predictions = record
      ._3
      .map(x => knn.predict(x))
      .zipWithIndex
    //Finally evaluate the predictions as correct or incorrect
    //and count the amount of wrongly classified data points.
    val error = predictions
      .map(x => if (x._1 != record._4(x._2)) 1 else 0)
      .sum
    //Convert to Double first; integer division would truncate the rate to 0
    println("False prediction rate: " + error.toDouble / predictions.length * 100 + "%")
  }
}
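
To show what kind of training and testing index arrays a cross-validation round works with, here is a minimal k-fold split in plain Scala. It is only an illustration and does not reproduce Smile's CrossValidation class (which, as far as I know, also shuffles the indices). The names KFoldSketch and kFoldIndices are hypothetical.

```scala
object KFoldSketch {
  // Splits the indices 0 until n into k folds (round-robin, no shuffling).
  // Returns one (trainIndices, testIndices) pair per fold.
  def kFoldIndices(n: Int, k: Int): Array[(Array[Int], Array[Int])] = {
    require(k >= 2 && k <= n, "k must be between 2 and n")
    val indices = (0 until n).toArray
    (0 until k).toArray.map { fold =>
      // partition returns (elements matching the predicate, the rest)
      val (test, train) = indices.partition(_ % k == fold)
      (train, test)
    }
  }

  def main(args: Array[String]): Unit = {
    kFoldIndices(6, 2).zipWithIndex.foreach { case ((train, test), round) =>
      val trainStr = train.mkString(",")
      val testStr = test.mkString(",")
      println(s"round $round: train=$trainStr test=$testStr")
    }
  }
}
```

Every index lands in the test set of exactly one round and in the training set of all other rounds, which is the property the error count per round relies on.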


#5

ReneeBK posted on 2016-4-21 09:46:18

val knn = KNN.learn(record._1, record._2, 3)
val unknownDataPoint = Array(5.3, 4.3)
//Predict the class of the new data point and map it to a provider name
val result = knn.predict(unknownDataPoint)
if (result == 0) {
  println("Internet Service Provider Alpha")
} else if (result == 1) {
  println("Internet Service Provider Beta")
} else {
  println("Unexpected prediction")
}
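
For readers wondering what a prediction with k = 3 involves, the sketch below implements a bare-bones nearest-neighbour vote in plain Scala. It is an illustration under simple assumptions (Euclidean distance, majority vote), not Smile's implementation. The object KnnSketch, its methods, and the five training points are made up for the example.

```scala
object KnnSketch {
  // Euclidean distance between two points of equal dimension.
  def distance(a: Array[Double], b: Array[Double]): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // Predicts the label of `query` by majority vote among its k nearest training points.
  def predict(points: Array[Array[Double]], labels: Array[Int],
              k: Int, query: Array[Double]): Int = {
    require(points.length == labels.length && k >= 1 && k <= points.length)
    points.zip(labels)
      .sortBy { case (p, _) => distance(p, query) }
      .take(k)
      .groupBy { case (_, label) => label }
      .maxBy { case (_, members) => members.length }
      ._1
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical training data: label 0 clusters near the origin, label 1 near (5, 5).
    val points = Array(Array(0.0, 0.0), Array(1.0, 0.5),
      Array(5.0, 5.0), Array(6.0, 4.5), Array(5.5, 5.5))
    val labels = Array(0, 0, 1, 1, 1)
    println(predict(points, labels, 3, Array(5.3, 4.3)))
  }
}
```

With two classes, an odd k avoids tied votes, which is one reason small odd values such as 3 are a common choice.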


#6

ReneeBK posted on 2016-4-21 09:47:26

import java.io.File

def getMessage(file: File): String = {
  //Note that the encoding of the example files is latin1,
  //thus this should be passed to the fromFile method.
  val source = scala.io.Source.fromFile(file)("latin1")
  val lines = source.getLines().mkString("\n")
  source.close()
  //Find the first blank line in the email,
  //as this indicates the start of the message body
  val firstLineBreak = lines.indexOf("\n\n")
  //Fall back to the whole text if there is no blank line;
  //substring(-1) would otherwise throw an exception
  val body = if (firstLineBreak >= 0) lines.substring(firstLineBreak) else lines
  //Return the message body with line breaks turned into spaces,
  //filtered to only letters and spaces, and converted to lower case
  body
    .replace("\n", " ")
    .replaceAll("[^a-zA-Z ]", "")
    .toLowerCase()
}
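
The cleaning steps can be tried without the email files by applying them to an in-memory string. The following sketch does that. The names MessageBodySketch and extractBody and the sample email are hypothetical, and falling back to the whole text when no blank line is found is an assumption of this sketch.

```scala
object MessageBodySketch {
  // Keeps everything from the first blank line onwards (the header/body separator),
  // turns line breaks into spaces, drops every character that is not a letter
  // or a space, and lower-cases the result.
  def extractBody(rawEmail: String): String = {
    val separator = rawEmail.indexOf("\n\n")
    val body = if (separator >= 0) rawEmail.substring(separator) else rawEmail
    body
      .replace("\n", " ")
      .replaceAll("[^a-zA-Z ]", "")
      .toLowerCase
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample email: two header lines, a blank line, then the body.
    val sample = "From: a@example.com\nSubject: Hi\n\nWIN $1000 now!\nClick here."
    println(extractBody(sample))
  }
}
```

Note that removing digits and punctuation can leave runs of several spaces, so any later word splitting should ignore empty tokens.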


#7

ReneeBK posted on 2016-4-21 09:48:05

import java.io.File

def getFilesFromDir(path: String): List[File] = {
  val d = new File(path)
  if (d.exists && d.isDirectory) {
    //Filter out the macOS ".DS_Store" file
    //and, on Unix systems, the "cmds" file
    d.listFiles
      .filter(x => x.isFile &&
        !x.toString.contains(".DS_Store") &&
        !x.toString.contains("cmds"))
      .toList
  } else {
    List[File]()
  }
}


#8

ReneeBK posted on 2016-4-21 09:48:48

def main(args: Array[String]): Unit = {
  val basePath = "/Users/../Downloads/data"
  val spamPath = basePath + "/spam"
  val spam2Path = basePath + "/spam_2"
  val easyHamPath = basePath + "/easy_ham"
  val easyHam2Path = basePath + "/easy_ham_2"
  val amountOfSamplesPerSet = 500
  val amountOfFeaturesToTake = 100
  //First get a subset of the filenames for the spam
  //sample set (500 is the complete set in this case)
  val listOfSpamFiles = getFilesFromDir(spamPath)
    .take(amountOfSamplesPerSet)
  //Then get the messages that are contained in these files
  val spamMails = listOfSpamFiles.map(x => (x, getMessage(x)))
  //Get a subset of the filenames from the ham sample set
  //(note that in this case it is not necessary to randomly
  //sample as the emails are already randomly ordered)
  val listOfHamFiles = getFilesFromDir(easyHamPath)
    .take(amountOfSamplesPerSet)
  //Get the messages that are contained in the ham files
  val hamMails = listOfHamFiles.map(x => (x, getMessage(x)))
}
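
The snippet above declares amountOfFeaturesToTake but stops before using it. One plausible next step, which is not shown in this thread and is only an assumption here, is to pick the most common words across the cleaned messages as features. The sketch below counts in how many messages each word occurs and keeps the top n. The names TopWordsSketch and topWords and the sample messages are hypothetical.

```scala
object TopWordsSketch {
  // Counts in how many messages each word occurs and returns the n most common
  // words, ties broken alphabetically so the result is deterministic.
  def topWords(messages: List[String], n: Int): List[(String, Int)] =
    messages
      .flatMap(_.split(" ").filter(_.nonEmpty).distinct.toList)
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.length) }
      .toList
      .sortBy { case (word, count) => (-count, word) }
      .take(n)

  def main(args: Array[String]): Unit = {
    // Made-up, already cleaned messages standing in for the spam bodies.
    val messages = List("win money now", "win a prize now", "meeting at noon")
    topWords(messages, 2).foreach(println)
  }
}
```

In the setting of the code above, the input would be the second element of each (file, message) pair and n would be amountOfFeaturesToTake.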


#9

soccy posted on 2016-4-21 10:09:29

......


#10

80丶90年代 posted on 2016-4-21 10:47:05

Don't ask me why, I'm only here for the daily check-in.
