Scaling
Standardize each feature so that it has approximately zero mean and unit standard deviation: replace each value x with x - m, where m is the feature's mean over the training set, then divide by the feature's standard deviation (not by its range).
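To make the arithmetic concrete, here is a minimal plain-Scala sketch of per-feature standardization with no Spark dependency. The object name StandardizeSketch and its fit/transform helpers are hypothetical names chosen for illustration; they are not part of MLlib. The sketch assumes the sample standard deviation (n - 1 in the denominator) and maps a zero-variance column to 0.0, which is a choice made for this sketch.

```scala
// Plain-Scala illustration of standardization; names here are hypothetical, not MLlib API.
object StandardizeSketch {

  /** Computes per-column means and sample standard deviations (n - 1 denominator). */
  def fit(rows: Seq[Array[Double]]): (Array[Double], Array[Double]) = {
    require(rows.size > 1, "need at least two rows to estimate a standard deviation")
    val n = rows.size
    val dims = rows.head.length
    val means = Array.tabulate(dims)(j => rows.map(_(j)).sum / n)
    val stds = Array.tabulate(dims) { j =>
      math.sqrt(rows.map(r => math.pow(r(j) - means(j), 2)).sum / (n - 1))
    }
    (means, stds)
  }

  /** Centers and scales one row; a column with zero variance is mapped to 0.0 (sketch assumption). */
  def transform(row: Array[Double], means: Array[Double], stds: Array[Double]): Array[Double] =
    row.indices.map { j =>
      if (stds(j) == 0.0) 0.0 else (row(j) - means(j)) / stds(j)
    }.toArray
}
```

For example, fitting on the rows (1, 10), (2, 10), (3, 10) gives means (2, 10) and standard deviations (1, 0), so the first row is transformed to (-1, 0).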
Import
import org.apache.spark.mllib.feature.StandardScaler
import org.apache.spark.mllib.regression.LabeledPoint
Code
val scaler = new StandardScaler(withMean = true, withStd = true).fit(trainingSet.map(dp => dp.features))
Scale the training and test sets, using the scaler fitted on the training set for both.
val scaledTrainingSet = trainingSet.map(dp => new LabeledPoint(dp.label, scaler.transform(dp.features))).cache()
val scaledTestSet = testSet.map(dp => new LabeledPoint(dp.label, scaler.transform(dp.features))).cache()