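I have a question for everyone:
I am following this tutorial: http://www.feiguyunai.com/index.php/2017/11/19/pythonai-sklearn-customer-loss01/ and using Jupyter to run a customer churn prediction analysis, but I get an error in the cross-validation part. The main steps are as follows:
First, define the function run_cv: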
from sklearn.model_selection import KFold
def run_cv(X,y,clf_class,**kwargs):
    # Construct a kfolds object
    kf = KFold(len(y),n_splits=5,shuffle=True)
    y_pred = y.copy()
    # Iterate through folds
    for train_index, test_index in kf:
        X_train, X_test = X[train_index], X[test_index]
        y_train = y[train_index]
        # Initialize a classifier with key word arguments
        clf = clf_class(**kwargs)
        clf.fit(X_train,y_train)
        y_pred[test_index] = clf.predict(X_test)
    return y_pred
Then run:
def accuracy(y_true,y_pred):
    # NumPy interprets True and False as 1. and 0.
    return np.mean(y_true == y_pred)
print( "Logistic Regression:")
print( "%.3f" % accuracy(y, run_cv(X,y,LR)))
The error is as follows:
Logistic Regression:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-44c2d31e9f2a> in <module>
      4 
      5 print( "Logistic Regression:")
----> 6 print( "%.3f" % accuracy(y, run_cv(X,y,LR)))
      7 print( "Gradient Boosting Classifier")
      8 print( "%.3f" % accuracy(y, run_cv(X,y,GBC)))

<ipython-input-9-d437442bb793> in run_cv(X, y, clf_class, **kwargs)
      2 def run_cv(X,y,clf_class,**kwargs):
      3     # Construct a kfolds object
----> 4     kf = KFold(len(y),n_splits=5,shuffle=True)
      5     y_pred = y.copy()
      6     # Iterate through folds

TypeError: __init__() got multiple values for argument 'n_splits'
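In the tutorial, KFold is called with the parameter n_folds, which fails with an error saying that this parameter does not exist. After I changed it to n_splits, the error became the one shown above. How can I fix this bug? Thanks in advance for any help.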
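A likely explanation: the tutorial appears to target the older sklearn.cross_validation.KFold, which took the number of samples as its first argument and could be iterated directly. In sklearn.model_selection.KFold the first positional parameter is n_splits itself, so KFold(len(y), n_splits=5, ...) supplies n_splits twice, which matches the TypeError. The fold indices also come from kf.split(X) rather than from iterating kf. Below is a minimal sketch of run_cv under that reading. It assumes X and y are NumPy arrays and that LR is an alias for LogisticRegression; the synthetic data from make_classification and the max_iter=1000 setting are only stand-ins so the snippet runs on its own, not part of the tutorial.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression as LR  # assumed alias
from sklearn.model_selection import KFold


def run_cv(X, y, clf_class, **kwargs):
    # n_splits is the first parameter of the new KFold; do not pass len(y)
    kf = KFold(n_splits=5, shuffle=True)
    y_pred = y.copy()
    # The new API yields index arrays from kf.split(X), not from kf itself
    for train_index, test_index in kf.split(X):
        X_train, X_test = X[train_index], X[test_index]
        y_train = y[train_index]
        # Initialize a fresh classifier with the keyword arguments
        clf = clf_class(**kwargs)
        clf.fit(X_train, y_train)
        y_pred[test_index] = clf.predict(X_test)
    return y_pred


def accuracy(y_true, y_pred):
    # NumPy treats True and False as 1 and 0 when averaging
    return np.mean(y_true == y_pred)


# Synthetic stand-in data (hypothetical; replace with the churn features)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

y_pred = run_cv(X, y, LR, max_iter=1000)
print("Logistic Regression:")
print("%.3f" % accuracy(y, y_pred))
```

If X is a pandas DataFrame rather than an array, X[train_index] will not select rows; converting with X.values (or indexing with X.iloc) before calling run_cv should avoid that.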

