Original poster: burnpark

[Q&A] Data reading problem

Original poster
burnpark posted on 2019-7-16 10:08:59
50 forum coins
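I followed the example below to read the data, but at the step "2. Set last day Adjusted Close as y" the data is always missing its last row. I don't know why; could someone knowledgeable please explain? In the source DataFrame df, the values in the last row are all 1.00000, but after conversion to y_test the last row is gone.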


The code is as follows:


import numpy as np
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt2
import pandas as pd
import math, time
import itertools
from sklearn import preprocessing
import datetime  # standard-library module (replaces the shadowed `from pandas import datetime`)
from sklearn.metrics import mean_squared_error
from math import sqrt
from keras.models import Sequential
# keras.layers exposes these directly; the .core and .recurrent submodules are absent from newer Keras
from keras.layers import Dense, Dropout, Activation, LSTM
from keras.models import load_model
import keras
import pandas_datareader.data as web
import h5py

def get_stock_data(stock_name, normalize=True):
    start = datetime.datetime(1971, 1, 1)
    end = datetime.date.today()
    df = web.DataReader(stock_name, "yahoo", start, end)
    # Drop the Volume and Close columns
    df.drop(columns=['Volume', 'Close'], inplace=True)

    if normalize:
        # Rescale each column to [0, 1] independently (the scaler is refit per column)
        min_max_scaler = preprocessing.MinMaxScaler()
        df['Open'] = min_max_scaler.fit_transform(df.Open.values.reshape(-1,1))
        df['High'] = min_max_scaler.fit_transform(df.High.values.reshape(-1,1))
        df['Low'] = min_max_scaler.fit_transform(df.Low.values.reshape(-1,1))
        df['Adj Close'] = min_max_scaler.fit_transform(df['Adj Close'].values.reshape(-1,1))
    return df


# stock_name must be set to a ticker symbol beforehand; the post does not show its value
df = get_stock_data(stock_name, normalize=True)

df
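
The 1.00000 values in the last row of df are consistent with min-max scaling. With its default range, MinMaxScaler maps each column's minimum to 0 and its maximum to 1, so a final row of all 1.0 suggests that the most recent day holds the maximum of every column. Below is a minimal pure-Python sketch of the same formula. The prices are made up, and min_max_scale is a hypothetical helper for illustration, not part of scikit-learn.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0.

    Assumes the values are not all equal (otherwise the divisor is zero).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up prices for illustration; the last (most recent) value is the largest.
prices = [10.0, 12.0, 11.0, 15.0, 20.0]
scaled = min_max_scale(prices)

print(scaled)  # [0.0, 0.2, 0.1, 0.5, 1.0]
```

Because the largest price sits in the last position, its scaled value is exactly 1.0, which would match what the poster sees in the last row of df.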

2. Set last day Adjusted Close as y


def load_data(stock, seq_len):
    amount_of_features = len(stock.columns)
    data = stock.to_numpy()  # as_matrix() was removed in pandas 1.0
    sequence_length = seq_len + 1  # seq_len input rows plus one target row
    result = []

    # Note: this range stops one window short, so the last row of data never
    # becomes a target; range(len(data) - sequence_length + 1) would include it.
    for index in range(len(data) - sequence_length):  # maximum date = latest date - sequence length
        result.append(data[index: index + sequence_length])  # rows index .. index + seq_len

    result = np.array(result)
    row = round(0.9 * result.shape[0])  # 90% split

    train = result[:int(row), :]  # first 90% of the windows
    X_train = train[:, :-1]  # all data until day m
    y_train = train[:, -1][:, -1]  # day m + 1 adjusted close price

    X_test = result[int(row):, :-1]
    y_test = result[int(row):, -1][:, -1]

    X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], amount_of_features))
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], amount_of_features))

    return [X_train, y_train, X_test, y_test]

# y_test is one of the values returned by load_data; the call itself is not shown in the post
y_test
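
A likely cause of the missing row, as far as the posted code shows: the loop in load_data runs over range(len(data) - sequence_length), which yields one window fewer than the data allows, so the final row never appears as a target in y_test. The sketch below reproduces this with a plain list standing in for the DataFrame rows. The names make_windows and include_last are hypothetical and used only for illustration.

```python
def make_windows(rows, seq_len, include_last=False):
    """Slice rows into overlapping windows of seq_len + 1 items.

    include_last=False mirrors the range used in the posted load_data loop;
    include_last=True adds 1 to the range so the final row is also used.
    """
    sequence_length = seq_len + 1
    stop = len(rows) - sequence_length + (1 if include_last else 0)
    return [rows[i:i + sequence_length] for i in range(stop)]


rows = list(range(10))  # stand-in for 10 trading days; row 9 is the latest
seq_len = 3

original = make_windows(rows, seq_len)
fixed = make_windows(rows, seq_len, include_last=True)

# The target y is the last element of each window.
print([w[-1] for w in original])  # [3, 4, 5, 6, 7, 8]
print([w[-1] for w in fixed])     # [3, 4, 5, 6, 7, 8, 9]
```

Applied to load_data, the corresponding change would be range(len(data) - sequence_length + 1), assuming the intent is for the latest day to be the final target.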




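First reply
詹惠儿 posted on 2019-8-2 14:13:01
Hello, if your request for help has not been resolved, please post it in the project marketplace, where faster and more specialized users can help you: https://bbs.pinggu.org/z_prj.php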
