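NLP dataset: ratings and review text covering 250,000 restaurants, 500,000 users, and 5 million reviews
(CSV files, nearly 2 GB in total)
The dataset contains the following files: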
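餐馆.csv (restaurants.csv)

## Field descriptions

| Field | Description |
| ---- | ---- |
| restId | Restaurant id (numbered consecutively from 0) |
| name | Restaurant name |

Sample rows (the first column is the DataFrame index):

|  | restId | name |
| ---- | ---- | ---- |
| 210902 | 210902 | NaN |
| 124832 | 124832 | NaN |
| 26766 | 26766 | 香锅制造(新苏天地店) |
| 91754 | 91754 | NaN |
| 204465 | 204465 | 西部牛扒城(湖塘店) |
| 36475 | 36475 | NaN |
| 231861 | 231861 | 四季火锅 |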
评价.csv (ratings.csv)

## Field descriptions

| Field | Description |
| ---- | ---- |
| userId | User id (numbered consecutively from 0) |
| restId | Same as restId in restaurants.csv |
| rating | Overall rating, an integer in [0, 5] |
| rating_env | Environment rating, an integer in [1, 5] |
| rating_flavor | Flavor rating, an integer in [1, 5] |
| rating_service | Service rating, an integer in [1, 5] |
| timestamp | Timestamp of the rating |
| comment | Review text |

Sample rows (the first column is the DataFrame index; the output is wrapped, so the timestamp and comment columns are listed separately):

|  | userId | restId | rating | rating_env | rating_flavor | rating_service |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| 3331708 | 6802 | 183728 | 3.0 | 3.0 | 4.0 | 3.0 |
| 3332473 | 3106 | 183750 | 5.0 | 4.0 | 4.0 | 4.0 |
| 291609 | 39590 | 13570 | 3.0 | 3.0 | 2.0 | 3.0 |
| 749582 | 59192 | 38519 | 4.0 | 2.0 | 3.0 | 2.0 |
| 719908 | 241643 | 36382 | 1.0 | 2.0 | 1.0 | 1.0 |
| 3127953 | 12481 | 173459 | 4.0 | 3.0 | 3.0 | 3.0 |
| 2068253 | 13070 | 115853 | 3.0 | 3.0 | 3.0 | 2.0 |
| 640356 | 168006 | 33263 | NaN | 3.0 | 5.0 | 3.0 |
| 1222261 | 76280 | 65171 | 3.0 | 2.0 | 2.0 | 2.0 |
| 101366 | 67372 | 2853 | 1.0 | 1.0 | 1.0 | 1.0 |

|  | timestamp | comment |
| ---- | ---- | ---- |
| 3331708 | 1315673880000 | 环境不错,停车方便,交通也比较方便,东西齐全,应有尽有,吃、喝、玩、乐样样齐全,还有个五星级... |
| 3332473 | 1260155880000 | 去过两次,都是由日本朋友带着去的,很喜欢那种在小巷子深处的店,总觉得那样的店料理会很好吃。最... |
| 291609 | 1324792500000 | 朋友请客,两个人中午去吃的,虽然不是节假日,但人还是非常的多,等了很长时间才上餐,价位偏高,... |
| 749582 | 1321430760000 | 十一长假之前,我们的房子终于有了好消息,这个月底就可以拿到钥匙,真是不容易,盼星星盼月亮的,... |
| 719908 | 1271862180000 | 很差的一家店!公司聚餐居然选在这里,真是个大大的失策!\n点的菜迟迟不上,不知道是故意不上还... |
.....
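
The sampled `timestamp` values have 13 digits, which is consistent with Unix time in milliseconds. The sketch below converts them to datetimes and averages the four score columns. It assumes a comma-separated CSV with a header row, and it treats the timestamps as UTC milliseconds, which the source does not confirm. The `load_ratings` helper is hypothetical, and the comment strings are placeholders, not real review text.

```python
import io

import pandas as pd

# Three rows copied from the sample above; the comment values are
# placeholders because the original comments are truncated.
SAMPLE = """userId,restId,rating,rating_env,rating_flavor,rating_service,timestamp,comment
6802,183728,3.0,3.0,4.0,3.0,1315673880000,placeholder a
3106,183750,5.0,4.0,4.0,4.0,1260155880000,placeholder b
39590,13570,3.0,3.0,2.0,3.0,1324792500000,placeholder c
"""


def load_ratings(source):
    """Read the ratings table and add a parsed `time` column (hypothetical helper)."""
    df = pd.read_csv(source)
    # Assumption: timestamps are milliseconds since the Unix epoch, in UTC.
    df["time"] = pd.to_datetime(df["timestamp"], unit="ms", utc=True)
    return df


ratings = load_ratings(io.StringIO(SAMPLE))

# Column-wise means; pandas skips NaN values by default, which matters
# because some ratings in the full data are missing.
score_cols = ["rating", "rating_env", "rating_flavor", "rating_service"]
mean_scores = ratings[score_cols].mean()
```

For the full file of nearly 2 GB, passing `chunksize=` or `usecols=` to `pd.read_csv` is one way to keep memory use down.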
关联.csv (links)

## Field descriptions

| Field | Description |
| ---- | ---- |
| restId | Same as restId in restaurants.csv and ratings.csv |
| dianpingId | Restaurant id on Dianping (大众点评网) |

Sample rows (the first column is the DataFrame index):

|  | restId | dianpingId |
| ---- | ---- | ---- |
| 138492 | 138492 | 3566359 |
| 158007 | 158007 | 2484433 |
| 16170 | 16170 | 3651451 |
| 116637 | 116637 | 5143029 |
| 191554 | 191554 | 2734621 |