Original poster: dawnsense

[Q&A] An example of market analysis and conjoint analysis in Python

dawnsense posted on 2019-1-30 23:37:14

Hi everyone, I am going to share a case study on market analysis and conjoint analysis. It shows how to read basic summary statistics and how to run a simple cluster analysis. The dataset is golf.csv. If you need it, please contact me and I will be happy to share it with you.

The dataset describes 250 golf courses: their layout (elevation, area, obstacles, hole dimensions), estimated playing time, and estimated construction and maintenance costs. With this example we can learn how to inspect summary statistics and how to run a simple cluster analysis to see how the courses group together.

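# Jupyter magic: change to the folder that holds golf.csv (adjust the path for your machine)
%cd /Users/shimonyagrawal/Desktop

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

golf = pd.read_csv('golf.csv')

#Why will courseID not be relevant in a clustering model?
# Drop the column by name; the positional form drop('courseID', 1) was removed in pandas 2.0
golf = golf.drop(columns='courseID')
golf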

courseID is not relevant for clustering because it is only a unique identifier for each response from a golf course vendor. It carries no information about the course itself, so it would add nothing meaningful to the analysis.

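|     | elevation | square_feet | est_playing_time | land_obstacles | water_obstacles | tunnel_shots | est_construction_cost | est_maintenance_cost | average_hole_length | average_hole_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0   | 11.64 | 21037.18 | 43.35 | 10.0 | 3.0 | 3.0 | 103082.72 | 7261.22  | 18.99 | 3.90 |
| 1   | 6.58  | 23646.44 | 42.30 | 10.0 | 4.0 | 3.0 | 91637.93  | 6553.91  | 21.35 | 2.49 |
| 2   | 11.08 | 20012.28 | 41.43 | 9.0  | 3.0 | 3.0 | 107049.47 | 5847.06  | 19.09 | 2.63 |
| 3   | 9.91  | 20761.90 | 46.04 | 10.0 | 4.0 | 3.0 | 101799.55 | 8876.01  | 19.36 | 3.51 |
| 4   | 11.99 | 19818.75 | 44.82 | 7.0  | 6.0 | 4.0 | 94731.84  | 8445.70  | 16.81 | 2.67 |
| ... | ...   | ...      | ...   | ...  | ... | ... | ...       | ...      | ...   | ...  |
| 245 | 10.45 | 23963.73 | 51.46 | 8.0  | 4.0 | 3.0 | 99027.52  | 9333.81  | 20.39 | 2.63 |
| 246 | 13.77 | 23337.77 | 51.39 | 9.0  | 3.0 | 4.0 | 61096.93  | 7864.50  | 17.06 | 2.93 |
| 247 | 7.01  | 23951.83 | 41.96 | 7.0  | 3.0 | 2.0 | 106438.63 | 2745.81  | 18.13 | 2.53 |
| 248 | 8.70  | 23850.69 | 45.10 | 6.0  | 3.0 | 3.0 | 98163.76  | 7955.81  | 20.32 | 3.79 |
| 249 | 10.26 | 26820.41 | 47.58 | 8.0  | 5.0 | 3.0 | 92221.42  | 10027.75 | 20.28 | 2.66 |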
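To make the point about courseID concrete, here is a minimal sketch of what happens to distances when an identifier is left in. It uses three rows and three columns from the table above. The courseID values (1, 2, 250) are hypothetical, since the post does not show the real ones, and the helper names zscore and distance are made up for this illustration.

```python
import numpy as np
import pandas as pd

# First three rows of the table above (three columns only, for brevity).
features = pd.DataFrame({
    "elevation": [11.64, 6.58, 11.08],
    "est_playing_time": [43.35, 42.30, 41.43],
    "land_obstacles": [10.0, 10.0, 9.0],
})

# Hypothetical identifiers; the real courseID values are not shown in the post.
with_id = features.assign(courseID=[1, 2, 250])


def zscore(df):
    """Standardize each column to mean 0 and standard deviation 1."""
    return (df - df.mean()) / df.std(ddof=0)


def distance(df, i, j):
    """Euclidean distance between rows i and j after standardization."""
    z = zscore(df).to_numpy()
    return float(np.linalg.norm(z[i] - z[j]))


d_without = distance(features, 0, 2)
d_with = distance(with_id, 0, 2)
print(f"rows 0 and 2 without courseID: {d_without:.2f}")
print(f"rows 0 and 2 with courseID:    {d_with:.2f}")
```

The two courses look further apart once the identifier is included, even though nothing about the courses themselves has changed. k-means relies on exactly these distances, so an ID column would only add noise.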

##Call the describe() function on your dataset.
golf.describe()
The describe() function returns descriptive statistics that summarize the distribution of each variable in the dataset. Summary statistics condense a large amount of data into a few numbers. The analyst can use them to spot possible outliers (for example, a min or max that lies far from the quartiles) and skew (a mean that differs noticeably from the median, which is the 50% row).
|       | elevation | square_feet | est_playing_time | land_obstacles | water_obstacles | tunnel_shots | est_construction_cost | est_maintenance_cost | average_hole_length | average_hole_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 250.00000 | 250.000000   | 250.000000 | 250.000 | 250.000000 | 250.000000 | 250.000000    | 250.000000   | 250.000000 | 250.000000 |
| mean  | 10.90348  | 22052.677600 | 44.916120  | 7.840   | 3.968000   | 2.944000   | 94956.160200  | 7779.288160  | 19.757280  | 2.964680   |
| std   | 2.52390   | 2708.177478  | 5.001146   | 1.544   | 0.775488   | 0.598579   | 11656.524492  | 1990.536582  | 1.750693   | 0.459777   |
| min   | 2.92000   | 14357.480000 | 31.630000  | 3.000   | 2.000000   | 2.000000   | 61096.930000  | 2682.460000  | 14.690000  | 1.640000   |
| 25%   | 9.45250   | 20162.532500 | 41.430000  | 7.000   | 3.000000   | 3.000000   | 86997.525000  | 6527.242500  | 18.560000  | 2.632500   |
| 50%   | 11.08500  | 22030.715000 | 44.780000  | 8.000   | 4.000000   | 3.000000   | 94727.880000  | 7760.000000  | 19.780000  | 2.990000   |
| 75%   | 12.70500  | 23974.867500 | 48.157500  | 9.000   | 4.000000   | 3.000000   | 102375.640000 | 8935.887500  | 20.925000  | 3.300000   |
| max   | 17.77000  | 29712.520000 | 58.020000  | 14.000  | 6.000000   | 4.000000   | 126247.720000 | 12589.800000 | 25.490000  | 4.210000   |
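One way to turn the describe() output into an outlier and skew check is the common 1.5 × IQR rule. The sketch below is only an illustration and not something the original analysis ran. The function name iqr_fences is made up here, and the numbers are copied from two columns of the summary table above so that the block runs without golf.csv.

```python
import pandas as pd


def iqr_fences(summary: pd.DataFrame) -> pd.DataFrame:
    """Flag possible outliers and skew from a describe()-style table.

    Uses the 1.5 * IQR rule: values below Q1 - 1.5*IQR or above
    Q3 + 1.5*IQR are treated as possible outliers.
    """
    iqr = summary.loc["75%"] - summary.loc["25%"]
    lower = summary.loc["25%"] - 1.5 * iqr
    upper = summary.loc["75%"] + 1.5 * iqr
    return pd.DataFrame({
        "lower_fence": lower,
        "upper_fence": upper,
        "has_low_outlier": summary.loc["min"] < lower,
        "has_high_outlier": summary.loc["max"] > upper,
        # A mean well above (below) the median hints at right (left) skew.
        "mean_minus_median": summary.loc["mean"] - summary.loc["50%"],
    })


# Values copied from the describe() output above for two of the columns.
summary = pd.DataFrame(
    {
        "est_construction_cost": [94956.1602, 61096.93, 86997.525,
                                  94727.88, 102375.64, 126247.72],
        "land_obstacles": [7.84, 3.0, 7.0, 8.0, 9.0, 14.0],
    },
    index=["mean", "min", "25%", "50%", "75%", "max"],
)

report = iqr_fences(summary)
print(report)
```

With the full dataset loaded, the same function could be called as iqr_fences(golf.describe()). The fences only say that an extreme value exists, not how many rows are affected, so the flagged columns still need a look at the raw data.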
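##Build a k-means model.
from sklearn.cluster import KMeans

# golf_normalize is not defined anywhere in this post; it is presumably a scaled
# (normalized) copy of golf, so that large-valued columns do not dominate the distances.
kmeans_model = KMeans(n_clusters = 3, random_state = 101)
kmeans_model.fit(golf_normalize)

# Attach each course's cluster label to the original (unscaled) data
cluster_labels = kmeans_model.labels_
golf_cluster = golf.assign(Cluster = cluster_labels)

# Average size and cost per cluster
grouped = golf_cluster.groupby(['Cluster'])
grouped.agg({
    'square_feet': 'mean',
    'est_construction_cost': 'mean',
    'est_maintenance_cost': 'mean'}).round(2)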
| Cluster | square_feet | est_construction_cost | est_maintenance_cost |
| --- | --- | --- | --- |
| 0 | 24754.41 | 84712.52  | 7551.36 |
| 1 | 19985.26 | 105199.56 | 7382.96 |
| 2 | 22149.62 | 92833.78  | 8187.60 |
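The k-means code above fits on golf_normalize, which the post never defines. A minimal sketch of one plausible way to build it is shown here, assuming z-score standardization with scikit-learn's StandardScaler; the original may have used a different scaling. To stay runnable without golf.csv, the block uses only the ten rows visible in the post and three of the columns, so its cluster means will not match the table above. n_init=10 is set explicitly here and is not in the original call.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# The ten rows visible in the post (first and last five), three columns only.
golf = pd.DataFrame({
    "square_feet": [21037.18, 23646.44, 20012.28, 20761.90, 19818.75,
                    23963.73, 23337.77, 23951.83, 23850.69, 26820.41],
    "est_construction_cost": [103082.72, 91637.93, 107049.47, 101799.55, 94731.84,
                              99027.52, 61096.93, 106438.63, 98163.76, 92221.42],
    "est_maintenance_cost": [7261.22, 6553.91, 5847.06, 8876.01, 8445.70,
                             9333.81, 7864.50, 2745.81, 7955.81, 10027.75],
})

# Assumption: golf_normalize is a z-score standardized copy of golf
# (each column rescaled to mean 0 and standard deviation 1).
scaler = StandardScaler()
golf_normalize = pd.DataFrame(
    scaler.fit_transform(golf), columns=golf.columns, index=golf.index
)

# Same model settings as the post, plus an explicit n_init.
kmeans_model = KMeans(n_clusters=3, random_state=101, n_init=10)
kmeans_model.fit(golf_normalize)

# Labels are attached to the unscaled data so the means stay in original units.
golf_cluster = golf.assign(Cluster=kmeans_model.labels_)
print(golf_cluster.groupby("Cluster").mean().round(2))
```

Scaling matters here because est_construction_cost is in the tens of thousands while columns such as average_hole_width are single digits; without it, the cost columns would dominate the Euclidean distances that k-means minimizes.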
golf_cluster.head()
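|   | elevation | square_feet | est_playing_time | land_obstacles | water_obstacles | tunnel_shots | est_construction_cost | est_maintenance_cost | average_hole_length | average_hole_width | Cluster |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 11.64 | 21037.18 | 43.35 | 10.0 | 3.0 | 3.0 | 103082.72 | 7261.22 | 18.99 | 3.90 | 1 |
| 1 | 6.58  | 23646.44 | 42.30 | 10.0 | 4.0 | 3.0 | 91637.93  | 6553.91 | 21.35 | 2.49 | 2 |
| 2 | 11.08 | 20012.28 | 41.43 | 9.0  | 3.0 | 3.0 | 107049.47 | 5847.06 | 19.09 | 2.63 | 1 |
| 3 | 9.91  | 20761.90 | 46.04 | 10.0 | 4.0 | 3.0 | 101799.55 | 8876.01 | 19.36 | 3.51 | 1 |
| 4 | 11.99 | 19818.75 | 44.82 | 7.0  | 6.0 | 4.0 | 94731.84  | 8445.70 | 16.81 | 2.67 | 1 |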

This completes the clustering. To summarize, the per-cluster means shown earlier describe three groups of courses. Cluster 0 has the largest courses (about 24,754 square feet on average) and the lowest estimated construction cost (about 84,713). Cluster 1 has the smallest courses (about 19,985 square feet) and the highest estimated construction cost (about 105,200). Cluster 2 sits between the two on size and construction cost but has the highest estimated maintenance cost (about 8,188).
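The cluster means can be easier to compare when expressed relative to the overall means from describe(). The following sketch does that using only numbers already reported in this post; it is an optional extra view and not part of the original analysis.

```python
import pandas as pd

# Per-cluster means copied from the summary table in this post.
cluster_means = pd.DataFrame(
    {
        "square_feet": [24754.41, 19985.26, 22149.62],
        "est_construction_cost": [84712.52, 105199.56, 92833.78],
        "est_maintenance_cost": [7551.36, 7382.96, 8187.60],
    },
    index=pd.Index([0, 1, 2], name="Cluster"),
)

# Overall means copied from the "mean" row of the describe() output.
overall_mean = pd.Series({
    "square_feet": 22052.6776,
    "est_construction_cost": 94956.1602,
    "est_maintenance_cost": 7779.28816,
})

# Percent difference of each cluster mean from the overall mean.
pct_diff = ((cluster_means - overall_mean) / overall_mean * 100).round(1)
print(pct_diff)
```

Positive values mean the cluster is above the overall average on that variable and negative values mean it is below, which gives a quick profile of each cluster.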


If you found this useful, please give it a thumbs up!




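Reply 1
admin_kefu posted on 2019-2-3 11:49:16
Hello, if your question has not been resolved, please post a request on the project marketplace, where other users can help you faster and more professionally: https://bbs.pinggu.org/prj/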

Reply 2
dawnsense posted on 2019-9-28 00:40:44

If you have any other questions, please reply to this post!
