[Data Mining Theory and Cases] New 2009 Wiley book: "Making Sense of Data II"

Original poster
gqb_345, posted on 2009-8-14 09:53:20

Making Sense of Data II: A Practical Guide to Data Visualization, Advanced Data Mining Methods, and Applications


Publisher: Wiley
Language: English
ISBN: 0470222808
Paperback: 291 pages
Date: Feb 2009
Format: PDF
Description: The Making Sense of Data series fills a current gap in the market for easy-to-use books for non-specialists that combine advanced data mining methods, the application of these methods to a range of fields, and hands-on tutorials. Making Sense of Data II: A Practical Guide to Data Visualization, Advanced Data Mining Methods, and Applications offers a comprehensive collection of advanced data mining methods coupled with tutorials for applications in a range of fields including business and finance. This book is appropriate for students and professionals in the many different disciplines involving making decisions from data.



CONTENTS
PREFACE
1 INTRODUCTION
1.1 Overview
1.2 Definition
1.3 Preparation
1.3.1 Overview
1.3.2 Accessing Tabular Data
1.3.3 Accessing Unstructured Data
1.3.4 Understanding the Variables and Observations
1.3.5 Data Cleaning
1.3.6 Transformation
1.3.7 Variable Reduction
1.3.8 Segmentation
1.3.9 Preparing Data to Apply
1.4 Analysis
1.4.1 Data Mining Tasks
1.4.2 Optimization
1.4.3 Evaluation
1.4.4 Model Forensics
1.5 Deployment
1.6 Outline of Book
1.6.1 Overview
1.6.2 Data Visualization
1.6.3 Clustering
1.6.4 Predictive Analytics
1.6.5 Applications
1.6.6 Software
1.7 Summary
1.8 Further Reading
2 DATA VISUALIZATION
2.1 Overview
2.2 Visualization Design Principles
2.2.1 General Principles
2.2.2 Graphics Design
2.2.3 Anatomy of a Graph
2.3 Tables
2.3.1 Simple Tables
2.3.2 Summary Tables
2.3.3 Two-Way Contingency Tables
2.3.4 Supertables
2.4 Univariate Data Visualization
2.4.1 Bar Chart
2.4.2 Histograms
2.4.3 Frequency Polygram
2.4.4 Box Plots
2.4.5 Dot Plot
2.4.6 Stem-and-Leaf Plot
2.4.7 Quantile Plot
2.4.8 Quantile–Quantile Plot
2.5 Bivariate Data Visualization
2.5.1 Scatterplot
2.6 Multivariate Data Visualization
2.6.1 Histogram Matrix
2.6.2 Scatterplot Matrix
2.6.3 Multiple Box Plot
2.6.4 Trellis Plot
2.7 Visualizing Groups
2.7.1 Dendrograms
2.7.2 Decision Trees
2.7.3 Cluster Image Maps
2.8 Dynamic Techniques
2.8.1 Overview
2.8.2 Data Brushing
2.8.3 Nearness Selection
2.8.4 Sorting and Rearranging
2.8.5 Searching and Filtering
2.9 Summary
2.10 Further Reading
3 CLUSTERING
3.1 Overview
3.2 Distance Measures
3.2.1 Overview
3.2.2 Numeric Distance Measures
3.2.3 Binary Distance Measures
3.2.4 Mixed Variables
3.2.5 Other Measures
3.3 Agglomerative Hierarchical Clustering
3.3.1 Overview
3.3.2 Single Linkage
3.3.3 Complete Linkage
3.3.4 Average Linkage
3.3.5 Other Methods
3.3.6 Selecting Groups
3.4 Partitioned-Based Clustering
3.4.1 Overview
3.4.2 k-Means
3.4.3 Worked Example
3.4.4 Miscellaneous Partitioned-Based Clustering
3.5 Fuzzy Clustering
3.5.1 Overview
3.5.2 Fuzzy k-Means
3.5.3 Worked Examples
3.6 Summary
3.7 Further Reading
4 PREDICTIVE ANALYTICS
4.1 Overview
4.1.1 Predictive Modeling
4.1.2 Testing Model Accuracy
4.1.3 Evaluating Regression Models’ Predictive Accuracy
4.1.4 Evaluating Classification Models’ Predictive Accuracy
4.1.5 Evaluating Binary Models’ Predictive Accuracy
4.1.6 ROC Charts
4.1.7 Lift Chart
4.2 Principal Component Analysis
4.2.1 Overview
4.2.2 Principal Components
4.2.3 Generating Principal Components
4.2.4 Interpretation of Principal Components
4.3 Multiple Linear Regression
4.3.1 Overview
4.3.2 Generating Models
4.3.3 Prediction
4.3.4 Analysis of Residuals
4.3.5 Standard Error
4.3.6 Coefficient of Multiple Determination
4.3.7 Testing the Model Significance
4.3.8 Selecting and Transforming Variables
4.4 Discriminant Analysis
4.4.1 Overview
4.4.2 Discriminant Function
4.4.3 Discriminant Analysis Example
4.5 Logistic Regression
4.5.1 Overview
4.5.2 Logistic Regression Formula
4.5.3 Estimating Coefficients
4.5.4 Assessing and Optimizing Results
4.6 Naive Bayes Classifiers
4.6.1 Overview
4.6.2 Bayes Theorem and the Independence Assumption
4.6.3 Independence Assumption
4.6.4 Classification Process
4.7 Summary
4.8 Further Reading
5 APPLICATIONS
5.1 Overview
5.2 Sales and Marketing
5.3 Industry-Specific Data Mining
5.3.1 Finance
5.3.2 Insurance
5.3.3 Retail
5.3.4 Telecommunications
5.3.5 Manufacturing
5.3.6 Entertainment
5.3.7 Government
5.3.8 Pharmaceuticals
5.3.9 Healthcare
5.4 microRNA Data Analysis Case Study
5.4.1 Defining the Problem
5.4.2 Preparing the Data
5.4.3 Analysis
5.5 Credit Scoring Case Study
5.5.1 Defining the Problem
5.5.2 Preparing the Data
5.5.3 Analysis
5.5.4 Deployment
5.6 Data Mining Nontabular Data
5.6.1 Overview
5.6.2 Data Mining Chemical Data
5.6.3 Data Mining Text
5.7 Further Reading
APPENDIX A MATRICES
A.1 Overview of Matrices
A.2 Matrix Addition
A.3 Matrix Multiplication
A.4 Transpose of a Matrix
A.5 Inverse of a Matrix
APPENDIX B SOFTWARE
B.1 Software Overview
B.1.1 Software Objectives
B.1.2 Access and Installation
B.1.3 User Interface Overview
B.2 Data Preparation
B.2.1 Overview
B.2.2 Reading in Data
B.2.3 Searching the Data
B.2.4 Variable Characterization
B.2.5 Removing Observations and Variables
B.2.6 Cleaning the Data
B.2.7 Transforming the Data
B.2.8 Segmentation
B.2.9 Principal Component Analysis
B.3 Tables and Graphs
B.3.1 Overview
B.3.2 Contingency Tables
B.3.3 Summary Tables
B.3.4 Graphs
B.3.5 Graph Matrices
B.4 Statistics
B.4.1 Overview
B.4.2 Descriptive Statistics
B.4.3 Confidence Intervals
B.4.4 Hypothesis Tests
B.4.5 Chi-Square Test
B.4.6 ANOVA
B.4.7 Comparative Statistics
B.5 Grouping
B.5.1 Overview
B.5.2 Clustering
B.5.3 Associative Rules
B.5.4 Decision Trees
B.6 Prediction
B.6.1 Overview
B.6.2 Linear Regression
B.6.3 Discriminant Analysis
B.6.4 Logistic Regression
B.6.5 Naive Bayes
B.6.6 kNN
B.6.7 CART
B.6.8 Neural Networks
B.6.9 Apply Model
BIBLIOGRAPHY
INDEX
Making Sense of Data II.pdf
Download link: https://bbs.pinggu.org/a-380309.html

12.9 MB

Cost: 5 forum coins

2
luhuidal (verified transaction user), posted on 2009-8-14 11:19:19
A good new book with clear explanations; worth reading. Thanks for sharing.

3
灿灿 (no verified transactions), posted on 2009-9-20 12:07:27
Could it be a bit cheaper?

4
husan88 (verified transaction user), posted on 2009-9-20 16:26:30
Notice: the author has been banned or deleted; the content is automatically hidden.

5
420948492 (no verified transactions), posted on 2009-9-20 23:56:43
Very good material, haha.

6
zhongzihong (verified transaction user), posted on 2009-9-21 08:21:01
Thank you!!!

7
trdzw (verified transaction user), posted on 2009-9-21 09:29:51
Very good material, haha.

8
ivince68 (verified transaction user), posted on 2009-9-21 09:30:13
Nice, thanks for sharing.

9
hr1230 (verified transaction user), posted on 2009-9-21 09:51:28
Bump, thanks.

10
hxin3 (verified transaction user), posted on 2009-9-21 10:30:33
Very good material. I bought it.
