Original poster: oliyiyi

Data Science, Machine Learning, BI Explained in a Few Amazing Pictures


This article draws on images from my modeling work with Mathematica, my experience as a Business Analyst and my doctoral coursework. For me, the border between properly executed Business Intelligence and Data Science (with substantive knowledge of Management) is fuzzy. See the picture below:


[Image]


What is a Data Scientist? In my understanding, what kind of data scientist someone is depends on their domain expertise: business management, physics, computer science, etc.




DATA SCIENCE AND BUSINESS INTELLIGENCE PHASES




1) UNDERSTAND PROCESSES

First of all, really understand the context and processes of the business: become familiar with its technology, employees and daily routine.


2) FINANCIAL ANALYSIS

Second, establish business needs (among them, $$$).

- Sales/Revenue

- Net Worth

- Gross margin

- Net profit

- Losses

- Ratios and indicators: ROI, ROA, ROE, EBITDA, inventory turnover, liquidity, financial leverage, debt, assets and liabilities (short term and long term), horizontal and vertical balance sheet analysis
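A few of the indicators listed above can be computed directly from yearly figures. The following is a minimal sketch: the `Financials` class, its field names and the example numbers are all hypothetical, and the formulas are the common textbook definitions rather than anything specific to this post.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Hypothetical yearly figures, all in the same currency."""
    revenue: float
    cost_of_goods_sold: float
    net_profit: float
    total_assets: float
    equity: float
    average_inventory: float
    current_assets: float
    current_liabilities: float


def ratios(f: Financials) -> dict:
    """Return a handful of standard ratios as plain fractions."""
    return {
        "gross_margin": (f.revenue - f.cost_of_goods_sold) / f.revenue,
        "net_margin": f.net_profit / f.revenue,
        "roa": f.net_profit / f.total_assets,
        "roe": f.net_profit / f.equity,
        "inventory_turnover": f.cost_of_goods_sold / f.average_inventory,
        "current_ratio": f.current_assets / f.current_liabilities,
        "financial_leverage": f.total_assets / f.equity,
    }


# Made-up example company.
example = Financials(
    revenue=1_000_000,
    cost_of_goods_sold=600_000,
    net_profit=100_000,
    total_assets=500_000,
    equity=250_000,
    average_inventory=150_000,
    current_assets=200_000,
    current_liabilities=100_000,
)
result = ratios(example)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Keeping the ratios in one function makes it easy to run the same calculation on several years and compare them, which is the idea behind horizontal analysis.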





3) DEFINE DATABASE ARCHITECTURE AND METHODOLOGY OF DATA COLLECTION AND EXTRACTION

Third: a) Define a database architecture that provides functionality, reliability and security, and that can supply valuable data for decision making.

b) Establish a methodology for data collection, sampling and market research, and define the data sources and KPIs, so that the resulting data analysis is reliable and valid.
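One concrete piece of a sampling methodology is deciding how many respondents a survey needs. The sketch below assumes simple random sampling and uses Cochran's formula with a finite population correction. The function name and default values (95% confidence, 5% margin of error) are illustrative choices, not something taken from the post.

```python
import math


def sample_size(population: int,
                margin_of_error: float = 0.05,
                z: float = 1.96,
                p: float = 0.5) -> int:
    """Sample size for estimating a proportion.

    z is the normal quantile for the confidence level (1.96 for 95%).
    p is the expected proportion; 0.5 is the most conservative choice.
    """
    # Cochran's formula for a very large population.
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Finite population correction.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(10_000))     # client base of 10,000
print(sample_size(1_000_000))  # much larger market
```

The required sample grows very slowly with population size, which is why a few hundred well-chosen respondents can be enough for a large market.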





4) COLLECT DATA

From different sources:

a) Customized market research

b) CRM Database: sales, clients, suppliers and processes

c) Website

d) Online Advertising

e) Employees

f) Big Data

- Facial recognition

- Speech recognition

- Unstructured data

- Structured data

- Images

- Social Media


5) ANALYZE DATA

You can use Excel, R, SAS, Mathematica, SPSS or Python.

5.0. Data preparation: deal with missing values, outliers (I usually look closely at individuals with values more than 3 standard deviations from the mean), normality of the data, skewness (the 1/N trick), kurtosis (the log trick) and sampling. Prepare the data properly so that you can have a reliable analysis.
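A minimal sketch of two of these checks: flagging values more than 3 standard deviations from the mean, and measuring skewness before and after a transform. The data and function names are made up, and the log transform is used here only as one common example of a transform, not as the post's exact recipe.

```python
import math
import statistics


def flag_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` sample SDs from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical daily sales with one suspicious record.
sales = [12, 15, 14, 13, 16, 15, 14, 13, 12, 16,
         15, 14, 13, 15, 14, 16, 13, 14, 15, 400]
outliers = flag_outliers(sales)
print("outliers:", outliers)

# Hypothetical right-skewed variable (e.g. order values).
order_values = [1, 2, 4, 8, 16, 32, 64, 128]
logged = [math.log(v) for v in order_values]
print("skewness before:", round(skewness(order_values), 2))
print("skewness after log:", round(skewness(logged), 2))
```

Flagged records are candidates for closer inspection, not automatic deletion: an extreme value may be a data entry error or a genuinely important client.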

5.1. Descriptive statistics:

a) Market Research and Database: quality perception, source of clients, demographics, sales, profit, repurchase intentions, profitable clients, profitability per sales channel, losses, evolution of KPIs over time, sales per state/neighborhood, efficiency of employees and sales force, employee performance




b) Social Media: popularity, sentiment analysis, references, associations, conversions, mentions, influencers. You can use Python for unstructured data analysis (text).

c) Website: visits, paths, time spent, clients' demographics, OS, enter pages, leave pages, contact forms filled, popularity, page rank

d) Online advertising: bids, keywords, conversion rate, effective contacts, ROI, clients' demographics, competition strategy
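As a small illustration of analyzing unstructured text from social media (item b above) in Python, the sketch below scores mentions with a tiny word list. The word lists, function name and sample mentions are all invented for the example. A real project would use a much larger lexicon or a trained model.

```python
import re
from collections import Counter

# Toy lexicons, assumed for illustration only.
POSITIVE = {"great", "love", "good", "fast", "excellent"}
NEGATIVE = {"bad", "slow", "poor", "hate", "broken"}


def sentiment_score(text):
    """Score in [-1, 1]: (positive - negative) / (positive + negative)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


mentions = [
    "Love the new store, great service",
    "Delivery was slow and support was poor",
    "Opened an account yesterday",
]
scores = [sentiment_score(m) for m in mentions]
for mention, score in zip(mentions, scores):
    print(f"{score:+.1f}  {mention}")
```

Averaging such scores per day or per campaign gives a simple sentiment KPI that can be tracked over time alongside the other descriptive statistics.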

  • 5.2. Multivariate statistics (correlations, factor analysis, linear regression): identify niches; find the causes of profit, loss, sales, satisfaction, quality perception and popularity; find the most relevant variables, customer demographics and groups; do market segmentation and sentiment analysis; guide sales strategy; refine KPIs; and customize the business offer to clients' needs.
  • 5.3. Classification algorithms in predictive analysis (naive Bayes, random forest, linear and logistic regression, and K nearest neighbors): identify niches and causes of sentiment, do market segmentation, customize the business offer, define the marketing mix, identify purchase patterns, guide the sales team, identify social groups and predict future business outcomes.

  • K Nearest Neighbors

  • 5.4. Optimization algorithms (linear and non-linear programming, genetic algorithms and neural networks): identify the most efficient and profitable marketing mix, account for seasonality of demand and improvements in processes, enhance internal processes, and optimize sales strategy and R&D efforts.
  • 5.5. Clustering algorithms (K-means and hierarchical clustering): identify niches, customize the business offering, identify social traits and guide the sales team.
  • 5.6. Semantic understanding of the context: the links between data and customer actions, interactions and social network dynamics. This is obtained by analyzing all the data sources mentioned above. Graphic visualizations and simulations help a lot in understanding the dynamics of a group of people. Below you can see my Mathematica models. Read the full post on social networks here:
  • https://www.linkedin.com/pulse/contagion-social-network-rubens-zimbres

And here: https://www.linkedin.com/pulse/social-network-analysis-based-callsemails-rubens-zimbres
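Of the algorithms listed in 5.3, K nearest neighbors is the easiest to show in a few lines. This is a minimal sketch in plain Python: the customer data, the feature choice (age, monthly spend) and the function name are hypothetical, and the post's own models were built in Mathematica, not with this code.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (features, label) pairs, features being numeric tuples.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical customers: (age, monthly spend) -> value segment.
customers = [
    ((25, 20), "low"), ((30, 25), "low"), ((35, 30), "low"),
    ((45, 80), "high"), ((50, 90), "high"), ((55, 85), "high"),
]

prediction = knn_predict(customers, (48, 82))
print(prediction)
```

In practice the features should be put on comparable scales first, since the distance is otherwise dominated by whichever variable has the largest range.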

[Images: Mathematica models]


6) DEVELOP SIMULATION MODELS

  • Simulation (Markov chains, cellular automata and agent-based modeling): dynamically simulate market conditions and customer behavior to predict future outcomes, and gain a semantic (graphic) understanding of customers' social networks, online behavior, interactions, purchase patterns and the evolution of opinions over time. The image below shows a cellular automaton model evolving over time. Each color is a different cell state.
  • [Image: cellular automaton model evolving over time]

  • Machine Learning: supervised (train on data from the past to predict future outcomes such as purchase intentions and sales, or to perform face recognition and image-based sentiment analysis) and unsupervised (to simulate customer interactions and the emergence of complex consumption patterns). Read the full post on Facial Recognition here:
  • https://www.linkedin.com/pulse/facial-recognition-wolfram-mathematica-rubens-zimbres

  • Validity: Data and simulation models must be analyzed for validity: nomological, internal and external validity, content and construct validity, as well as ergodicity and homoscedasticity.
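The cellular automata mentioned in this section can be sketched with a very small model. The example below is a one-dimensional majority-rule automaton on a ring, where each cell holds one of two opinions and adopts the majority among itself and its two neighbors. The rule, the starting state and the function names are assumptions chosen for illustration. They are not the author's Mathematica model.

```python
def step(cells):
    """Apply the majority rule once; the row is treated as a ring."""
    n = len(cells)
    return [
        1 if cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n] >= 2 else 0
        for i in range(n)
    ]


def simulate(cells, steps):
    """Return the list of states, starting state included."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1]))
    return history


# 1 = favorable opinion, 0 = unfavorable opinion (hypothetical customers).
start = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
history = simulate(start, 3)
for state in history:
    print("".join("#" if c else "." for c in state))
```

Isolated opinions disappear after the first step and the remaining cluster stays stable, a simple example of how local interactions produce a group-level pattern.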

Reply by kmjiwa, posted 2017-2-28 20:48:22
Let me see.

Reply by fengyg (verified company account), posted 2017-2-28 20:52:30
Taking a look.

Reply by MouJack007, posted 2017-3-1 04:16:16
Thanks for sharing!

Reply by MouJack007, posted 2017-3-1 04:16:39

000000

Reply by ithjesuxf, posted 2017-3-1 12:25:38
Thanks for sharing!
