Original poster: Lisrelchen

[Lecture Notes] Introduction to Data Science I, University of Maryland

Introduction to Data Science I: Preparing, Storing, and Manipulating Data


The following is a tentative schedule of the topics we plan to cover and what the assignments will focus on. More details will be added as the course progresses.
Note about assignments: One goal of this class is to get you comfortable with using a wide variety of tools (most of which are listed below). You are NOT expected to learn these tools on your own; we will provide step-by-step guidance on getting started with the tools, and the actual assignments will be simple.
Note about readings: The links to the two textbooks can only be accessed when you are on the UMD network, because UMD has a subscription to the Safari Online Books service.
If you are enrolled in CMSC828, click here to see the assigned readings.

| Date | Lecture Topics and Materials | Assignments |
| --- | --- | --- |
| Tue 9/2 | Introduction: What is data science. Major tools used by data scientists. Class overview. Lecture Notes. Readings. References. | Lab 0: Basic usage of GitHub, VirtualBox, IPython Notebook (Due 9/12) |
| Thu 9/4 | Basic Statistics: statistical tests, samples, fallacies. Lecture Notes. Readings. References. | |
| Tue 9/9 | Basic Statistics: linear regression, classification, clustering. Lecture Notes. Readings. | Lab 1: Python basic stats and plotting (Due 9/19) |
| Thu 9/11 | Data Models: Overview, Why modeling is essential, Commonly used models (Relational, JSON, Protocol Buffers). Lecture Notes. | |
| Tue 9/16 | Relational Databases, SQL. Lecture Notes. | Lab 2: Basic SQL; Python Pandas and DataFrames; Avro (Due 10/3) |
| Thu 9/18 | (cont'd) | |
| Tue 9/23 | (cont'd) | |
| Thu 9/25 | Data scraping and wrangling, Unix tools, GUIs. Lecture Notes. | Lab 3: Advanced SQL and Pandas (Due 10/10) |
| Tue 9/30 | (cont'd) | |
| Thu 10/2 | Data Integration: Overview, Schema mapping, Entity Resolution. (Lecture Notes Continued) | Lab 4: Data cleaning using Unix tools, Data Wrangler (Due 10/17) |
| Tue 10/7 | (cont'd) | |
| Thu 10/9 | Information Extraction: Overview, Key Techniques. (Lecture Notes Continued) | Lab 5: Entity Resolution and Information Extraction (Due 10/28) |
| Tue 10/14 | Implementation of Relational Databases. Lecture Notes. | |
| Thu 10/16 | (cont'd) | |
| Tue 10/21 | Distributed programming frameworks: Parallel Databases, MapReduce, Apache Spark, Hadoop Ecosystem. Lecture Notes. | Lab 6: Hadoop, Spark (Due 11/7) |
| Thu 10/23 | MIDTERM | |
| Tue 10/28 | (cont'd: Distributed programming frameworks) | |
| Thu 10/30 | (cont'd) | |
| Tue 11/4 | (cont'd) | Lab 7: Cassandra and MongoDB (Due 11/17) |
| Thu 11/6 | (cont'd) | |
| Tue 11/11 | Key-value stores: Basics, Differences from Relational Databases, Consistency/Replication issues. Lecture Notes. | Lab 8: Spark Streaming, Storm (Due 11/26) |
| Thu 11/13 | (cont'd) | |
| Tue 11/18 | Visualization: D3.js (see Lab 10 for notes) | |
| Thu 11/20 | Data streaming/Real-time analytics: Data streams in relational databases, Spark Streaming, Storm. Lecture Notes. | Lab 9: Neo4j, GraphX (Due 12/8) |
| Tue 11/25 | (cont'd) | |
| Tue 12/2 | Graph Databases and Graph Analytics. Lecture Notes. | Lab 10: D3 (Due 12/11) |
| Thu 12/4 | (cont'd) | |
| Tue 12/9 | Cloud computing: Overview, Virtualization, Data centers, Platform/Infrastructure-as-a-Service. Lecture Notes. | |
| Thu 12/11 | (cont'd) | |

Reply #1
benji427, posted 2016-3-29 08:57:06
Thanks to the original poster for sharing.
