Original poster: xuehe

[Academic Frontiers] Python for Econometrics


Original post
xuehe posted on 2019-9-13 17:18:31

Python for Econometrics

New material added to the third edition on January 3, 2017

Introduction to Python for Econometrics, Statistics and Numerical Analysis: Third Edition

Python is a widely used general purpose programming language, which happens to be well suited to econometrics, data analysis and other more general numeric problems. These notes provide an introduction to Python for a beginning programmer. They may also be useful for an experienced Python programmer interested in using NumPy, SciPy, matplotlib and pandas for numerical and statistical analysis (if this is the case, much of the beginning can be skipped).

Third edition update:

  • Rewritten installation section focused exclusively on using Continuum's Anaconda.
  • Python 3.5 is the default version of Python instead of 2.7. Python 3.5 (or newer) is well supported by the Python packages required to analyze data and perform statistical analysis, and brings some useful new features, such as a new operator for matrix multiplication (@).
  • Removed distinction between integers and longs in built-in data types chapter. This distinction is only relevant for Python 2.7.
  • dot has been removed from most examples and replaced with @ to produce more readable code.
  • Split Cython and Numba into separate chapters to highlight the improved capabilities of Numba.
  • Verified all code working on current versions of core libraries using Python 3.5.
  • pandas
    • Updated syntax of pandas functions such as resample.
    • Added pandas Categorical.
    • Expanded coverage of pandas groupby.
    • Expanded coverage of date and time data types and functions.
  • New chapter introducing statsmodels, a package that facilitates statistical analysis of data. statsmodels includes regression analysis, Generalized Linear Models (GLM) and time-series analysis using ARIMA models.
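
As a rough illustration of the switch from dot to @ described in the list above, the sketch below computes OLS coefficients both ways. The small design matrix and response are made-up values chosen for this example and do not come from the notes.

```python
import numpy as np

# Hypothetical design matrix (constant column plus one regressor) and response.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # constructed so that y = 1 + 2*x exactly

# Older style: explicit calls to dot.
xtx_dot = np.dot(X.T, X)
xty_dot = np.dot(X.T, y)

# Python 3.5+ style: the @ operator reads like the math X'X and X'y.
xtx = X.T @ X
xty = X.T @ y

# Solve the normal equations (X'X) b = X'y for the OLS coefficients.
beta = np.linalg.solve(xtx, xty)
print(beta)
```

Both forms give the same arrays; the @ version is simply easier to read once several products are chained together.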

Second edition update:

  • Improved Cython and Numba sections
  • Added sections discussing interfacing with C code
  • Added sections to the chapter on running code in parallel covering IPython's cluster server and joblib
  • Further improvements in the installation based on feedback from the Python Course
  • Updated Anaconda to 1.9
  • Added information about using Spyder as an initial IDE.
  • Added packages for Spyder to the installation instructions.

New in second edition:

  • The preferred installation method is now Continuum Analytics' Anaconda. Anaconda is a complete scientific stack and is available for all major platforms.
  • New chapter on pandas. pandas provides a simple but powerful tool to manage data and perform basic analysis. It also greatly simplifies importing and exporting data.
  • New chapter on advanced selection of elements from an array.
  • Numba provides just-in-time compilation for numeric Python code which often produces large performance gains when pure NumPy solutions are not available (e.g. looping code).
  • Addition to performance section covering line_profiler for profiling code.
  • Dictionary, set and tuple comprehensions.
  • Numerous typos fixed.
  • All code has been verified working against Anaconda 1.7.0.
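
For readers unfamiliar with the comprehensions mentioned in the list above, here is a minimal sketch. The tickers and return figures are invented for illustration. Note that Python has no literal tuple comprehension, so the last example assumes the usual equivalent of a generator expression passed to tuple().

```python
# Hypothetical annual returns keyed by ticker (illustrative numbers only).
returns = {"AAA": 0.12, "BBB": -0.05, "CCC": 0.07, "DDD": -0.01}

# Dictionary comprehension: keep only the tickers with positive returns.
winners = {ticker: r for ticker, r in returns.items() if r > 0}

# Set comprehension: the distinct signs that appear in the data.
signs = {1 if r > 0 else -1 for r in returns.values()}

# No literal tuple comprehension exists; a generator expression
# passed to tuple() produces the equivalent result.
percent = tuple(round(100 * r, 1) for r in returns.values())

print(winners)
print(signs)
print(percent)
```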

Notes

Introduction to Python for Econometrics, Statistics and Numerical Analysis: Third Edition

Code

Code and Data for Introduction to Python for Econometrics, Statistics and Numerical Analysis
This is the code directly from the notes. It has been directly stripped from the master document, and allows for simple copy-and-paste execution.

Solutions for Introduction to Python for Econometrics, Statistics and Numerical Analysis
These solution files contain answers to the exercises at the end of the chapters. They are formatted for IPython's Demo module, and instructions for use are located in the docstring.

Add Python to the Windows Registry
This file allows a particular Python installation to become the default by changing the registry. It is useful for virtual environments and allows binary installers to be used with any location.

IPython Notebooks

Example: GARCH
Example: Fama-MacBeth Regression

Data

FTSE 1984-2012 (zipped csv)
Fama-French Data (zipped csv)

Video Demonstrations

Setup

IPython
  • Core IPython - Key features of the IPython console including syntax highlighting, autocompletion, the command history and cell model.
  • IPython Magics - Magic keywords provide a wide range of features including on-the-fly configuration changes, file system manipulation, running Python programs and timing code.
  • Configuring IPython - Coming Soon. A brief introduction to customizing the IPython environment using configuration files.
IPython Notebook




2
xuehe posted on 2019-9-13 17:22:25
Data Analysis in Python
Navigation
Why Python?
Note to R Users
Note to Stata Users
1. Setting Up Python
2. Basic Python
3. Pandas
4. Installing Packages
Econometrics
Machine learning
Plotting
GIS in Python
Network Analysis
Making Python faster
Big Data / Parallelization
Text Analysis
Getting Help
Teaching Programming
R-to-Python Table
ST: iPython
ST: Command Line
ST: Git and Github

3
xuehe posted on 2019-9-13 17:24:13
Machine learning

The primary library for machine learning in Python is scikit-learn, which has its own great tutorial page.

If you’re wondering about the difference between statsmodels and scikit-learn, the answer is: there’s no easy answer.

statsmodels is primarily written for and by econometricians, while scikit-learn is primarily written for and by computer scientists and people doing machine learning. But the relationship between “econometrics” and “machine learning” is complicated. In very broad terms, machine learning tends to focus on prediction while econometrics tends to focus on testing hypotheses. But that’s somewhat simplistic.

The reason is that Econometrics and Machine Learning both developed when people in specific disciplines (economics and computer science respectively) branched off statistics to develop tools tailored for their own area. For several decades, econometrics and machine learning more or less developed independently and in parallel, each borrowing from statistics, but neither really paying attention to the other. As a result, there are some places where the two fields use the same tools but refer to them with different nomenclature, and other places where they actually do fundamentally different things.
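
To make the "testing hypotheses" versus "prediction" contrast above a little more concrete, here is a minimal sketch that fits one linear model and then asks both kinds of question of it. It deliberately uses plain NumPy rather than statsmodels or scikit-learn, and the simulated data, sample split and variable names are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): y = 1 + 2*x + noise.
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Split into an estimation sample and a hold-out sample.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit OLS once; both views below share the same coefficients.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Econometrics-style question: is the slope different from zero?
resid = y_train - X_train @ beta
dof = X_train.shape[0] - X_train.shape[1]
sigma2 = resid @ resid / dof
cov = sigma2 * np.linalg.inv(X_train.T @ X_train)
t_stats = beta / np.sqrt(np.diag(cov))

# Machine-learning-style question: how well does it predict new data?
test_mse = np.mean((y_test - X_test @ beta) ** 2)

print("coefficients:", beta)
print("t-statistics:", t_stats)
print("hold-out MSE:", test_mse)
```

The fitted coefficients are identical in both views. What differs is the summary each tradition reports: standard errors and t-statistics on one side, out-of-sample error on the other.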


4
xuehe posted on 2019-9-13 17:29:03

5
xuehe posted on 2019-9-13 17:43:22

In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number and, for instance, a multi-dimensional entry (aka multivariate data), it is said to have several attributes or features.

Learning problems fall into a few categories:

  • supervised learning, in which the data comes with additional attributes that we want to predict (see the scikit-learn supervised learning page). This problem can be either:

    • classification: samples belong to two or more classes and we want to learn from already labeled data how to predict the class of unlabeled data. An example of a classification problem would be handwritten digit recognition, in which the aim is to assign each input vector to one of a finite number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning where one has a limited number of categories and for each of the n samples provided, one is to try to label them with the correct category or class.
    • regression: if the desired output consists of one or more continuous variables, then the task is called regression. An example of a regression problem would be the prediction of the length of a salmon as a function of its age and weight.
  • unsupervised learning, in which the training data consists of a set of input vectors x without any corresponding target values. The goal in such problems may be to discover groups of similar examples within the data, where it is called clustering, or to determine the distribution of data within the input space, known as density estimation, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization (see the scikit-learn unsupervised learning page).
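
The three task types above can be sketched in a few lines of NumPy. This is not scikit-learn's API. It is a hand-rolled illustration on made-up data, with a nearest-centroid rule standing in for a classifier, least squares for regression, and a short k-means loop for clustering. The helper name classify and all data values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated hypothetical groups of 2-D points.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
points = np.vstack([group_a, group_b])
labels = np.array([0] * 50 + [1] * 50)

# Supervised, classification: learn one centroid per labeled class,
# then assign a new point to the class with the nearest centroid.
centroids = np.array([points[labels == k].mean(axis=0) for k in (0, 1)])

def classify(p):
    return int(np.argmin(np.linalg.norm(centroids - p, axis=1)))

# Supervised, regression: predict a continuous target by least squares.
x = np.linspace(0.0, 1.0, 20)
target = 3.0 * x + 0.5
design = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(design, target, rcond=None)

# Unsupervised, clustering: a few k-means iterations that never see labels.
centers = points[[0, -1]].copy()  # simple deterministic start: first and last rows
for _ in range(10):
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    centers = np.array([points[assign == k].mean(axis=0) for k in (0, 1)])

print(classify(np.array([0.2, -0.1])), classify(np.array([4.8, 5.3])))
print(coef)
print(centers)
```

The classification and regression parts both use known answers (labels, target) during fitting, while the clustering loop only sees points and has to discover the grouping on its own.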



6
xjohansen posted on 2019-9-13 18:05:53 (from mobile)
Great-value material, generously shared!

7
lang20052001 posted on 2019-9-14 07:55:27

8
labour5 posted on 2019-9-14 10:18:19

Happy Mid-Autumn Festival! Thumbs up!

9
hzhangchina posted on 2019-9-14 10:35:36
A real credit to the community!!!

10
tianwk posted on 2019-9-14 13:34:21
thanks for sharing
