Original poster: oliyiyi

[LaTeX edition] Filler post

#2261 oliyiyi posted on 2019-1-9 16:42:09
A big challenge for applied Bayesian inference is computation: converting the mathematical expression of the posterior distribution into specific inferences or predictions such as the posterior probability that some coefficient is positive, or a 90% predictive interval for some future outcome. For even moderately large or complex problems, such quantities are expressed mathematically in terms of high-dimensional integrals with no closed-form expressions.

#2262 oliyiyi posted on 2019-1-9 16:43:19
Over the past fifty years, a series of advances in computational statistics have allowed these integrals to be computed using approximations and simulations. The simulations use random numbers and are called “Monte Carlo methods,” named after the city in Europe that is famous for its gambling casinos. These methods were originally developed in the 1940s to aid in large computations for the military, and in the 1980s it became clear how to apply them to general problems in Bayesian inference.
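A minimal sketch in R of how simulation replaces the integral for the two quantities mentioned earlier (the posterior probability that a coefficient is positive, and a 90% predictive interval). The normal(0.3, 0.2) posterior, the values x_new = 1 and sigma = 1, and all variable names are assumptions chosen so that the exact answers are known and can be compared with the simulation.

```r
set.seed(1)   # reproducible random numbers
S <- 1e5      # number of simulation draws

# Assumed posterior for a coefficient: beta | data ~ normal(0.3, 0.2).
beta_draws <- rnorm(S, mean = 0.3, sd = 0.2)

# Posterior probability that the coefficient is positive:
# the integral of the posterior density over beta > 0 becomes
# the share of draws that are positive.
p_positive_mc    <- mean(beta_draws > 0)
p_positive_exact <- pnorm(0.3 / 0.2)

# 90% predictive interval for a future outcome
# y_new ~ normal(beta * x_new, sigma), with assumed x_new and sigma.
x_new <- 1
sigma <- 1
y_new_draws <- rnorm(S, mean = beta_draws * x_new, sd = sigma)
interval_mc <- quantile(y_new_draws, c(0.05, 0.95))

# Exact interval, available only because this toy model is normal.
interval_exact <- qnorm(c(0.05, 0.95),
                        mean = 0.3 * x_new,
                        sd = sqrt((0.2 * x_new)^2 + sigma^2))

print(c(mc = p_positive_mc, exact = p_positive_exact))
print(rbind(mc = interval_mc, exact = interval_exact))
```

In realistic models the draws would come from a sampler rather than rnorm, but the summaries (mean of an indicator, quantiles of simulated outcomes) are computed the same way.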

#2263 oliyiyi posted on 2019-1-9 16:44:04
So: since the 1970s–1980s, methods have been developed to perform approximate computations for Bayesian inferences that would otherwise require intractable integrals. These approximations needed to be developed one model at a time. In the 1990s–2000s, the WinBUGS software was developed, which allowed automatic computation for a large class of Bayesian models. WinBUGS (and its successors, OpenBUGS and JAGS) can be slow, and starting in 2011 we developed Stan, which uses more efficient computations (Hamiltonian Monte Carlo, the no-U-turn sampler, and automatic differentiation) so that automatic Bayesian computation can be applied to larger and more complex problems.

#2264 oliyiyi posted on 2019-1-9 16:44:42
Where we stand now is that, for a fairly broad class of models and data of moderate size, we can transparently program our Bayesian models in Stan and perform inference automatically. This represents the culmination of decades of work in computational statistics, along with corresponding decades of experience fitting and understanding these models. The challenge is not just fitting the model; it is also deciding what models to fit.

#2265 oliyiyi posted on 2019-1-9 16:45:14
Future work, by ourselves and others, will increase the speed and scalability of Stan in various ways, including more seamless implementation of parallel processing.

#2266 oliyiyi posted on 2019-1-9 16:46:05
Appealing features of Bayesian inference
Here are some reasons we like to use Bayesian methods:

Integration of data and prior information

Quantification of uncertainty, including probabilistic predictions

Ability to pipe inference directly into decision analysis

Ability to handle uncertainty in large numbers of parameters

It is said that the most important aspect of a statistical analysis is not what you do with the data, it’s what data you use. A key advantage of modern statistical methods (including Bayesian methods but also various non-Bayesian or semi-Bayesian approaches in machine learning) is that they allow you to incorporate different sorts of information into your analysis.
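As a small worked illustration of the first two items in the list above (combining data with prior information, and quantifying uncertainty), here is a sketch in R of the standard normal-normal update with known data standard deviation. The prior, the data values, and sigma are all made up for illustration.

```r
# Assumed prior: theta ~ normal(0, 1)
prior_mean <- 0
prior_sd   <- 1

# Made-up data: y_i ~ normal(theta, sigma), with sigma treated as known
y     <- c(1.2, 0.8, 1.5, 0.9, 1.1)
sigma <- 1

# Precisions (inverse variances) add; the posterior mean is a
# precision-weighted average of the prior mean and the data mean.
prior_prec <- 1 / prior_sd^2
data_prec  <- length(y) / sigma^2
post_prec  <- prior_prec + data_prec
post_mean  <- (prior_prec * prior_mean + data_prec * mean(y)) / post_prec
post_sd    <- sqrt(1 / post_prec)

# Uncertainty statements follow directly from the posterior distribution.
post_interval_90 <- qnorm(c(0.05, 0.95), mean = post_mean, sd = post_sd)
prob_positive    <- 1 - pnorm(0, mean = post_mean, sd = post_sd)

print(c(post_mean = post_mean, post_sd = post_sd))
print(post_interval_90)
print(prob_positive)
```

The posterior mean lands between the prior mean and the sample mean, pulled toward whichever source carries more precision.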

#2267 oliyiyi posted on 2019-1-9 16:47:04
Some things that Bayesian inference and Stan can’t do
Bayesian inference does not solve all statistical problems, though. One important class of problems where it is not currently possible to perform fully Bayesian inference is nonlinear classification and optimization with large datasets: familiar examples include language processing, speech and image recognition, and those computer programs that play Go or ping-pong. These problems are often attacked using Bayesian models, but the inferences used are typically only rough approximations to the mathematical Bayesian posterior distribution: the required calculations are simply too involved, and the posterior distributions tend to be multimodal and essentially impossible to fully navigate using any existing algorithm. Stan is not the best tool for these problems. We do think, however, that Stan is the best tool for fitting continuous-parameter models that arise in many application areas, including astronomy, ecology, economic forecasting, earth science, insurance, public health, and survey sampling, to name just a few. A wide-ranging set of case studies is available on the Stan website at http://mc-stan.org/users/documentation/case-studies and in the conference proceedings from every StanCon: https://github.com/stan-dev/stancon_talks.

#2268 oliyiyi posted on 2019-1-12 13:37:03
There are many ways to read data using R. We only give two examples, direct assignment and reading csv files. However, another way deserves a brief mention. It is common to come across data that is organized in flat files and delimited at preset locations on each line. This is often called a “fixed width file.”

The command for this kind of file is read.fwf. We do not explore it in detail here, but a brief example is given. For more information on how to use this command, enter the following command:
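The help request would look something like the following sketch; read.fwf is part of the utils package, which R attaches by default, so no library call should be needed.

```r
# Open the documentation page for read.fwf
help(read.fwf)

# Shorthand for the same request
?read.fwf
```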

#2269 oliyiyi posted on 2019-1-12 13:38:00
The read.fwf command requires at least two arguments. The first is the name of the file, and the second is a vector of numbers that gives the width of each column in the data file. A negative number in the vector indicates that the column should be skipped. Here we give the command to read the data file fixedWidth.dat. In this data file there are three columns. The first column is 17 characters wide, the second column is 15 characters wide, and the last column is 7 characters wide. In the example below we use the optional col.names argument to specify the names of the columns:
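A minimal sketch of such a call. Because fixedWidth.dat itself is not included here, the sketch first writes a small stand-in file with the same 17/15/7 layout. The file contents, the column names site, temp, and offices, and the use of strip.white = TRUE are assumptions made for illustration, not the original example.

```r
# Build a stand-in fixed-width file: fields padded to 17, 15, and 7 characters.
rows <- c(
  sprintf("%-17s%-15s%-7s", "Station A", "12.5", "3"),
  sprintf("%-17s%-15s%-7s", "Station B", "14.0", "5")
)
path <- tempfile(fileext = ".dat")
writeLines(rows, path)

# Read all three columns; col.names supplies the (hypothetical) names.
# strip.white = TRUE is passed on to read.table to trim padding
# from the character column.
dat <- read.fwf(path,
                widths = c(17, 15, 7),
                col.names = c("site", "temp", "offices"),
                strip.white = TRUE)
print(dat)

# A negative width skips that many characters: here the first
# 17-character column is dropped and only two columns are returned.
dat_skip <- read.fwf(path,
                     widths = c(-17, 15, 7),
                     col.names = c("temp", "offices"))
print(dat_skip)
```

To read a real file, replace path with the file name, for example "fixedWidth.dat" in the current working directory.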

