[LaTeX edition] Filler-post thread - Page 227

Post #2261

oliyiyi posted on 2019-1-9 16:42:09

A big challenge for applied Bayesian inference is computation: converting the mathematical expression of the posterior distribution into specific inferences or predictions such as the posterior probability that some coefficient is positive, or a 90% predictive interval for some future outcome. For even moderately large or complex problems, such quantities are expressed mathematically in terms of high-dimensional integrals with no closed-form expressions.

Post #2262

oliyiyi posted on 2019-1-9 16:43:19

Over the past fifty years, a series of advances in computational statistics has allowed these integrals to be computed using approximations and simulations. The simulations use random numbers and are called “Monte Carlo methods,” named after the city in Europe that is famous for its gambling casinos. These methods were originally developed in the 1940s to aid large military computations, and in the 1980s it became clear how to apply them to general problems in Bayesian inference.
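As a rough illustration of how simulation turns a posterior distribution into the kind of summaries mentioned earlier (the probability that a coefficient is positive, and a 90% predictive interval), here is a minimal R sketch. It assumes a deliberately simple model that is not from the quoted text: normally distributed data with known noise standard deviation and a normal prior on the coefficient mu, so the posterior is available in closed form and the Monte Carlo answer can be checked. The data values, the prior, and all variable names are hypothetical.

```r
set.seed(1)  # fixed seed so the simulation is reproducible

# Hypothetical data: 10 measurements with an assumed known noise sd.
y <- c(1.2, -0.4, 2.1, 0.3, 1.7, 0.9, -1.1, 2.5, 0.6, 1.4)
sigma <- 2

# Assumed prior on the coefficient mu: normal(0, 5).
prior_mean <- 0
prior_sd <- 5

# Conjugate normal posterior for mu (closed form only because the model is simple).
n <- length(y)
post_prec <- 1 / prior_sd^2 + n / sigma^2
post_sd <- sqrt(1 / post_prec)
post_mean <- (prior_mean / prior_sd^2 + sum(y) / sigma^2) / post_prec

# Monte Carlo: draw from the posterior, then summarize the draws.
n_draws <- 1e5
mu_draws <- rnorm(n_draws, post_mean, post_sd)

# Posterior probability that mu is positive: the fraction of positive draws.
p_positive_mc <- mean(mu_draws > 0)
# Exact value, for comparison with the simulation estimate.
p_positive_exact <- pnorm(0, post_mean, post_sd, lower.tail = FALSE)

# 90% predictive interval for one future observation:
# simulate a new outcome for each posterior draw, then take quantiles.
y_new <- rnorm(n_draws, mu_draws, sigma)
pred_interval <- quantile(y_new, c(0.05, 0.95))

print(c(monte_carlo = p_positive_mc, exact = p_positive_exact))
print(pred_interval)
```

In realistic models the posterior has no closed form, so the draws would come from a sampler rather than rnorm, but the summaries are computed from the draws in the same way.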

Post #2263

oliyiyi posted on 2019-1-9 16:44:04

So: since the 1970s–1980s, methods have been developed to perform approximate computations for Bayesian inferences that would otherwise require intractable integrals. These approximations needed to be developed one model at a time. In the 1990s–2000s, the WinBUGS software was developed, which allowed automatic computation for a large class of Bayesian models. WinBUGS (and its successors, OpenBUGS and JAGS) can be slow, and starting in 2011 we developed Stan, which uses more efficient computations (Hamiltonian Monte Carlo, the no-U-turn sampler, and automatic differentiation) so that automatic Bayesian computation can be applied to larger and more complex problems.

Post #2264

oliyiyi posted on 2019-1-9 16:44:42

Where we stand now is that, for a fairly broad class of models and data of moderate size, we can transparently program our Bayesian models in Stan and perform inference automatically. This represents the culmination of decades of work in computational statistics, along with corresponding decades of experience fitting and understanding these models. The challenge is not just fitting the model; it is also deciding what models to fit.

Post #2265

oliyiyi posted on 2019-1-9 16:45:14

Future work, by ourselves and others, will increase the speed and scalability of Stan in various ways, including more seamless implementation of parallel processing.

Post #2266

oliyiyi posted on 2019-1-9 16:46:05

Appealing features of Bayesian inference
Here are some reasons we like to use Bayesian methods:

Integration of data and prior information

Quantification of uncertainty, including probabilistic predictions

Ability to pipe inference directly into decision analysis

Ability to handle uncertainty in large numbers of parameters

It is said that the most important aspect of a statistical analysis is not what you do with the data, it’s what data you use. A key advantage of modern statistical methods (including Bayesian methods but also various non-Bayesian or semi-Bayesian approaches in machine learning) is that they allow you to incorporate different sorts of information into your analysis.

Post #2267

oliyiyi posted on 2019-1-9 16:47:04

Some things that Bayesian inference and Stan can’t do
Bayesian inference does not solve all statistical problems, though. One important class of problems where it is not currently possible to perform fully Bayesian inference is nonlinear classification and optimization with large datasets: familiar examples include language processing, speech and image recognition, and those computer programs that play Go or ping-pong. These problems are often attacked using Bayesian models, but the inferences used are typically only rough approximations to the mathematical Bayesian posterior distribution: the required calculations are simply too involved, and the posterior distributions tend to be multimodal and essentially impossible to fully navigate using any existing algorithm. Stan is not the best tool for these problems. We do think, however, that Stan is the best tool for fitting continuous-parameter models that arise in many application areas, including astronomy, ecology, economic forecasting, earth science, insurance, public health, and survey sampling, to name just a few. A wide-ranging set of case studies is available on the Stan website at http://mc-stan.org/users/documentation/case-studies and in the conference proceedings from every StanCon: https://github.com/stan-dev/stancon_talks.

Post #2268

oliyiyi posted on 2019-1-12 13:37:03

There are many ways to read data using R. We only give two examples, direct assignment and reading csv files. However, another way deserves a brief mention. It is common to come across data that is organized in flat files and delimited at preset locations on each line. This is often called a “fixed width file.”

The command to deal with this kind of file is read.fwf. Its use is not explored in depth here, but a brief example is given. For more information on how to use this command, enter help(read.fwf) at the R prompt.

Post #2269

oliyiyi posted on 2019-1-12 13:38:00

The read.fwf command requires at least two arguments. The first is the name of the file and the second is a vector of numbers that gives the width of each column in the data file. A negative number in the vector indicates that the column should be skipped. As an example, consider the data file fixedWidth.dat, which has three columns: the first is 17 characters wide, the second is 15 characters wide, and the last is 7 characters wide. The optional col.names argument specifies the names of the columns.
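A minimal sketch of such a call follows. Since fixedWidth.dat is not reproduced in this thread, the example writes its own small temporary file with the same 17/15/7 layout; the file contents and the column names (city, state, count) are hypothetical and chosen only for illustration.

```r
# Build three hypothetical fixed-width records:
# 17 characters, then 15 characters, then 7 characters (right-aligned).
records <- sprintf("%-17s%-15s%7s",
                   c("Albany", "Buffalo", "Rochester"),
                   c("New York", "New York", "New York"),
                   c("97856", "255284", "205695"))

# Write them to a temporary file standing in for fixedWidth.dat.
path <- tempfile(fileext = ".dat")
writeLines(records, path)

# Read all three columns, naming them with col.names.
# strip.white and stringsAsFactors are passed through to read.table:
# the first trims the padding spaces, the second keeps text as character.
dat <- read.fwf(path,
                widths = c(17, 15, 7),
                col.names = c("city", "state", "count"),
                strip.white = TRUE,
                stringsAsFactors = FALSE)

# A negative width skips that column: here the first 17 characters are dropped,
# so only two column names are needed.
dat_skip <- read.fwf(path,
                     widths = c(-17, 15, 7),
                     col.names = c("state", "count"),
                     strip.white = TRUE,
                     stringsAsFactors = FALSE)

print(dat)
print(dat_skip)
```

Note that the number of names in col.names has to match the number of columns actually read, which is why the second call supplies two names rather than three.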

