楼主: ReneeBK
1899 9

【Programming using PyMC3】Doing Bayesian Data Analysis by John K. Kruschke [推广有奖]

  • 1关注
  • 62粉丝

VIP

已卖:4897份资源

学术权威

14%

还不是VIP/贵宾

-

TA的文库  其他...

R资源总汇

Panel Data Analysis

Experimental Design

威望
1
论坛币
49635 个
通用积分
55.6937
学术水平
370 点
热心指数
273 点
信用等级
335 点
经验
57805 点
帖子
4005
精华
21
在线时间
582 小时
注册时间
2005-5-8
最后登录
2023-11-26

楼主
ReneeBK 发表于 2016-12-11 11:04:26 |AI写论文

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币

本帖隐藏的内容

Doing Bayesian Data Analysis-master.zip (11.59 MB, 需要: 5 个论坛币)


二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:Programming Bayesian Analysis Kruschke Program

本帖被以下文库推荐

沙发
ReneeBK(未真实交易用户) 发表于 2016-12-11 11:04:47
  1. """
  2. Goal: Toss a coin N times and compute the running proportion of heads.
  3. """
  4. import matplotlib.pyplot as plt
  5. import numpy as np

  6. # Specify the total number of flips, denoted N.
  7. N = 500
  8. # Generate a random sample of N flips for a fair coin (heads=1, tails=0);
  9. np.random.seed(47405)
  10. flip_sequence = np.random.choice(a=(0, 1), p=(.5, .5), size=N, replace=True)
  11. # Compute the running proportion of heads:
  12. r = np.cumsum(flip_sequence)
  13. n = np.linspace(1, N, N)  # n is a vector.
  14. run_prop = r/n  # component by component division.

  15. # Graph the running proportion:
  16. plt.plot(n, run_prop, '-o', )
  17. plt.xscale('log')  # an alternative to plot() and xscale() is semilogx()
  18. plt.xlim(1, N)
  19. plt.ylim(0, 1)
  20. plt.xlabel('Flip Number')
  21. plt.ylabel('Proportion Heads')
  22. plt.title('Running Proportion of Heads')
  23. # Plot a dotted horizontal line at y=.5, just as a reference line:
  24. plt.axhline(y=.5, ls='dashed')

  25. # Display the beginning of the flip sequence.
  26. flipletters = ''
  27. for i in flip_sequence[:10]:
  28.     if i == 1:
  29.         flipletters += 'H'
  30.     else:
  31.         flipletters += 'T'

  32. plt.text(10, 0.8, 'Flip Sequence = %s...' % flipletters)
  33. # Display the relative frequency at the end of the sequence.
  34. plt.text(25, 0.2, 'End Proportion = %s' % run_prop[-1])

  35. plt.savefig('Figure_3.1.png')
复制代码

藤椅
ReneeBK(未真实交易用户) 发表于 2016-12-11 11:07:23
  1. """
  2. Inferring a binomial proportion via exact mathematical analysis.
  3. """
  4. import sys
  5. import numpy as np
  6. from scipy.stats import beta
  7. from scipy.special import beta as beta_func
  8. import matplotlib.pyplot as plt
  9. from HDIofICDF import *


  10. def bern_beta(prior_shape, data_vec, cred_mass=0.95):
  11.     """Bayesian updating for Bernoulli likelihood and beta prior.
  12.      Input arguments:
  13.        prior_shape
  14.          vector of parameter values for the prior beta distribution.
  15.        data_vec
  16.          vector of 1's and 0's.
  17.        cred_mass
  18.          the probability mass of the HDI.
  19.      Output:
  20.        post_shape
  21.          vector of parameter values for the posterior beta distribution.
  22.      Graphics:
  23.        Creates a three-panel graph of prior, likelihood, and posterior
  24.        with highest posterior density interval.
  25.      Example of use:
  26.      post_shape = bern_beta(prior_shape=[1,1] , data_vec=[1,0,0,1,1])"""

  27.     # Check for errors in input arguments:
  28.     if len(prior_shape) != 2:
  29.         sys.exit('prior_shape must have two components.')
  30.     if any([i < 0 for i in prior_shape]):
  31.         sys.exit('prior_shape components must be positive.')
  32.     if any([i != 0 and i != 1 for i in data_vec]):
  33.         sys.exit('data_vec must be a vector of 1s and 0s.')
  34.     if cred_mass <= 0 or cred_mass >= 1.0:
  35.         sys.exit('cred_mass must be between 0 and 1.')

  36.     # Rename the prior shape parameters, for convenience:
  37.     a = prior_shape[0]
  38.     b = prior_shape[1]
  39.     # Create summary values of the data:
  40.     z = sum(data_vec[data_vec == 1])  # number of 1's in data_vec
  41.     N = len(data_vec)   # number of flips in data_vec
  42.     # Compute the posterior shape parameters:
  43.     post_shape = [a+z, b+N-z]
  44.     # Compute the evidence, p(D):
  45.     p_data = beta_func(z+a, N-z+b)/beta_func(a, b)
  46.     # Construct grid of theta values, used for graphing.
  47.     bin_width = 0.005  # Arbitrary small value for comb on theta.
  48.     theta = np.arange(bin_width/2, 1-(bin_width/2)+bin_width, bin_width)
  49.     # Compute the prior at each value of theta.
  50.     p_theta = beta.pdf(theta, a, b)
  51.     # Compute the likelihood of the data at each value of theta.
  52.     p_data_given_theta = theta**z * (1-theta)**(N-z)
  53.     # Compute the posterior at each value of theta.
  54.     post_a = a + z
  55.     post_b = b+N-z
  56.     p_theta_given_data = beta.pdf(theta, a+z, b+N-z)
  57.     # Determine the limits of the highest density interval
  58.     intervals = HDIofICDF(beta, cred_mass, a=post_shape[0], b=post_shape[1])

  59.     # Plot the results.
  60.     plt.figure(figsize=(12, 12))
  61.     plt.subplots_adjust(hspace=0.7)

  62.     # Plot the prior.
  63.     locx = 0.05
  64.     plt.subplot(3, 1, 1)
  65.     plt.plot(theta, p_theta)
  66.     plt.xlim(0, 1)
  67.     plt.ylim(0, np.max(p_theta)*1.2)
  68.     plt.xlabel(r'$\theta$')
  69.     plt.ylabel(r'$P(\theta)$')
  70.     plt.title('Prior')
  71.     plt.text(locx, np.max(p_theta)/2, r'beta($\theta$;%s,%s)' % (a, b))
  72.     # Plot the likelihood:
  73.     plt.subplot(3, 1, 2)
  74.     plt.plot(theta, p_data_given_theta)
  75.     plt.xlim(0, 1)
  76.     plt.ylim(0, np.max(p_data_given_theta)*1.2)
  77.     plt.xlabel(r'$\theta$')
  78.     plt.ylabel(r'$P(D|\theta)$')
  79.     plt.title('Likelihood')
  80.     plt.text(locx, np.max(p_data_given_theta)/2, 'Data: z=%s, N=%s' % (z, N))
  81.     # Plot the posterior:
  82.     plt.subplot(3, 1, 3)
  83.     plt.plot(theta, p_theta_given_data)
  84.     plt.xlim(0, 1)
  85.     plt.ylim(0, np.max(p_theta_given_data)*1.2)
  86.     plt.xlabel(r'$\theta$')
  87.     plt.ylabel(r'$P(\theta|D)$')
  88.     plt.title('Posterior')
  89.     locy = np.linspace(0, np.max(p_theta_given_data), 5)
  90.     plt.text(locx, locy[1], r'beta($\theta$;%s,%s)' % (post_a, post_b))
  91.     plt.text(locx, locy[2], 'P(D) = %g' % p_data)
  92.     # Plot the HDI
  93.     plt.text(locx, locy[3],
  94.              'Intervals = %.3f - %.3f' % (intervals[0], intervals[1]))
  95.     plt.fill_between(theta, 0, p_theta_given_data,
  96.                     where=np.logical_and(theta > intervals[0],
  97.                     theta < intervals[1]),
  98.                         color='blue', alpha=0.3)
  99.     return intervals

  100. data_vec = np.repeat([1, 0], [11, 3])  # 11 heads, 3 tail
  101. intervals = bern_beta(prior_shape=[100, 100], data_vec=data_vec)
  102. plt.savefig('Figure_5.2.png')
  103. plt.show()
复制代码

板凳
ReneeBK(未真实交易用户) 发表于 2016-12-11 11:08:50
  1. """
  2. Posterior predictive check. Examine the veracity of the winning model by
  3. simulating data sampled from the winning model and see if the simulated data
  4. 'look like' the actual data.
  5. """
  6. import numpy as np
  7. from scipy.stats import beta
  8. import matplotlib.pyplot as plt

  9. # Specify known values of prior and actual data.
  10. prior_a = 100
  11. prior_b = 1
  12. actual_data_Z = 8
  13. actual_data_N = 12
  14. # Compute posterior parameter values.
  15. post_a = prior_a + actual_data_Z
  16. post_b = prior_b + actual_data_N - actual_data_Z
  17. # Number of flips in a simulated sample should match the actual sample size:
  18. sim_sample_size = actual_data_N
  19. # Designate an arbitrarily large number of simulated samples.
  20. n_sim_samples = 1000
  21. # Set aside a vector in which to store the simulation results.
  22. sim_sample_Z_record = np.zeros(n_sim_samples)
  23. # Now generate samples from the posterior.
  24. for sample_idx in range(0, n_sim_samples):
  25.     # Generate a theta value for the new sample from the posterior.
  26.     sample_theta = beta.rvs(post_a, post_b)
  27.     # Generate a sample, using sample_theta.
  28.     sample_data = np.random.choice([0, 1], p=[1-sample_theta, sample_theta],
  29.                                   size=sim_sample_size, replace=True)
  30.     sim_sample_Z_record[sample_idx] = sum(sample_data)


  31. ## Make a histogram of the number of heads in the samples.
  32. plt.hist(sim_sample_Z_record)
  33. plt.show()
复制代码

报纸
ReneeBK(未真实交易用户) 发表于 2016-12-11 11:10:40
  1. """
  2. Use this program as a template for experimenting with the Metropolis algorithm
  3. applied to 2 parameters called theta1,theta2 defined on the domain [0,1]x[0,1].
  4. """
  5. from __future__ import division
  6. import numpy as np
  7. from scipy.stats import beta
  8. import matplotlib.pyplot as plt


  9. # Define the likelihood function.
  10. # The input argument is a vector: theta = [theta1 , theta2]

  11. def likelihood(theta):
  12.     # Data are constants, specified here:
  13.     z1, N1, z2, N2 = 5, 7, 2, 7
  14.     likelihood = (theta[0]**z1 * (1-theta[0])**(N1-z1)
  15.                  * theta[1]**z2 * (1-theta[1])**(N2-z2))
  16.     return likelihood


  17. # Define the prior density function.
  18. # The input argument is a vector: theta = [theta1 , theta2]
  19. def prior(theta):
  20.     # Here's a beta-beta prior:
  21.     a1, b1, a2, b2 = 3, 3, 3, 3
  22.     prior = beta.pdf(theta[0], a1, b1) * beta.pdf(theta[1], a2, b2)
  23.     return prior


  24. # Define the relative probability of the target distribution, as a function
  25. # of theta.  The input argument is a vector: theta = [theta1 , theta2].
  26. # For our purposes, the value returned is the UNnormalized posterior prob.
  27. def target_rel_prob(theta):
  28.     if ((theta >= 0.0).all() & (theta <= 1.0).all()):
  29.         target_rel_probVal =  likelihood(theta) * prior(theta)
  30.     else:
  31.         # This part is important so that the Metropolis algorithm
  32.         # never accepts a jump to an invalid parameter value.
  33.         target_rel_probVal = 0.0
  34.     return target_rel_probVal
  35. #    if ( all( theta >= 0.0 ) & all( theta <= 1.0 ) ) {
  36. #        target_rel_probVal =  likelihood( theta ) * prior( theta )


  37. # Specify the length of the trajectory, i.e., the number of jumps to try:
  38. traj_length = 5000 # arbitrary large number
  39. # Initialize the vector that will store the results.
  40. trajectory = np.zeros((traj_length, 2))
  41. # Specify where to start the trajectory
  42. trajectory[0, ] = [0.50, 0.50] # arbitrary start values of the two param's
  43. # Specify the burn-in period.
  44. burn_in = np.ceil(.1 * traj_length) # arbitrary number
  45. # Initialize accepted, rejected counters, just to monitor performance.
  46. n_accepted = 0
  47. n_rejected = 0
  48. # Specify the seed, so the trajectory can be reproduced.
  49. np.random.seed(47405)
  50. # Specify the covariance matrix for multivariate normal proposal distribution.
  51. n_dim, sd1, sd2 = 2, 0.2, 0.2
  52. covar_mat = [[sd1**2, 0], [0, sd2**2]]

  53. # Now generate the random walk. step is the step in the walk.
  54. for step in range(traj_length-1):
  55.     current_position = trajectory[step, ]
  56.     # Use the proposal distribution to generate a proposed jump.
  57.     # The shape and variance of the proposal distribution can be changed
  58.     # to whatever you think is appropriate for the target distribution.
  59.     proposed_jump = np.random.multivariate_normal(mean=np.zeros((n_dim)),
  60.                                                  cov=covar_mat)
  61.     # Compute the probability of accepting the proposed jump.
  62.     prob_accept = np.minimum(1, target_rel_prob(current_position + proposed_jump)
  63.                             / target_rel_prob(current_position))
  64.     # Generate a random uniform value from the interval [0,1] to
  65.     # decide whether or not to accept the proposed jump.
  66.     if np.random.rand() < prob_accept:
  67.         # accept the proposed jump
  68.         trajectory[step+1, ] = current_position + proposed_jump
  69.         # increment the accepted counter, just to monitor performance
  70.         if step > burn_in:
  71.             n_accepted += 1
  72.     else:
  73.         # reject the proposed jump, stay at current position
  74.         trajectory[step+1, ] = current_position
  75.         # increment the rejected counter, just to monitor performance
  76.         if step > burn_in:
  77.             n_rejected += 1

  78. # End of Metropolis algorithm.

  79. #-----------------------------------------------------------------------
  80. # Begin making inferences by using the sample generated by the
  81. # Metropolis algorithm.

  82. # Extract just the post-burnIn portion of the trajectory.
  83. accepted_traj = trajectory[burn_in:]

  84. # Compute the means of the accepted points.
  85. mean_traj =  np.mean(accepted_traj, axis=0)
  86. # Compute the standard deviations of the accepted points.
  87. stdTraj =  np.std(accepted_traj, axis=0)

  88. # Plot the trajectory of the last 500 sampled values.
  89. plt.plot(accepted_traj[:,0], accepted_traj[:,1], marker='o', alpha=0.3)
  90. plt.xlim(0, 1)
  91. plt.ylim(0, 1)
  92. plt.xlabel(r'$\theta1$')
  93. plt.ylabel(r'$\theta2$')

  94. # Display means in plot.
  95. plt.plot(0, label='M = %.3f, %.3f' % (mean_traj[0], mean_traj[1]), alpha=0.0)
  96. # Display rejected/accepted ratio in the plot.
  97. plt.plot(0, label=r'$N_{pro}=%s$ $\frac{N_{acc}}{N_{pro}} = %.3f$' % (len(accepted_traj), (n_accepted/len(accepted_traj))), alpha=0)

  98. # Evidence for model, p(D).
  99. # Compute a,b parameters for beta distribution that has the same mean
  100. # and stdev as the sample from the posterior. This is a useful choice
  101. # when the likelihood function is binomial.
  102. a =   mean_traj * ((mean_traj*(1-mean_traj)/stdTraj**2) - np.ones(n_dim))
  103. b = (1-mean_traj) * ( (mean_traj*(1-mean_traj)/stdTraj**2) - np.ones(n_dim))
  104. # For every theta value in the posterior sample, compute
  105. # beta.pdf(theta, a, b) / likelihood(theta) * prior(theta)
  106. # This computation assumes that likelihood and prior are properly normalized,
  107. # i.e., not just relative probabilities.

  108. wtd_evid = np.zeros(np.shape(accepted_traj)[0])
  109. for idx in range(np.shape(accepted_traj)[0]):
  110.     wtd_evid[idx] = (beta.pdf(accepted_traj[idx,0],a[0],b[0] )
  111.         * beta.pdf(accepted_traj[idx,1],a[1],b[1]) /
  112.         (likelihood(accepted_traj[idx,]) * prior(accepted_traj[idx,])))

  113. p_data = 1 / np.mean(wtd_evid)
  114. # Display p(D) in the graph
  115. plt.plot(0, label='p(D) = %.3e' % p_data, alpha=0)
  116. plt.legend(loc='upper left')
  117. plt.savefig('Figure_8.3.png')

  118. # Estimate highest density region by evaluating posterior at each point.
  119. accepted_traj = trajectory[burn_in:]
  120. npts = np.shape(accepted_traj)[0]
  121. post_prob = np.zeros((npts))
  122. for ptIdx in range(npts):
  123.     post_prob[ptIdx] = target_rel_prob(accepted_traj[ptIdx,])

  124. # Determine the level at which credmass points are above:
  125. credmass = 0.95
  126. waterline = np.percentile(post_prob, (credmass))

  127. HDI_points = accepted_traj[post_prob > waterline, ]

  128. plt.figure()
  129. plt.plot(HDI_points[:,0], HDI_points[:,1], 'ro')
  130. plt.xlim(0,1)
  131. plt.ylim(0,1)
  132. plt.xlabel(r'$\theta1$')
  133. plt.ylabel(r'$\theta2$')

  134. # Display means in plot.
  135. plt.plot(0, label='M = %.3f, %.3f' % (mean_traj[0], mean_traj[1]), alpha=0.0)
  136. # Display rejected/accepted ratio in the plot.
  137. plt.plot(0, label=r'$N_{pro}=%s$ $\frac{N_{acc}}{N_{pro}} = %.3f$' % (len(accepted_traj), (n_accepted/len(accepted_traj))), alpha=0)
  138. # Display p(D) in the graph
  139. plt.plot(0, label='p(D) = %.3e' % p_data, alpha=0)
  140. plt.legend(loc='upper left')

  141. plt.savefig('Figure_8.3_HDI.png')

  142. plt.show()
复制代码

地板
soccy(未真实交易用户) 发表于 2016-12-11 18:49:02
......

7
franky_sas(未真实交易用户) 发表于 2016-12-12 12:04:48

8
terryzhao1(真实交易用户) 在职认证  发表于 2016-12-24 17:11:20
好东西,可以看一看

9
terryzhao1(真实交易用户) 在职认证  发表于 2016-12-24 17:11:21
好东西,可以看一看。

10
restalker(真实交易用户) 发表于 2016-12-26 17:28:25
回复过了

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群
GMT+8, 2025-12-31 02:50