
Biostatistics 276
Spring 2005 - John Boscardin

Last Update: 06/02/05

Announcements
- Please let me have your final projects by Friday, June 17.
A formal writeup is not necessary, but please give results of analyzing
a data set of interest.
Schedule
Class meets Monday 1 to 3 and Thursday 1 to 3 in CHS 61-235.
Office Hours
Tuesdays 10-12 or by appointment.
Reading List
Books
- (Required) Robert, CP, Casella, G (2004).
Monte Carlo Statistical Methods, Second Edition.
New York: Springer.
- (Optional) Liu, JS (2001).
Monte Carlo Strategies for Scientific Computing.
New York: Springer.
- (Optional) Gamerman, D (1997). Markov Chain Monte Carlo.
New York: CRC.
- (Optional) Chen, MH, Shao, QM, Ibrahim, JG (2000).
Monte Carlo Methods in Bayesian Computation.
New York: Springer.
- (Optional) Gelman, A, Carlin, JB, Stern, HS, and Rubin, DB
(2003). Bayesian Data Analysis, Second Edition.
New York: Chapman and Hall.
- (Optional) Carlin, BP, Louis, TA (2000). Bayes and Empirical
Bayes Methods for Data Analysis, Second Edition. New York:
Chapman and Hall.
- (Optional) Congdon, P (2001). Bayesian Statistical Modelling.
New York: Wiley.
Manuscripts (JSTOR links will only work from UCLA IP addresses)
- Lindley, DV, and Smith, AFM, "Bayes Estimates for the Linear Model".
Journal of the Royal Statistical Society. Series B (Methodological), Vol. 34, No. 1. (1972), pp. 1-41.
Get
it from JSTOR (basics for Bayesian hierarchical models).
- Tanner, MA, and Wong, WH, "The Calculation of Posterior Distributions by Data Augmentation".
Journal of the American Statistical Association, Vol. 82,
No. 398. (Jun., 1987), pp. 528-540.
Get
it from JSTOR (demonstration that two-block Gibbs sampling works).
- Gelfand, AE, and Smith, AFM, "Sampling-Based Approaches to Calculating Marginal Densities".
Journal of the American Statistical Association, Vol. 85,
No. 410. (Jun., 1990), pp. 398-409.
Get
it from JSTOR (one of the most cited papers in statistics).
- Gelfand, AE, Hills, SE, Racine-Poon, A, and Smith, AFM,
"Illustration of Bayesian Inference in Normal Data Models Using Gibbs Sampling".
Journal of the American Statistical Association, Vol. 85,
No. 412. (Dec., 1990), pp. 972-985.
Get
it from JSTOR (good example list).
- Andrews, DF and Mallows, CL,
"Scale Mixtures of Normal Distributions".
JRSS B, Vol. 36, No. 1. (1974), pp. 99-102.
Get
it from JSTOR.
- Carlin, BP, Polson, NG, and Stoffer, DS,
"A Monte-Carlo Approach to Nonnormal and Nonlinear State-Space
Modeling".
JASA, Vol. 87, No. 418. (1992), pp. 493-500.
Get
it from JSTOR.
- Dempster, AP, Laird, NM, Rubin, DB,
"Maximum Likelihood from Incomplete Data via the EM Algorithm".
JRSS B, Vol. 39, No. 1. (1977), pp. 1-38.
Get
it from JSTOR.
- Albert, JH, Chib, S,
"Bayesian Analysis of Binary and Polychotomous Response Data".
JASA, Vol. 88, No. 422. (1993), pp. 669-679.
Get
it from JSTOR.
- Gilks, WR, Wild, P,
"Adaptive rejection sampling for Gibbs sampling".
Applied Statistics, Vol. 41 (1992), pp. 337-348.
Get
it from JSTOR.
- Gilks, WR,
"Derivative-free adaptive rejection sampling for Gibbs sampling".
in Bayesian Statistics 4 (1992), Bernardo et al. (editors).
- Gilks, WR, Best, NG, Tan, KKC,
"Adaptive rejection Metropolis sampling within Gibbs sampling".
Applied Statistics, Vol. 44 (1995), pp. 455-472.
Get
it from JSTOR.
- Scott, SL, "Bayesian methods for hidden Markov models: Recursive
computing in the 21st century".
JASA, Vol. 97 (2002), pp. 337-351.
Get
it from ISI Web of Science (search topic "bayesian" and author "scott sl").
- Carter, CK and Kohn, R, "On Gibbs sampling for state space models".
Biometrika, Vol. 81 (1994), pp. 541-553.
Get
it from JSTOR.
- Shephard, N, "Partial non-Gaussian state space".
Biometrika, Vol. 81 (1994), pp. 115-131.
Get
it from JSTOR.
- Chib, S, "Calculating posterior distributions and
modal estimates in Markov mixture models".
Journal of Econometrics, Vol. 75 (1996), pp. 79-97.
Get it from Science Direct.
- Leroux, BG, Puterman, ML, "Maximum-Penalized-Likelihood Estimation for Independent
and Markov-Dependent Mixture Models". Biometrics, Vol. 48 (1992), pp. 545-558.
Get
it from JSTOR.
- Kohn, R, Ansley, C,
"A new algorithm for spline smoothing based on smoothing a stochastic process".
SIAM Journal on Scientific and Statistical Computing, Vol. 8 (1987), pp. 33-48.
Get
it from SIAM.
- Anderson, S, Jones, R, Swanson, G,
"Smoothing polynomial splines for bivariate data".
SIAM Journal on Scientific and Statistical Computing, Vol. 11 (1990), pp. 749-766.
Get
it from SIAM.
- Wahba, G,
"Bayesian confidence intervals for the cross-validated smoothing spline".
JRSS B, Vol. 45 (1983), pp. 133-150.
Get
it from JSTOR.
- Wecker, W, Ansley, C,
"The signal extraction approach to non-linear regression and spline smoothing".
JASA, Vol. 78 (1983), pp. 81-89.
Get
it from JSTOR.
- Green, PJ, "Reversible jump Markov chain Monte Carlo computation
and Bayesian model determination".
Biometrika, Vol. 82 (1995), pp. 711-732.
Get
it from JSTOR.
- Chib, S and Jeliazkov, I, "Marginal likelihood from the
Metropolis-Hastings output". JASA, Vol. 96 (2001), pp. 270-281.
Get
it from ISI Web of Science (search topic "marginal" and author "chib s*").
- Chib, S, "Marginal likelihood from the Gibbs output". JASA, Vol. 90
(1995), pp. 1313-1321.
Get
it from JSTOR.
- Chipman, H, George, E, and McCulloch, R,
"Bayesian CART model search".
JASA, Vol. 93 (1998), pp. 935-948.
Get
it from JSTOR.
- Chipman, H, George, E, and McCulloch, R,
"BART: Bayesian Additive Regression Trees".
Technical Report (2005).
Get
it from Chipman's homepage.
- Chib, S and Greenberg, E,
"Analysis of multivariate probit models".
Biometrika, Vol. 85 (1998), pp. 347-361.
Get
it from JSTOR.
Summary of lectures
- April 4: Basic concepts of Bayesian inference
through simulation. Importance sampling (and renormalized Importance
sampling) for estimating posterior mean and more general summary
statistics. Most material for this lecture is found in Robert and Casella.
- April 7: Accept-reject algorithm. Doing general Bayesian
inference through accept-reject (a foreshadowing of the Metropolis algorithm,
but not usually useful in practice) from Gamerman's book.
Factorization ideas. Marginal vs. conditional
distributions. Simple example (Bayesian one-way ANOVA or normal means
model) from the Gelman et al. book (Chapter 5).
- April 11: Computational details for the normal means
model. All three algorithms presented with
the 8 schools data from Gelman et al.
Basic Markov Chain notation. Markov chains are defined by transition kernels.
For well-behaved transition kernels (we will discuss details in a later lecture),
the chain will have (1) a unique stationary measure, and (2) the distribution of
the simulated values will converge to this stationary measure as the chain
is run for more and more iterations. Idea of MCMC: define a chain which
has p(theta|y) as its stationary distribution. Run the chain long enough
and you should get simulations that are pretty close to coming from the posterior
distribution of theta. Idea of Gibbs: the Markov chain defined by drawing
repeatedly from the full conditional posterior density of each parameter block has
the joint posterior density of all parameter blocks as its stationary distribution.
- April 14:
Tanner and Wong (1987) argument that two-block Gibbs (data augmentation)
has the marginal posterior density as its stationary distribution.
Bayesian regression models. Statistical computing ideas (Cholesky and QR)
for regression.
- April 18:
Reversibility of a Markov chain. If a chain is reversible with
respect to a probability measure, then that measure is also a stationary
distribution. An irreducible, aperiodic Markov chain that has a stationary
distribution converges to it, and that distribution is unique.
Senses of convergence (convergence in distribution,
ergodic convergence of Monte Carlo averages to posterior expectations,
existence of a central limit theorem to describe the rate of convergence).
Examples of Gibbs sampling settings: (1) scale mixture of normals (Andrews and Mallows, 1974;
Carlin, Polson, and Stoffer, 1992); (2) Poisson changepoint model.
- April 21:
Basic C programming for MCMC algorithms. Poisson changepoint example
(C code) (R/S+ code)
(graphics).
Data completion ideas. Censored data and probit regression examples.
EM algorithm to maximize likelihood (or posterior distribution)
through iterative maximization of the complete data likelihood.
- April 25:
Gibbs for censored data (Robert and Casella text)
and probit regression (Albert and Chib, 1993) examples.
Student t-link with 8 degrees of freedom gives a close
approximation to logistic regression.
Metropolis-Hastings algorithm. Proof that the Metropolis-Hastings chain is
reversible with respect to the target density.
- April 28:
Using GSL. Rewrite of changepoint program.
This version reads the mining accident data from the file
minedata.txt.
Discussion of programming probit regression in GSL.
Gibbs sampler as a composition of Metropolis-Hastings algorithms.
- May 2:
Hammersley-Clifford theorem. Positivity condition.
Slice sampling and Gibbs sampling.
Pseudodata treatment of normal prior for the regression
parameters. Detail for QR-based regression in
GSL (with informative prior on beta).
Handed out a
short summary of last month's lectures
and a couple of comments on chessboard and Odell-Feiveson problems.
- May 5:
More details on programming the probit model in C
(C code, header file)
and in R (R code). Here is a simulated data set to try
(X matrix, Y vector).
You might also want to use the Finney data
from Albert and Chib (1993). Rao-Blackwell method for calculating
a marginal posterior density estimate. Gibbs sampling for finite mixture
models.
- May 9:
Hidden Markov models using notation and ideas from Scott (2002).
Forward-backward algorithm for sampling the hidden state vector
directly from its posterior distribution (as opposed to one component
at a time given the two neighboring hidden states).
- May 12:
Forward algorithm decomposed as prediction plus updating steps.
Relation to Kalman filtering.
Non-stochastic (smoothing) versus stochastic backward algorithms.
Programming neighboring Gibbs in the Poisson-Gamma HMM model
(C code, header file,
lamb data of Leroux and Puterman (1992)).
Introduction to state space models.
- May 16:
More details of state space models.
Smoothing spline background information.
Bayesian stochastic process representation
of smoothing spline (Wahba, 1983).
State space version of smoothing spline model
(Wecker and Ansley, 1983; Kohn and Ansley, 1987; Anderson et al., 1990).
Neighboring Gibbs algorithm (Carlin, Polson, and Stoffer, 1992).
Sampling from the joint distribution of the state vector
(Carter and Kohn, 1994; Shephard, 1994).
Kalman filter is the forward
algorithm. Deterministic and stochastic backward algorithms.
- May 19:
Convergence diagnostics. Convergence in three senses:
(i) to the stationary distribution; (ii) of averages; (iii) of thinned
samples to iid. Multiple chain MCMC and Gelman-Rubin statistics.
"Ideal" acceptance rates for MH. Material for this lecture is
all contained or referenced in Chapter 12 of Robert and Casella.
- May 23:
Reversible jump MCMC (Green, 1995). Example of changepoint
model: piecewise constant intensity
for a non-homogeneous Poisson process.
- May 26:
Computing Bayes factors. Chib (1995) for Gibbs
sampling setting. Chib and Jeliazkov (2001) for
M-H setting.
- June 2:
Bayesian CART and Bayesian additive regression trees (BART).
References to Chipman, George, and McCulloch (1998)
and Chipman, George, and McCulloch (2005).
- June 6:
Missing data basics. Multivariate normal model (Schafer's book
and SAS Proc MI manual), multivariate probit model (Chib and Greenberg, 1998).
- June 9:
Modeling and sampling the covariance and correlation matrices for
the MVN and MVP models.
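For the April 4 lecture, here is a minimal sketch of renormalized importance sampling for a posterior mean. It is written in Python purely for illustration (the course code is in C and R). The toy target, the proposal, and all function names (snis_mean, log_target, draw_q, log_q) are assumptions made up for this example and do not come from Robert and Casella.

```python
import math
import random

def snis_mean(log_unnorm_post, draw_proposal, log_proposal, n, rng):
    """Renormalized importance sampling estimate of E[theta | y]."""
    thetas = [draw_proposal(rng) for _ in range(n)]
    log_w = [log_unnorm_post(t) - log_proposal(t) for t in thetas]
    m = max(log_w)  # subtract the max before exponentiating, for stability
    w = [math.exp(lw - m) for lw in log_w]
    # Dividing by the sum of the weights removes unknown normalizing constants.
    return sum(wi * t for wi, t in zip(w, thetas)) / sum(w)

# Hypothetical toy target: an unnormalized N(2, 1) "posterior".
def log_target(theta):
    return -0.5 * (theta - 2.0) ** 2

# Hypothetical proposal: N(0, 3^2), deliberately wider than the target.
def draw_q(rng):
    return rng.gauss(0.0, 3.0)

def log_q(theta):
    return -0.5 * (theta / 3.0) ** 2  # constants cancel after renormalizing

rng = random.Random(276)
estimate = snis_mean(log_target, draw_q, log_q, 20000, rng)
print(round(estimate, 2))  # should land close to the true mean of 2
```

Because the weights are renormalized, neither the target nor the proposal density needs its normalizing constant, which is what makes the method usable for posteriors known only up to p(y).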
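For the April 25 lecture, the reversibility argument for Metropolis-Hastings can be checked numerically on a small finite state space. The sketch below builds the M-H transition matrix and verifies detailed balance and stationarity. The four-state target pi, the asymmetric proposal q, and the name mh_kernel are assumptions chosen only for illustration.

```python
def mh_kernel(pi, q):
    """Metropolis-Hastings transition matrix for target pi and proposal matrix q."""
    k = len(pi)
    P = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j and q[i][j] > 0.0:
                accept = min(1.0, (pi[j] * q[j][i]) / (pi[i] * q[i][j]))
                P[i][j] = q[i][j] * accept
        P[i][i] = 1.0 - sum(P[i])  # rejected proposals leave the chain in place
    return P

# Hypothetical target on four states.
pi = [0.1, 0.2, 0.3, 0.4]
k = len(pi)

# Hypothetical asymmetric proposal: step right w.p. 0.7, left w.p. 0.3 (wrapping).
q = [[0.0] * k for _ in range(k)]
for i in range(k):
    q[i][(i + 1) % k] = 0.7
    q[i][(i - 1) % k] = 0.3

P = mh_kernel(pi, q)

# Detailed balance: pi_i P_ij = pi_j P_ji for every pair of states.
db_gap = max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
             for i in range(k) for j in range(k))

# Stationarity follows from detailed balance: pi P = pi.
pi_next = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]

print(db_gap < 1e-12, [round(p, 6) for p in pi_next])
```

The key identity is pi_i q_ij min(1, pi_j q_ji / (pi_i q_ij)) = min(pi_i q_ij, pi_j q_ji), which is symmetric in i and j.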
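For the Poisson changepoint example (April 18 and April 21), a Gibbs sampler can be sketched as follows. This is not the class C or R program. It assumes a simplified model with fixed Gamma(a, b) priors (b a rate) on the two intensities and a uniform prior on the changepoint, and it uses made-up counts rather than the mining accident data. The name changepoint_gibbs is hypothetical.

```python
import math
import random

def changepoint_gibbs(y, n_iter, a=1.0, b=1.0, seed=0):
    """Gibbs sampler for y_t ~ Poisson(lam1), t <= k; y_t ~ Poisson(lam2), t > k."""
    rng = random.Random(seed)
    n = len(y)
    cum = [0]
    for v in y:
        cum.append(cum[-1] + v)  # cum[j] = y[0] + ... + y[j-1]
    k = n // 2
    draws = []
    for _ in range(n_iter):
        # Full conditionals of the intensities are Gamma (gammavariate takes a scale).
        lam1 = rng.gammavariate(a + cum[k], 1.0 / (b + k))
        lam2 = rng.gammavariate(a + cum[n] - cum[k], 1.0 / (b + n - k))
        # Full conditional of k is discrete on 1..n-1; work on the log scale.
        logp = [cum[j] * math.log(lam1) - j * lam1
                + (cum[n] - cum[j]) * math.log(lam2) - (n - j) * lam2
                for j in range(1, n)]
        m = max(logp)
        weights = [math.exp(lp - m) for lp in logp]
        k = rng.choices(range(1, n), weights=weights)[0]
        draws.append((k, lam1, lam2))
    return draws

# Made-up counts: the rate drops after the 20th observation.
y = [4, 5, 3, 4, 6, 2, 4, 5, 3, 4, 5, 3, 4, 6, 4, 3, 5, 4, 2, 5,
     1, 0, 1, 2, 0, 1, 1, 0, 2, 1, 0, 1, 1, 0, 1, 2, 0, 1, 0, 1]

draws = changepoint_gibbs(y, 2000, seed=1)[500:]  # drop 500 burn-in draws
ks = sorted(d[0] for d in draws)
k_median = ks[len(ks) // 2]
lam1_mean = sum(d[1] for d in draws) / len(draws)
lam2_mean = sum(d[2] for d in draws) / len(draws)
print(k_median, round(lam1_mean, 2), round(lam2_mean, 2))
```

Precomputing the cumulative sums keeps each sweep linear in the series length, which matters once the same loop is ported to C.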
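For the May 9 and May 12 lectures, the forward algorithm (prediction plus updating) and the stochastic backward algorithm can be sketched for a two-state Poisson HMM. The parameter values, the counts, and the function names (poisson_pmf, forward_filter, backward_sample) are assumptions for illustration only. In a full Gibbs sampler the intensities and transition matrix would also be redrawn at each sweep, which is omitted here.

```python
import math
import random

def poisson_pmf(y, lam):
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def forward_filter(y, pi0, P, lams):
    """Filtered probabilities p(h_t | y_1..y_t) for each t."""
    K = len(pi0)
    filt = []
    pred = pi0
    for obs in y:
        # Updating step: weight the prediction by the likelihood, renormalize.
        upd = [pred[s] * poisson_pmf(obs, lams[s]) for s in range(K)]
        total = sum(upd)
        upd = [u / total for u in upd]
        filt.append(upd)
        # Prediction step: push the filtered probabilities through P.
        pred = [sum(upd[r] * P[r][s] for r in range(K)) for s in range(K)]
    return filt

def backward_sample(filt, P, rng):
    """Draw the whole hidden state vector jointly from its posterior."""
    K, T = len(filt[0]), len(filt)
    h = [0] * T
    h[T - 1] = rng.choices(range(K), weights=filt[T - 1])[0]
    for t in range(T - 2, -1, -1):
        # p(h_t | h_{t+1}, y_1..y_t) is proportional to filt[t][r] * P[r][h_{t+1}].
        w = [filt[t][r] * P[r][h[t + 1]] for r in range(K)]
        h[t] = rng.choices(range(K), weights=w)[0]
    return h

# Hypothetical parameters and made-up counts.
pi0 = [0.5, 0.5]
P = [[0.9, 0.1], [0.2, 0.8]]
lams = [0.3, 3.0]
y = [0, 0, 1, 0, 0, 3, 4, 2, 5, 3, 0, 1, 0, 0]

filt = forward_filter(y, pi0, P, lams)
h = backward_sample(filt, P, random.Random(276))
print(h)
```

Replacing the random draws in backward_sample with a deterministic recursion for the smoothed probabilities gives the non-stochastic (smoothing) backward algorithm.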
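For the May 19 lecture, the basic Gelman-Rubin statistic for a scalar quantity can be sketched as below. This is the simple potential scale reduction factor without the degrees-of-freedom correction or chain splitting. The function name gelman_rubin and the simulated chains are assumptions for illustration.

```python
import math
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Potential scale reduction factor for m chains of equal length n."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    W = mean(variance(c) for c in chains)  # average within-chain variance
    B = n * variance(chain_means)          # between-chain variance
    var_plus = (n - 1) / n * W + B / n     # pooled estimate of the posterior variance
    return math.sqrt(var_plus / W)

rng = random.Random(276)

# Four made-up chains that all sample the same N(0, 1) target.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Four made-up chains stuck around different values.
stuck = [[rng.gauss(3.0 * j, 1.0) for _ in range(1000)] for j in range(4)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(round(rhat_mixed, 3), round(rhat_stuck, 3))
```

Values near 1 suggest the chains agree with each other. Values well above 1 indicate that the between-chain spread is large relative to the within-chain spread, so the chains have not yet mixed.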
Useful Web Sites
Course Handouts
Send me some email:
jbosco@ucla.edu.