On the Bayesian Analysis of Population Size

Ruth King and Stephen P. Brooks

University of Bristol and University of Cambridge, England

Summary

In this paper we consider the problem of estimating the total size of a population from a series of incomplete censuses. We observe that inference is typically highly sensitive to the choice of model, and we demonstrate how Bayesian model averaging techniques easily overcome this problem. We combine and extend the work of Madigan and York (1997) and Dellaportas and Forster (1999), using reversible jump MCMC simulation to calculate posterior model probabilities, which can then be used to estimate model-averaged statistics of interest. We provide a detailed description of the simulation procedures involved and consider a wide variety of modelling issues, such as the range of models considered, their parameterisation, prior choice and prior sensitivity, and computational efficiency. We consider a detailed example concerning adolescent injuries in Pennsylvania on the basis of medical, school and survey data. In the context of this example, we discuss the relationship between posterior model probabilities and the associated information criteria values for model selection. We also discuss cost-efficiency issues, with particular reference to the inclusion and exclusion of sources on the grounds of cost. We consider a decision-theoretic approach, which balances the cost and accuracy of different combinations of data sources to guide future decisions on data collection.
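
To illustrate the model-averaging idea, here is a minimal Python sketch; it is not the authors' implementation. It assumes that posterior model probabilities and per-model posterior means and variances of the population size are already available, for example as summaries of reversible jump MCMC output. The function name `model_average` and all numerical values are hypothetical and chosen purely for illustration.

```python
from typing import Sequence, Tuple


def model_average(
    probs: Sequence[float],
    means: Sequence[float],
    variances: Sequence[float],
) -> Tuple[float, float]:
    """Combine per-model posterior summaries into model-averaged ones.

    probs     -- posterior model probabilities (renormalised here)
    means     -- posterior mean of the population size under each model
    variances -- posterior variance of the population size under each model
    """
    if not (len(probs) == len(means) == len(variances)):
        raise ValueError("inputs must have the same length")
    total = sum(probs)
    if total <= 0:
        raise ValueError("model probabilities must sum to a positive value")
    weights = [p / total for p in probs]

    # Model-averaged mean: probability-weighted average of per-model means.
    avg_mean = sum(w * m for w, m in zip(weights, means))

    # Model-averaged variance (law of total variance): within-model
    # variance plus the spread of the per-model means around avg_mean.
    avg_var = sum(
        w * (v + (m - avg_mean) ** 2)
        for w, m, v in zip(weights, means, variances)
    )
    return avg_mean, avg_var


# Hypothetical values for three candidate models (illustration only).
example_mean, example_var = model_average(
    probs=[0.5, 0.3, 0.2],
    means=[1000.0, 1200.0, 900.0],
    variances=[100.0, 400.0, 225.0],
)
print(example_mean, example_var)
```

The between-model term in the variance is what makes the averaged estimate reflect model uncertainty: when the candidate models disagree about the population size, the averaged variance exceeds any weighted average of the within-model variances.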

Keywords

Contingency table; Unobserved data; Log-linear models; Markov chain Monte Carlo; Reversible jump MCMC; Posterior model probabilities; Decision theory; Cost-effectiveness.

Appeared as King, R. and Brooks, S. P. (2001). "On the Bayesian Analysis of Population Size". Biometrika 88, pp. 317-336.