Covariate Selection for Small Area Estimation in Repeated Sample Surveys
When small area estimation methods are applied to multiple editions of a repeated sample survey, the question arises of which covariates to use in the models. Applying standard model selection procedures independently to each edition may identify a different set of covariates for each edition. If the small area predictions are sensitive to these model differences, this is undesirable in official statistics, where monitoring change over time in statistical quantities is of utmost importance. Potential confounding of true change with methodological alterations should therefore be avoided. An approach to model selection is proposed that results in a single set of covariates for multiple survey editions. This is achieved by conducting covariate selection simultaneously for all editions, minimizing the average of the edition-specific conditional Akaike Information Criteria. Consecutive editions of the Dutch crime victimization survey are used as a case study. Municipal estimates of three survey variables are obtained using area level models. The proposed averaging strategy is compared with the standard method of considering each edition separately, and with an elementary approach that reuses the covariates selected in the first edition. The resulting models, point estimates and MSE estimates are analyzed, indicating no substantial adverse effects of the conceptually attractive averaging strategy.
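The averaging strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions made only for illustration: a frequentist area level model with known sampling variances `psi`, a between-area variance `sigma2_v` treated as known (so the cAIC penalty reduces to the trace of the hat matrix), and an exhaustive search over covariate subsets. The function names `caic_area_level` and `select_common_covariates` are hypothetical.

```python
from itertools import combinations

import numpy as np


def caic_area_level(y, X, psi, sigma2_v):
    """Conditional AIC of an area level model for a single survey edition.

    Assumption: sigma2_v (between-area variance) is treated as known, so the
    penalty is the trace of the hat matrix H mapping y to the predictions.
    """
    gamma = sigma2_v / (sigma2_v + psi)      # shrinkage factor per area
    w = 1.0 / (sigma2_v + psi)               # diagonal of V^{-1}
    XtW = X.T * w                            # X' V^{-1}
    P = X @ np.linalg.solve(XtW @ X, XtW)    # GLS projection: P y = X beta_hat
    H = np.diag(gamma) + (1.0 - gamma)[:, None] * P
    theta_hat = H @ y                        # small area predictions
    # Conditional log-likelihood of y given the predicted area means.
    cond_loglik = -0.5 * np.sum(
        np.log(2.0 * np.pi * psi) + (y - theta_hat) ** 2 / psi
    )
    return -2.0 * cond_loglik + 2.0 * np.trace(H)


def select_common_covariates(editions, n_candidates, sigma2_v):
    """Pick one covariate subset minimizing the average cAIC over editions.

    editions: list of (y, X, psi) tuples, one per edition; X holds all
    candidate covariates as columns, without an intercept column.
    """
    best_subset, best_score = None, np.inf
    for k in range(n_candidates + 1):
        for subset in combinations(range(n_candidates), k):
            cols = np.array(subset, dtype=int)
            scores = []
            for y, X, psi in editions:
                # Every candidate model includes an intercept.
                design = np.column_stack([np.ones(len(y)), X[:, cols]])
                scores.append(caic_area_level(y, design, psi, sigma2_v))
            avg = float(np.mean(scores))
            if avg < best_score:
                best_subset, best_score = subset, avg
    return best_subset, best_score
```

Calling `select_common_covariates(editions, n_candidates, sigma2_v)` returns a single tuple of column indices shared by all editions, together with its average cAIC. Running the same search on each edition separately, with a one-element `editions` list, corresponds to the standard per-edition approach the abstract compares against. With many candidate covariates the exhaustive loop becomes expensive, and a stepwise search over the same averaged criterion would be a natural substitute.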