Results found: 32


Search results

Search:
in the keywords:  small area estimation
EN
This article asserts that diversification to produce small area statistics for different fields in Cuban society should be a priority for the National Statistics Office. The key question in small area estimation is how to obtain reliable local statistics when the sample data contain too few observations for statistical inference of adequate precision. The social research presented here focuses on finding small area estimates that are more precise than the direct estimates of monthly mean income for people aged 15 and over at the municipal level. In this case, all 169 Cuban municipalities are considered small areas of interest. The empirical results obtained from this application are only intended to provide a first impression of the usefulness of applying small area estimation methods in Cuba. This study yields more precise estimates than the direct estimates for small areas/domains, even though in Cuba, as in any other developing country, the search for suitable auxiliary variables with which to “borrow strength” from neighbouring areas or domains may frequently be an important limitation.
EN
The paper presents the results of small area estimation using the empirical best linear unbiased predictor (EBLUP) for data from the Polish Household Budget Survey. The results were obtained using small area models of household expenditures for regions. Sampling errors were estimated by means of the balanced repeated replication (BRR) technique. The EBLUPs and their corresponding mean square errors (MSE) were estimated using the variance components technique. To calculate the MSE of the EBLUP, the maximum likelihood (ML) and restricted maximum likelihood (REML) methods were used. The computations were made using the SAE package designed for the R project.
EN
Żądło (2012) proposed a unit-level longitudinal model which is a special case of the General Linear Mixed Model. Two vectors of random components included in the model obey the assumptions of a simultaneous spatial autoregressive process (SAR) and a temporal first-order autoregressive process (AR(1)), respectively. Moreover, it is assumed that the population can change in time and that population elements can change their domain (subpopulation) affiliation in time. Under the proposed model, Żądło (2012) derived the Empirical Best Linear Unbiased Predictor (EBLUP) of the domain total. In addition, based on the theorem proved by Żądło (2009), an approximate equation for the mean squared error (MSE) was derived and an estimator of it based on the Taylor approximation was proposed. That MSE estimator was derived under several assumptions, including that the variance-covariance matrix can be decomposed into a linear combination of variance components. This assumption is not met under the proposed model. In the paper a jackknife MSE estimator for the derived EBLUP is proposed, based on the results presented by Jiang, Lahiri, Wan (2002). The bias of the jackknife MSE estimator is compared in a simulation study with the bias of the MSE estimator based on the Taylor approximation.
EN
The paper presents an application of spatial microsimulation methods for generating a synthetic population to estimate personal income in Poland in 2011 using census tables and the EU-SILC 2011 microdata set. The first section presents the research problem and a brief overview of modern estimation methods applied to small domains, with particular emphasis on spatial microsimulation. The second section contains an overview of selected synthetic population generation methods. The last section presents personal income estimation at the NUTS 3 level, with special emphasis on the quality of the estimates.
EN
In 2011, Germany conducted the first census after the reunification. In contrast to a classical census, a register-assisted census was implemented using population register data and an additional sample. This paper provides an overview of how the sampling design recommendations were set up in order to fulfil legal requirements and to guarantee an optimal but still flexible source of information. The aim was to develop a design that fosters an accurate estimation of the main objective of the census, the total population counts. Further, the design should also adequately support the application of small area estimation methods. Some empirical results are given to provide an assessment of selected methods. The research was conducted within the German Census Sampling and Estimation research project, financially supported by the German Federal Statistical Office.
EN
There is a growing demand for multivariate economic statistics for cross-classified domains. In business statistics, this demand poses a particular challenge given the specific character of the population of enterprises, which necessitates a search for robust estimation methods that can make use of auxiliary variables. The adoption of new solutions in this area is expected to increase the scope of statistical output and improve the precision of estimates. The study presented in the paper furthers this goal, as it focuses on testing the application of a robust version of the Fay-Herriot model, which makes it possible to meet the assumption of normality of random effects in the presence of outliers. These alternative models are applied to estimate the parameters of small firms operating in 2012. Variables from administrative registers were used as auxiliary variables, which made the estimation process more comprehensive. The paper refers to small area estimation methods. The variables of interest are estimated at a low level of aggregation, represented by the cross-section of province and NACE section.
EN
Skewed distributions with representative outliers pose a problem in many surveys. Various small area prediction approaches for skewed data based on transformation models have been proposed. However, in certain applications of those predictors, the fact that the survey data also contain a non-negligible number of zero-valued observations is sometimes dealt with rather crudely, for instance by arbitrarily adding a constant to each value (to allow zeroes to be considered as “positive observations, only smaller”, instead of acknowledging their qualitatively different nature). On the other hand, while a lognormal-logistic model has been proposed (to incorporate skewed distributions as well as zeroes), that model does not include any hierarchical aspects, and is therefore not explicitly adapted to small area prediction. In this paper, we consolidate the two approaches by extending one of the already established log-transformation mixed small area prediction models to incorporate a logistic component. This allows for the simultaneous, systematic treatment of domain effects, outliers and zero-valued observations in a single framework. We benchmark the resulting model-based predictors (against relevant alternatives) in applications to simulated data as well as empirical data from the Australian Agricultural and Grazing Industries Survey.

SAE Teaching Using Simulations

EN
The increasing interest in applying small area estimation methods increases the need for training in small area estimation. To better understand the behaviour of small area estimators in practice, simulations are a feasible way of evaluating and teaching the properties of the estimators of interest. By designing such simulation studies, students gain a deeper understanding of small area estimation methods. Thus, we encourage the use of appropriate simulations as an additional interactive tool in teaching small area estimation methods.
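A classroom-style simulation of the kind described above might look like the following minimal sketch. It is not taken from the paper: the area-level model, the parameter values, and the function name `simulate` are illustrative assumptions, and the variances are treated as known so that the shrinkage weight does not have to be estimated.

```python
import random

def simulate(m=30, reps=200, mu=10.0, sigma_u2=1.0, psi=4.0, seed=1):
    """Average squared error of a direct estimator and a shrinkage predictor.

    Assumed toy model (hypothetical, not from the paper):
      true area mean   theta_i = mu + u_i,    u_i ~ N(0, sigma_u2)
      direct estimate  y_i = theta_i + e_i,   e_i ~ N(0, psi)
    """
    rng = random.Random(seed)
    # Shrinkage weight; both variances are treated as known in this sketch.
    gamma = sigma_u2 / (sigma_u2 + psi)
    se_direct = 0.0
    se_shrunk = 0.0
    for _ in range(reps):
        theta = [mu + rng.gauss(0.0, sigma_u2 ** 0.5) for _ in range(m)]
        y = [t + rng.gauss(0.0, psi ** 0.5) for t in theta]
        y_bar = sum(y) / m  # synthetic part: overall mean of direct estimates
        for t, yi in zip(theta, y):
            shrunk = gamma * yi + (1.0 - gamma) * y_bar
            se_direct += (yi - t) ** 2
            se_shrunk += (shrunk - t) ** 2
    n = m * reps
    return se_direct / n, se_shrunk / n

mse_direct, mse_shrunk = simulate()
print(f"direct MSE: {mse_direct:.3f}, shrinkage MSE: {mse_shrunk:.3f}")
```

Students can vary `sigma_u2` and `psi` to see how the gain from borrowing strength shrinks as the between-area variance grows relative to the sampling variance.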
EN
Direct estimators used in sample surveys usually provide parameter estimates for the country and its regions. They do not provide estimates for smaller cross-sections (age, gender, etc.) or smaller geographical areas (subregions, counties, towns and communes). One way to obtain such estimates is the Bayes approach, which is based on known information from outside the sample. Two Bayes estimators, empirical and hierarchical, were considered in order to obtain precise estimates for counties in agricultural sample surveys carried out by the Central Statistical Office in Poland. The additional source of information was the Census of Agriculture, whose data are correlated with data from the agricultural sample surveys.
PL (English translation)
In sample surveys conducted by official statistics in Poland and other countries, direct estimators based solely on sample results are used. They provide parameter estimates for the basic cross-sections of the country as a whole and for larger areas such as voivodships. However, they do not provide estimates for smaller cross-sections, such as age or sex, or for smaller areas, such as subregions, counties, towns and communes. One way to obtain such estimates is the Bayesian approach, based on known information from outside the sample. The article considers two Bayesian estimators, empirical and hierarchical, to obtain precise parameter estimates for counties in the agricultural sample surveys conducted by the Central Statistical Office (GUS) in Poland. The full agricultural census is the source of auxiliary information. Applying these estimators yields highly precise parameter estimates for counties when there is a substantial correlation between the results of the full agricultural census and those of the agricultural sample surveys conducted after that census.
EN
The paper considers the precision of estimating a small area mean with six synthetic estimators. Three of the estimators use auxiliary data on the whole population, and the other three use data on a group of similar small areas. The group of similar small areas is determined on the basis of the relative frequency distribution of the ratio of the study variable to the auxiliary variable, under a stratified sampling design without replacement. The results are compared with those of analogous experiments in which the group of similar small areas is determined using the notion of series.
PL (English translation)
The paper considers the accuracy of estimating a small area mean using six synthetic estimators. Three of them are computed from auxiliary data on the whole population, and the remaining ones from data on a group of small areas similar to the area considered. The group of similar small areas was determined using a structure similarity index corresponding to the ratio of the study variable to the auxiliary variable in the small area. A Monte Carlo analysis was then carried out to compare the accuracy of the small area mean estimates for the considered estimators under stratified sampling without replacement. The results were also compared with those of an analogous study in which the group of similar small areas was determined using the notion of series.
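To illustrate the contrast between borrowing strength from the whole population and from a group of similar areas, here is a minimal sketch of a ratio-synthetic estimator. It is only one textbook form and is not claimed to be any of the six estimators studied in the paper; the function name, the unweighted (equal-probability) sample, and the numbers are assumptions made for illustration.

```python
def ratio_synthetic_mean(sample_y, sample_x, area_x_mean):
    """Ratio-synthetic estimate of a small area mean.

    sample_y, sample_x: study and auxiliary values observed in the sample
        used to borrow strength (whole population or a group of similar areas).
    area_x_mean: known population mean of the auxiliary variable in the area.
    Assumes an equal-probability sample, so no design weights are applied.
    """
    if not sample_y or len(sample_y) != len(sample_x):
        raise ValueError("sample_y and sample_x must be non-empty and equal in length")
    total_x = sum(sample_x)
    if total_x == 0:
        raise ValueError("auxiliary values must not sum to zero")
    ratio = sum(sample_y) / total_x  # y-to-x ratio in the borrowing sample
    return ratio * area_x_mean

# Hypothetical data: the same small area, two choices of borrowing sample.
whole_population = ratio_synthetic_mean([20, 30, 50], [10, 15, 25], 12.0)
similar_areas = ratio_synthetic_mean([25, 35], [10, 14], 12.0)
print(whole_population, similar_areas)  # prints 24.0 30.0
```

The two results differ only because the y-to-x ratio differs between the two borrowing samples, which is why the choice of the group of similar areas matters for the accuracy of a synthetic estimator.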
EN
In the paper we analyze the accuracy of the empirical best linear unbiased predictor (EBLUP) of the domain total (see Royall, 1976), assuming a special case of the general linear mixed model. To estimate the mean square error (MSE) of the EBLUP we use the results obtained by Datta and Lahiri (2000) for the predictor proposed by Henderson (1950) and adapt them for the predictor proposed by Royall (1976). The simulation study uses real data on Polish farms from the Dąbrowa Tarnowska region.
PL (English translation)
In the paper we analyse the accuracy of empirical best linear unbiased predictors (EBLUP) of the domain total, assuming a superpopulation model belonging to the class of general linear mixed models. To assess the mean square error (MSE) of the EBLU-type predictor, the results presented by Datta and Lahiri (2000) for the predictor proposed by Henderson (1950) were used after adapting them to the predictor proposed by Royall (1976). The simulation study used real data on farms in the Dąbrowa Tarnowska county obtained in the 1996 agricultural census.
EN
The problem of modeling longitudinal profiles is considered assuming that the population and the affiliation of elements to subpopulations may change in time. The considerations are based on a model with auxiliary variables for longitudinal data with element-specific and subpopulation-specific random components (compare Verbeke, Molenberghs, 2000; Hedeker, Gibbons, 2006), which is a special case of the General Linear Model (GLM) and the General Linear Mixed Model (GLMM). The paper presents the pseudo-empirical best linear unbiased predictor (Pseudo-EBLUP), based on the model-assisted approach, along with its mean squared error (MSE) and estimators of the MSE. In the simulation study its accuracy is compared with that of some calibration estimators, which are also based on the model-assisted approach.
PL (English translation)
The paper analyses the problem of predicting the domain fraction and the domain mean using superpopulation models without auxiliary variables that take into account the division of the population into strata. The simulation study considers the influence of superpopulation model misspecification and of estimating the population size on prediction accuracy.
EN
The problem of predicting the subpopulation (domain) total is studied as in Rao (2003). The problem is inspired by results obtained by Żądło (2012), who considered two predictors: the empirical best linear unbiased predictor (EBLUP) under a correct model and a simpler, misspecified predictor. In a simulation study he showed that the misspecified predictor may in some cases be more accurate than the EBLUP derived under the correct model, because estimating the unknown parameters of the correct model reduces the accuracy of the EBLUP. However, a problem arose in MSE estimation: under the correct model, the bias of the MSE estimator derived under the misspecified model was very large. Hence, in the paper we consider a predictor based on a misspecified model, derive an MSE estimator for it under the correct model, and propose the use of two other MSE estimators.
PL (English translation)
The problem of predicting the total in a subpopulation (domain) is considered as in Rao (2003). The use of a predictor that is the empirical best linear unbiased predictor, but under a misspecified model, is analysed. For this predictor, the form of a naive MSE estimator under the correct superpopulation model is derived, and the use of jackknife and parametric bootstrap MSE estimators is proposed. The relative biases of the proposed MSE estimators are analysed in a simulation study.
EN
This article considers a robust hierarchical Bayesian approach to deal with random effects of small area means when some of these effects assume extreme values, resulting in outliers. In the presence of outliers, the standard Fay-Herriot model, used for modeling area-level data, under normality assumptions of random effects may overestimate the random effects variance, thus providing less than ideal shrinkage towards the synthetic regression predictions and inhibiting the borrowing of information. Even a small number of substantive outliers of random effects results in a large estimate of the random effects variance in the Fay-Herriot model, thereby achieving little shrinkage to the synthetic part of the model or little reduction in the posterior variance associated with the regular Bayes estimator for any of the small areas. While the scale mixture of normal distributions with a known mixing distribution for the random effects has been found to be effective in the presence of outliers, the solution depends on the mixing distribution. As a possible alternative solution to the problem, a two-component normal mixture model has been proposed, based on non-informative priors on the model variance parameters, regression coefficients and the mixing probability. Data analysis and simulation studies based on real, simulated and synthetic data show an advantage of the proposed method over the standard Bayesian Fay-Herriot solution derived under normality of random effects.
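The link between the random effects variance and the amount of shrinkage can be made explicit with the standard area-level formulas. The notation below is an assumption introduced for illustration and does not appear in the abstract: $y_i$ is the direct estimate for area $i$, $x_i$ the auxiliary vector, $\beta$ the regression coefficients, $\sigma_u^2$ the random effects variance, and $\psi_i$ the known sampling variance. With $\beta$ and $\sigma_u^2$ treated as known:

```latex
\begin{align*}
  y_i &= x_i^{\top}\beta + u_i + e_i,
      \qquad u_i \sim N(0, \sigma_u^2), \quad e_i \sim N(0, \psi_i), \\
  \hat{\theta}_i &= \gamma_i\, y_i + (1 - \gamma_i)\, x_i^{\top}\beta,
      \qquad \gamma_i = \frac{\sigma_u^2}{\sigma_u^2 + \psi_i}, \\
  \operatorname{Var}(\theta_i \mid y_i) &= \gamma_i\, \psi_i .
\end{align*}
```

If outliers inflate the estimate of $\sigma_u^2$, every $\gamma_i$ moves towards 1, so $\hat{\theta}_i$ stays close to the direct estimate $y_i$ and the posterior variance $\gamma_i \psi_i$ approaches the sampling variance $\psi_i$. This is the loss of shrinkage and of variance reduction that the abstract describes.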
EN
The author presents a synthetic overview of recent efforts to apply small area estimation methods to the Polish Labor Force Survey (PLFS). The review concerns the methodology and results obtained by the Central Statistical Office in connection with the PLFS and the National Census, as well as some results obtained by the author of this paper. The author discusses various estimation methods together with an evaluation of the quality of the resulting estimates. In particular, the paper presents how the quality of Bayes-type estimates depends on the quality of the a priori estimates and on the type of estimation method applied.
PL (English translation)
The paper presents a synthetic overview of recent research on the application of small area statistics methods using results from the Labour Force Survey (BAEL). The review covers methodological issues and results obtained by the Central Statistical Office (GUS) related to BAEL and the 2002 National Census, as well as results obtained by the author of this paper. Various estimation methods are discussed, together with assessments of their quality. In particular, the paper presents how the quality of data estimated with Bayesian methods depends on the quality of the a priori estimates and on the type of estimation method applied.
EN
In economic studies researchers are often interested in the estimation of the distribution function or of certain functions of the distribution function, such as quantiles. This work focuses on the estimation of quantiles as inverses of estimates of the distribution function in the presence of auxiliary information that is correlated with the study variable. The paper proposes a plug-in estimator of the distribution function, which is used to obtain quantiles in the population and in the small areas. The performance of the proposed method is compared with that of other estimators of the distribution function and quantiles in a simulation study. The results show that the proposed method usually has smaller relative biases and relative RMSE than other methods of obtaining quantiles based on inverting the distribution function.
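The idea of obtaining a quantile by inverting an estimated distribution function can be sketched as follows. This sketch substitutes a plain weighted empirical distribution function for the paper's plug-in estimator, which additionally uses auxiliary information; the function names, weights, and data are hypothetical.

```python
def edf(values, weights, t):
    """Weighted empirical distribution function evaluated at t."""
    total = sum(weights)
    return sum(w for v, w in zip(values, weights) if v <= t) / total

def quantile_by_inversion(values, weights, p):
    """Smallest observed value t whose estimated distribution function is >= p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    total = sum(weights)
    cumulative = 0.0
    for v, w in sorted(zip(values, weights)):
        cumulative += w
        if cumulative / total >= p:
            return v
    return max(values)  # guards against rounding when p is close to 1

# Hypothetical sample: equal weights, then one heavily weighted unit.
y = [3, 1, 4, 1, 5]
print(quantile_by_inversion(y, [1, 1, 1, 1, 1], 0.5))  # prints 3
print(quantile_by_inversion(y, [1, 1, 1, 1, 6], 0.5))  # prints 5
```

Any other estimator of the distribution function can be inverted in the same way, provided it is non-decreasing in `t`; restricting `values` and `weights` to one domain gives the small area version.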
EN
The problem of small area prediction is considered under a Linear Mixed Model. The article proposes an empirical best linear unbiased predictor under a model with two correlated random effects. The main aim of the simulation analyses is to study how correlation between the random effects influences the properties of the predictor. The article analyzes the gain in accuracy due to the correlation between random effects and the influence of model misspecification when the random effects are uncorrelated. The problem of estimating the Mean Squared Error of the proposed predictor is also considered. The Monte Carlo simulation analyses and the application were prepared in the R language.
PL (English translation)
The article addresses the problem of prediction under a certain model belonging to the class of linear mixed models. It presents a proposal of an empirical best linear unbiased predictor for a linear mixed model with two correlated random effects. The main aim is a simulation study of the influence of dependence between the random effects on the properties of the predictor considered. The article also addresses the estimation of the mean squared error of the proposed predictor. The simulation study and the example were prepared using R.
EN
In this paper, we first develop a triple-goal small area estimation methodology for simultaneous estimation of unemployment rates for U.S. states using the Current Population Survey (CPS) data and a two-level random sampling variance normal model. The main goal of this paper is to illustrate the utility of the triple-goal methodology in generating a single series of unemployment rate estimates for three separate purposes: developing estimates for individual small area means, producing empirical distribution function (EDF) of true small area means, and the ranking of the small areas by true small area means. We achieve our goal using a Monte Carlo simulation experiment and a real data analysis.
EN
The EU Statistics on Income and Living Conditions (EU-SILC) has provided annual estimates of a number of labour market indicators for EU countries since 2003, with an almost exclusive focus on national rates. However, it is impossible to obtain reliable direct estimates of labour market statistics at low levels of aggregation based on the EU-SILC survey. In such cases, model-based small area estimation can be used. In this paper, the low work intensity indicator for the spatial domains in Poland between 2005 and 2012 was estimated. The Rao and You (1994), Fay and Diallo (2012), and Marhuenda, Molina and Morales (2013) models were applied. A bootstrap MSE estimator for the discussed methods was proposed. The results indicate that these models provide more reliable estimates than direct estimation.