Regression analysis for interval-valued symbolic data versus noisy variables and outliers

Pełka, Marcin; Dudek, Andrzej

Article details

Journal

Econometrics. Ekonometria. Advances in Applied Data Analytics

2016 | 2 (52) | 35-42

Title variants

PL

Regresja liniowa danych symbolicznych a zmienne zakłócające i obserwacje odstające

Languages of publication

EN

Abstracts

EN

Regression analysis is perhaps the best known and most widely used method for the analysis of dependence, that is, for examining the relationship between a set of independent variables (X's) and a single dependent variable (Y). In general, the regression model is a linear combination of independent variables that corresponds as closely as possible to the dependent variable [Lattin, Carroll, Green 2003, p. 38]. The aim of the article is to present two adaptations of regression analysis to symbolic interval-valued data (the centre method and the centre and range method) and to compare their usefulness when dealing with noisy variables and/or outliers. The empirical part of the paper presents the results of simulation studies based on artificial and real data, both without noisy variables and outliers and with them. The results are compared according to the values of two coefficients of determination, $R^2_L$ and $R^2_U$. The results show that the centre and range method usually obtains better results, even when the data set contains noisy variables and outliers, but in some cases the centre method outperforms the centre and range method.
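The two methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the usual formulations, in which the centre method fits ordinary least squares to interval midpoints and applies the fitted coefficients to the lower and upper bounds of the predictors, while the centre and range method fits a second, independent regression on half-ranges. The function names, the use of half-ranges rather than full ranges, and the definition of the two coefficients of determination as 1 - SSE/SST computed separately for lower and upper bounds are all assumptions made for illustration.

```python
import numpy as np

def _ols(X, y):
    """Least-squares coefficients for y ~ 1 + X (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def _predict(beta, X):
    return beta[0] + X @ beta[1:]

def centre_method(X_lo, X_hi, y_lo, y_hi):
    """Fit on midpoints, then apply the same coefficients to each bound.

    Note: with negative coefficients the predicted lower bound can exceed
    the predicted upper bound; this sketch does not reorder them.
    """
    beta = _ols((X_lo + X_hi) / 2, (y_lo + y_hi) / 2)
    return _predict(beta, X_lo), _predict(beta, X_hi)

def centre_range_method(X_lo, X_hi, y_lo, y_hi):
    """Fit one model on midpoints and a separate one on half-ranges."""
    Xc, Xr = (X_lo + X_hi) / 2, (X_hi - X_lo) / 2
    yc, yr = (y_lo + y_hi) / 2, (y_hi - y_lo) / 2
    c_hat = _predict(_ols(Xc, yc), Xc)
    r_hat = _predict(_ols(Xr, yr), Xr)
    return c_hat - r_hat, c_hat + r_hat

def r_squared(y, y_hat):
    """Assumed definition: 1 - SSE/SST for one interval bound."""
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - sse / sst

# Synthetic interval data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
n = 50
Xc = rng.normal(10, 2, size=(n, 1))           # midpoints of the predictor
Xr = rng.uniform(0.5, 2.0, size=(n, 1))       # half-ranges of the predictor
yc = 1 + 2 * Xc[:, 0] + rng.normal(0, 0.5, n)
yr = np.abs(0.5 + 0.3 * Xr[:, 0] + rng.normal(0, 0.05, n))
X_lo, X_hi = Xc - Xr, Xc + Xr
y_lo, y_hi = yc - yr, yc + yr

for name, method in [("centre", centre_method),
                     ("centre and range", centre_range_method)]:
    lo_hat, hi_hat = method(X_lo, X_hi, y_lo, y_hi)
    print(name, r_squared(y_lo, lo_hat), r_squared(y_hi, hi_hat))
```

Fitting the ranges separately lets the predicted interval width follow the observed widths instead of inheriting the slope estimated for the midpoints, which is one plausible reason for the centre and range method's generally better fit reported in the abstract.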

Keywords

EN

regression analysis, interval-valued symbolic data, noisy variables, outliers

Publisher

Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu

References

Billard L., Diday E., 2006, Symbolic Data Analysis. Conceptual Statistics and Data Mining, John Wiley & Sons, Chichester.
Bock H.-H., Diday E. (eds.), 2000, Analysis of Symbolic Data. Explanatory Methods for Extracting Statistical Information from Complex Data, Springer Verlag, Berlin-Heidelberg.
Diday E., Noirhomme-Fraiture M. (eds.), 2008, Symbolic Data Analysis and the SODAS Software, John Wiley & Sons, Chichester.
Dudek A., 2013, Metody analizy danych symbolicznych w badaniach ekonomicznych, Wyd. UE we Wrocławiu, Wrocław.
Hair J.F., Black W.C., Babin B.J., Anderson R.E., Tatham R.L., 2006, Multivariate Data Analysis, Prentice Hall, New Jersey.
Lattin J., Carroll J.D., Green P.E., 2003, Analyzing Multivariate Data, Thomson Learning, Toronto.
Lima-Neto E.A., de Carvalho F.A.T., 2008, Centre and range method for fitting a linear regression model to symbolic interval data, Computational Statistics and Data Analysis, vol. 52, pp. 1500–1515.
Lima-Neto E.A., de Carvalho F.A.T., 2010, Constrained linear regression models for symbolic interval-valued variables, Computational Statistics and Data Analysis, vol. 54, pp. 333–347.
Milligan G.W., Cooper M.C., 1985, An examination of procedures for determining the number of clusters in a data set, Psychometrika, vol. 50, no. 2, pp. 159–179.
Qiu W., Joe H., 2006, Generation of random clusters with specified degree of separation, Journal of Classification, vol. 23, pp. 315–334.
Walesiak M., Dudek A., 2014, The clusterSim package [URL:] www.r-project.org.
Walesiak M., Gatnar E. (eds.), 2004, Metody statystycznej analizy wielowymiarowej w badaniach marketingowych, Wyd. Akademii Ekonomicznej im. Oskara Langego we Wrocławiu, Wrocław.
Welfe A., 2013, Ekonometria, PWN, Warszawa.
