2014 | 191 | 75-88
Article title

O jakości danych w kontekście obserwacji oddalonych w wielowymiarowej analizie regresji

Title variants
On Selected Data Quality Issues in Multivariate Regression Analysis
Abstract
The paper presents different definitions of outliers. We also collate selected outlier detection techniques that represent very different approaches to outlier identification: the classical univariate method embodied in boxplots, Andrews' curves, methods based on Cook's distance and Mahalanobis distance, the local outlier factor method, and support vector machines. Moreover, we empirically examine the agreement between the results of these outlier detection methods on a real-world benchmark dataset.
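As a rough illustration of two of the techniques named in the abstract, the sketch below flags univariate outliers with the boxplot (Tukey fence) rule and computes squared Mahalanobis distances for two-dimensional data. It is not the paper's code: the function names, the multiplier k = 1.5, the quartile convention, and the restriction to two dimensions are all assumptions made to keep the example small.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return values outside the boxplot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartile convention ("inclusive") is an assumption; others exist.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^{-1} [dx dy]^T using the closed-form 2x2 inverse
        distances.append(
            (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        )
    return distances
```

Observations with a large squared distance relative to the rest are candidates for closer inspection; the classical mean and covariance used here are themselves sensitive to outliers, which is one reason the robust alternatives in the literature exist.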
References
  • Andrews D.F., Plots of High-Dimensional Data, "Biometrics" 1972, Vol. 28, No. 1, pp. 125-136.
  • Barnett V., Lewis T., Outliers in Statistical Data, 3rd Edition, John Wiley & Sons, New York 1998.
  • Ben-Hur A., Horn D., Siegelmann H.T., Vapnik V., Support Vector Clustering, "Journal of Machine Learning Research" 2001, Vol. 2, pp. 125-137.
  • Breunig M.M., Kriegel H.-P., Ng R.T., Sander J., LOF: Identifying Density-Based Local Outliers, Proceedings of the 29th ACM SIGMOD International Conference on Management of Data (SIGMOD 2000), Dallas 2000, pp. 93-104.
  • Cook R.D., Detection of Influential Observations in Linear Regression, "Technometrics" 1977, Vol. 19, No. 1, pp. 15-18.
  • Duda R.O., Hart P.E., Stork D.G., Pattern Classification, John Wiley & Sons, New York 2001.
  • Filzmoser P., Maronna R.A., Werner M., Outlier Identification in High Dimensions, "Computational Statistics & Data Analysis" 2008, Vol. 52, pp. 1694-1711.
  • Giudici P., Applied Data Mining: Statistical Methods for Business and Industry, John Wiley & Sons, New York 2003.
  • Hawkins D., Identification of Outliers, Chapman and Hall, London 1980.
  • Healy M.J.R., Multivariate Normal Plotting, "Applied Statistics" 1968, Vol. 17, pp. 157-161.
  • Huber P.J., Ronchetti E.M., Robust Statistics, 2nd Edition, John Wiley & Sons, Hoboken, NJ 2009.
  • Maddala G.S., Ekonometria, Wydawnictwo Naukowe PWN, Warszawa 2006.
  • Maronna R.A., Martin R.D., Yohai V.J., Robust Statistics: Theory and Methods, John Wiley & Sons, Chichester 2006.
  • Rousseeuw P.J., Least Median of Squares Regression, "Journal of the American Statistical Association" 1984, Vol. 79, pp. 871-880.
  • Rousseeuw P.J., Leroy A.M., Robust Regression and Outlier Detection, John Wiley & Sons, New York 2003.
  • Trzęsiok M., Identyfikacja obserwacji oddalonych z wykorzystaniem metody wektorów nośnych, [in:] Taksonomia 14. Klasyfikacja i analiza danych - teoria i zastosowania, eds. K. Jajuga, M. Walesiak, Wydawnictwo Naukowe Akademii Ekonomicznej, Wrocław 2007, pp. 350-357.
  • Tukey J.W., Exploratory Data Analysis, Addison-Wesley, Boston 1977.
  • Webb A.R., Statistical Pattern Recognition, 2nd Edition, John Wiley & Sons, New York 2002.