2015 | 247 | 69-83

Article title

Identification of multivariate outliers – problems and challenges of visualization methods

Content

Title variants

PL
Identyfikacja wielowymiarowych obserwacji odstających – problemy i wyzwania metod wizualizacyjnych

Languages of publication

EN

Abstracts

EN
The identification of outliers is often thought of as a means of eliminating observations from a data set to avoid disturbing further analyses. But outliers may just as well be interesting observations in themselves, because they can give us hints about certain structures in the data or about special events during the sampling period. Therefore, appropriate methods for the detection of outliers are needed. The literature abounds with procedures for detecting and testing single outliers in sample data. The difficulty of detection increases with the number of outliers and the dimension of the data, because the outliers can be extreme in a growing number of directions. This study provides an overview of multivariate outlier detection methods, motivated by their growing importance in a wide variety of practical situations. We focus on methods that can be presented visually.
PL (English translation)
The process of identifying outliers is often regarded as a prelude to eliminating atypical observations from data sets in order to avoid any problems in further data analysis. Yet atypical observations frequently provide important information about the structure of the data or about exceptional events during the period under study. Appropriate methods for identifying such observations are therefore needed. The literature is rich in methods for detecting atypical observations in the univariate case. In multivariate space this process becomes considerably more complicated. In this article we present selected visualization methods for detecting multivariate atypical observations.

Year

2015

Volume

247

Pages

69-83

References

  • Acuna E., Rodriguez C.A. (2004), Meta Analysis Study of Outlier Detection Methods in Classification, Technical paper, University of Puerto Rico at Mayaguez, Proceedings IPSI 2004, Venice.
  • Aguinis H., Gottfredson R.K., Joo H. (2013), Best-Practice Recommendations for Defining, Identifying, and Handling Outliers, “Organizational Research Methods”, p. 270-301.
  • Barnett V., Lewis T. (1994), Outliers in Statistical Data (2nd Edition), John Wiley and Sons.
  • Becker C., Gather U. (1999), The Masking Breakdown Point of Multivariate Outlier Identification Rules, “Journal of the American Statistical Association”, 94, p. 947-955.
  • Ben-Gal I. (2005), Outlier Detection [in:] O. Maimon, L. Rokach (eds.), Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers, Kluwer Academic Publishers.
  • Breunig M.M., Kriegel H.P., Ng R.T., Sander J. (2000), Identifying Density-based Local Outliers, Proceedings ACM SIGMOD 2000, p. 93-104.
  • Booth D.E., Alam P., Ahkam S.N., Osyk B. (1989), A Robust Multivariate Procedure for the Identification of Problem Savings and Loan Institutions, “Decision Sciences”, 20, p. 320-333.
  • Butler R.W., Davies P.L., Jhun M. (1993), Asymptotics for the Minimum Covariance Determinant Estimator, “The Annals of Statistics”, 21, p. 1385-1400.
  • Caussinus H., Ruiz A. (1990), Interesting Projections of Multidimensional Data by Means of Generalized Component Analysis, COMPSTAT90, Physica-Verlag, Heidelberg, p. 121-126.
  • Croux C., Ruiz-Gazen A. (2005), High Breakdown Estimators for Principal Components: The Projection-pursuit Approach Revisited, “Journal of Multivariate Analysis”, 95(1), p. 206-226.
  • Fawcett T., Provost F. (1997), Adaptive Fraud Detection, “Data Mining and Knowledge Discovery”, 1(3), p. 291-316.
  • Filzmoser P., Maronna R., Werner M. (2008), Outlier Identification in High Dimensions, “Computational Statistics and Data Analysis”, 52, p. 1694-1711.
  • Hadi A.S. (1992), Identifying Multiple Outliers in Multivariate Data, “Journal of the Royal Statistical Society”, Series B, 54, p. 761-771.
  • Hawkins D.M. (1980), Identification of Outliers, Chapman and Hall, London.
  • Human Mortality Database (2015), University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany), viewed 15/09/07, available online at: www.mortality.org.
  • Hyndman R.J., Shang H.L. (2008), Rainbow Plots, Bagplots, and Boxplots for Functional Data, “Journal of Computational and Graphical Statistics”, 19(1), p. 29-45.
  • Iglewicz B., Martinez J. (1982), Outlier Detection Using Robust Measures of Scale, “Journal of Statistical Computation and Simulation”, 15, p. 285-293.
  • Peña D., Prieto F.J. (2001), Multivariate Outlier Detection and Robust Covariance Matrix Estimation, “Technometrics”, 43, p. 286-300.
  • Penny K.I., Jolliffe I.T. (2001), A Comparison of Multivariate Outlier Detection Methods for Clinical Laboratory Safety Data, “The Statistician”, 50(3), p. 295-308.
  • Rocke D.M., Woodruff D.L. (1996), Identification of Outliers in Multivariate Data, “Journal of the American Statistical Association”, 91, p. 1047-1061.
  • Rousseeuw P. (1985), Multivariate Estimation with High Breakdown Point [in:] W. Grossmann et al. (eds.), “Mathematical Statistics and Applications”, Vol. B, p. 283-297.
  • Rousseeuw P.J., Driessen K.A. van (1999), A Fast Algorithm for the Minimum Covariance Determinant Estimator, “Technometrics”, 41(3), p. 212-223.
  • Rousseeuw P., Leroy A. (1987), Robust Regression and Outlier Detection, Wiley Series in Probability and Statistics.
  • Rousseeuw P., Ruts I., Tukey J. (1999), The Bagplot: A Bivariate Boxplot, “The American Statistician”, 53(4), p. 382-387.
  • Rousseeuw P.J., Zomeren B.C. van (1990), Unmasking Multivariate Outliers and Leverage Points, “Journal of the American Statistical Association”, 85(411), p. 633-651.
  • Schwager S.J., Margolin B.H. (1982), Detection of Multivariate Normal Outliers, “Annals of Statistics”, 10, p. 943-95.

Identifiers

ISSN
2083-8611

YADDA identifier

bwmeta1.element.cejsh-82bc3d8a-6e5c-4eff-8e31-e21cb6beee0d