Results found: 2

Search results

Search:
in the keywords: mean shift method

A Proposal of New Classification Algorithm

100%

Korzeniewski J., University of Łódź D. o. S. M.

Acta Universitatis Lodziensis. Folia Oeconomica

2009

vol. 225

The paper presents a new method of classifying points into a predetermined number of classes. The method uses the sample (window) mean shift technique to obtain a synthetic insight into the structure of the data set. Its performance is tested on Euclidean-space data sets generated by Milligan's CLUSTGEN program, by comparison with the grouping obtained with the k-means method. Rousseeuw's silhouette indices are used as the criterion for comparison.

W artykule przedstawiona jest nowa metoda klasyfikowania punktów zbioru danych do klas, których liczba jest zadana. Metoda oparta jest na wykorzystaniu techniki średniego przesunięcia okna/próby do uzyskania syntetycznego wglądu w strukturę zbioru danych. Działanie metody jest sprawdzone na zbiorach danych z przestrzeni euklidesowych wygenerowanych przy pomocy programu CLUSTGEN poprzez porównanie wyników z grupowaniem uzyskanym metodą k-średnich. Kryterium porównawczym są indeksy sylwetkowe Rousseeuwa.
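The comparison criterion named in the abstract, Rousseeuw's silhouette index, can be illustrated with a minimal sketch. This is not code from the paper: the function names `silhouette_values` and `mean_silhouette`, the toy data, the Euclidean distance, and the convention of assigning 0 to singleton clusters are all assumptions made for illustration. The sketch needs at least two clusters.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette value s(i) = (b - a) / max(a, b) for every point.

    a is the mean distance to the other points of the same cluster,
    b is the smallest mean distance to the points of any other cluster.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # assumed convention for singleton clusters
            continue
        # The distance from p to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        values.append(0.0 if denom == 0 else (b - a) / denom)
    return values


def mean_silhouette(points, labels):
    """Average silhouette value over the whole data set."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)


# Hypothetical toy data: two tight, well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)]
grouping_a = [0, 0, 0, 0, 1, 1, 1, 1]  # matches the visible structure
grouping_b = [0, 1, 0, 1, 0, 1, 0, 1]  # mixes the two groups

print(mean_silhouette(points, grouping_a))
print(mean_silhouette(points, grouping_b))
```

Two groupings of the same data set, for example one from a mean shift based method and one from k-means, could be compared by their average silhouette values, with a higher value indicating the more compact and better separated grouping.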

Comparative Assessment of Some Selected Methods of Determining the Number of Clusters in a Data Set

100%

Korzeniewski J., University of Łódź D. o. S. M.

Acta Universitatis Lodziensis. Folia Oeconomica

2007

vol. 206

This paper is an attempt to compare the performance of an algorithm for determining the number of clusters in a data set, proposed by the author, with other methods of determining the number of clusters. The new algorithm is based on comparing pseudo cumulative distribution functions of a certain random variable. For a fixed window size, we draw K different points and, for every point, find the corresponding limiting point in the mean shift procedure. Then we check whether the distance (e.g. Euclidean) between every pair of limiting points is greater than the window size. Analogously, we determine the pseudo cumulative distribution functions for different numbers K of clusters. Out of all the pseudo cumulative distribution functions we pick the proper one, i.e. the last one (with respect to K) that has a horizontal phase. The proposed algorithm is compared with other methods of determining the number of clusters on a number of examples of two-dimensional data sets for different clustering methods (k-means clustering and minimum distance agglomeration).

Artykuł niniejszy jest próbą oceny porównawczej algorytmu wyznaczającego ilość skupień w zbiorze danych, zaproponowanego przez autora, z innymi metodami wyznaczania ilości skupień. Algorytm autora oparty jest na porównaniu pseudodystrybuant pewnej zmiennej losowej dla różnych ilości skupień. Ta zmienna losowa jest zdefiniowana w następujący sposób. Dla ustalonego rozmiaru okna losujemy ze zbioru danych К różnych punktów i dla każdego z tych punktów znajdujemy odpowiadający mu punkt graniczny w procedurze średniego przesunięcia próby. Następnie sprawdzamy, czy odległość (np. euklidesowa) pomiędzy każdą parą punktów granicznych jest większa od rozmiaru okna. Analogicznie wyznaczamy pseudodystrybuanty dla różnych ilości К skupień. Ze wszystkich dystrybuant za prawidłowo określającą ilość skupień uznajemy tę, która odpowiada ostatniej (względem K) krzywej, posiadającej fazę poziomą. Inne metody określania liczby skupień w zbiorze danych są porównane z zaproponowanym algorytmem na przykładach kilku dwuwymiarowych zbiorów danych dla dwóch, diametralnie różnych w naturze, metod konstruowania skupień.
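The core step described in the abstract (draw K points, follow each to its limiting point under mean shift, then check whether every pair of limiting points is farther apart than the window size) can be sketched as follows. This is only an illustration under stated assumptions, not the author's implementation: the flat (uniform) window, the stopping tolerance, the toy data, and the names `mean_shift_limit`, `all_limits_separated` and `separation_rate` are hypothetical. The abstract does not say how the pseudo cumulative distribution functions are built from these checks, so the sketch only estimates, for one window size and one K, the fraction of draws that pass the check.

```python
import math
import random
from itertools import combinations


def mean_shift_limit(start, data, window, tol=1e-9, max_iter=500):
    """Move a point to the mean of the data inside its window until it stops.

    Assumes a flat window: all data points within distance `window` of the
    current position are averaged with equal weight.
    """
    point = tuple(start)
    for _ in range(max_iter):
        inside = [x for x in data if math.dist(x, point) <= window]
        if not inside:
            return point  # empty window: nothing to shift towards
        new_point = tuple(sum(c) / len(inside) for c in zip(*inside))
        if math.dist(new_point, point) < tol:
            return new_point
        point = new_point
    return point


def all_limits_separated(data, k, window, rng):
    """Draw k distinct data points and check whether every pair of their
    limiting points is farther apart than the window size."""
    starts = rng.sample(data, k)
    limits = [mean_shift_limit(s, data, window) for s in starts]
    return all(math.dist(p, q) > window for p, q in combinations(limits, 2))


def separation_rate(data, k, window, draws=200, seed=0):
    """Fraction of random draws for which all limiting points are separated."""
    rng = random.Random(seed)
    hits = sum(all_limits_separated(data, k, window, rng) for _ in range(draws))
    return hits / draws


# Hypothetical toy data: two tight groups in the plane.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)]

for k in (2, 3):
    print(k, separation_rate(data, k, window=1.0))
```

With two groups in the data, three drawn points must include two from the same group, so their limiting points coincide and the check fails for every draw. Repeating the estimate over a range of window sizes gives one curve per K, which is the kind of quantity the abstract compares across different values of K.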
