Results found: 2

Search results

Search in the keywords: computer algorithm

Comparative Assessment of Some Selected Methods of Determining the Number of Clusters in a Data Set

100%

Korzeniewski J., University of Łódź D. o. S. M.

Acta Universitatis Lodziensis. Folia Oeconomica

2007

vol. 206

This paper compares the performance of an algorithm proposed by the author for determining the number of clusters in a data set with that of other methods for the same task. The new algorithm is based on comparing the pseudo cumulative distribution functions of a certain random variable for different numbers of clusters. For a fixed window size, we draw K different points from the data set and, for every point, find the corresponding limiting point of the mean shift procedure. We then check whether the distance (e.g. Euclidean) between every pair of limiting points is greater than the window size. The pseudo cumulative distribution functions for different numbers K of clusters are determined analogously. Out of all of them, we pick as the proper one the last one (with respect to K) that has a horizontal phase. The other methods of determining the number of clusters are compared with the proposed algorithm on several examples of two-dimensional data sets for different clustering methods (k-means clustering and minimum distance agglomeration).

This article is an attempt at a comparative assessment of an algorithm, proposed by the author, for determining the number of clusters in a data set against other methods of determining the number of clusters. The author's algorithm is based on comparing the pseudo cumulative distribution functions of a certain random variable for different numbers of clusters. This random variable is defined as follows. For a fixed window size we draw K different points from the data set and, for each of them, find the corresponding limiting point in the mean shift procedure. We then check whether the distance (e.g. Euclidean) between every pair of limiting points is greater than the window size. The pseudo cumulative distribution functions for different numbers K of clusters are determined analogously. Of all the distribution functions, the one taken to indicate the number of clusters correctly is the one corresponding to the last curve (with respect to K) that has a horizontal phase. Other methods of determining the number of clusters in a data set are compared with the proposed algorithm on examples of several two-dimensional data sets for two clustering methods that are diametrically different in nature.
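The sampling step described in the abstract (draw K points, follow each to its mean shift limiting point, check that all pairs of limits are farther apart than the window size) can be sketched roughly as below. This is only an illustration under stated assumptions, not the author's implementation: the flat (uniform) kernel, the convergence tolerance, and the function names `mean_shift_limit`, `all_limits_separated` and `separation_frequency` are hypothetical. The abstract does not spell out how the outcomes are turned into a pseudo cumulative distribution function, so the frequency over repeated draws used here is an assumption.

```python
import math
import random
from itertools import combinations


def mean_shift_limit(start, data, window, tol=1e-6, max_iter=200):
    """Follow the mean shift path from `start` to its limiting point.

    Assumption: a flat kernel, i.e. each step moves to the plain average
    of all data points within distance `window` of the current position.
    """
    x = start
    for _ in range(max_iter):
        neighbours = [p for p in data if math.dist(p, x) <= window]
        if not neighbours:  # defensive guard; not expected for starts in `data`
            return x
        new_x = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
        if math.dist(new_x, x) < tol:
            return new_x
        x = new_x
    return x


def all_limits_separated(data, k, window, rng):
    """Draw k distinct data points and test the pairwise-distance condition."""
    starts = rng.sample(data, k)
    limits = [mean_shift_limit(s, data, window) for s in starts]
    return all(math.dist(a, b) > window for a, b in combinations(limits, 2))


def separation_frequency(data, k, window, trials=200, seed=0):
    """Share of random draws in which all k limiting points are separated.

    Assumption: this frequency stands in for the quantity that the
    abstract tracks for each K; the paper may aggregate differently.
    """
    rng = random.Random(seed)
    hits = sum(all_limits_separated(data, k, window, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Two tight, well-separated groups of points in the plane.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
            (10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)]
    for k in (2, 3):
        print(k, separation_frequency(data, k, window=2.0))
```

With two well-separated groups, drawing three points always puts at least two of them in the same group, so their limiting points coincide and the condition fails for every draw; with two points it holds only when the points come from different groups.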

Analysis of Point Processes Observed with Noise with Applicational Example

75%

Korzeniewski J., University of Łódź C. o. S. M.

Acta Universitatis Lodziensis. Folia Oeconomica

2005

vol. 194

Aerial photographs of forests taken to estimate forest losses in a given area are an example of the application of point processes observed with noise. Lund and Rudemo (2000) proposed a model that may be useful for this purpose, which uses the number of “tree candidates” visible in the photograph. The parameters of the conditional likelihood function were estimated taking into account such kinds of noise as the disappearance of points, the displacement of points and the appearance of false points. This approach does not solve the problem of estimating the actual number of trees. This article proposes a new algorithm that directly estimates the actual number of real trees. The only necessary assumption is that the forest density is constant over the given forest area. The results obtained with the new algorithm can be regarded as interesting.

Aerial photographs of forests, taken with the aim of estimating the actual number of trees in a given area, are an example of the application of point processes observed with noise. Lund and Rudemo (2000) proposed a model useful in this context, based on the number of “tree candidates” visible in the photograph. The parameters of the conditional likelihood function were estimated taking into account such kinds of noise as thinning of points, displacement of points and the appearance of extra ghost points. That approach does not solve the problem of estimating the actual number of trees. In this paper, a new algorithm that directly estimates the number of actual trees is proposed. The only assumption on which the new algorithm depends is the natural assumption that forest density is locally constant. The results achieved with the new algorithm may be assessed as interesting.
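The three kinds of noise named in the abstract (thinning, displacement and ghost points) can be illustrated with a small simulation that turns a set of true tree positions into a set of observed “tree candidates”. This is a hedged sketch only: it shows the noise mechanisms, not the estimation algorithm proposed in the paper, and the independent thinning probability, the Gaussian displacement, the uniformly scattered Poisson-distributed ghost points and the names `poisson` and `observe_with_noise` are all assumptions made for illustration.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson-distributed count (Knuth's method, fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def observe_with_noise(points, keep_prob, sigma, ghost_rate, width, height, rng):
    """Return the noisy pattern seen on a `width` x `height` photograph.

    Assumptions (not taken from the source): each true point survives
    independently with probability `keep_prob`, survivors are shifted by
    Gaussian noise with standard deviation `sigma`, and ghost points are
    uniform on the photograph with `ghost_rate` expected per unit area.
    """
    observed = []
    for x, y in points:
        if rng.random() < keep_prob:                # thinning
            observed.append((x + rng.gauss(0, sigma),
                             y + rng.gauss(0, sigma)))  # displacement
    n_ghosts = poisson(ghost_rate * width * height, rng)
    observed.extend((rng.uniform(0, width), rng.uniform(0, height))
                    for _ in range(n_ghosts))       # ghost points
    return observed


if __name__ == "__main__":
    rng = random.Random(1)
    trees = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(30)]
    candidates = observe_with_noise(trees, keep_prob=0.8, sigma=0.2,
                                    ghost_rate=0.05, width=10, height=10, rng=rng)
    print(len(trees), "true trees,", len(candidates), "tree candidates")
```

The count of candidates generally differs from the count of true trees, which is why the number of candidates alone does not settle the number of actual trees.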
