2014 | 3(37) | 43-50
Article title

Numerical Data Clustering Algorithms in Mining Real Estate Listings

Title variants
Algorytmy klasteryzacji danych numerycznych w eksploracji serwisów ogłoszeń nieruchomości
In this paper, we propose a method for mining real-estate listings using clustering algorithms intended for numerical data. The approach is based on information systems over ontological graphs, which were proposed to handle data in the form of concepts linked by different semantic relations. Special attention is paid to the preprocessing steps that transform textual advertisements into information systems defined over ontological graphs, and to the encoding of attribute values for clustering algorithms.
W artykule zaproponowano metodę eksploracji serwisów ogłoszeń nieruchomości przy użyciu algorytmów klasteryzacji przeznaczonych dla danych numerycznych. Przedstawione podejście bazuje na systemach informacyjnych nad grafami ontologicznymi. Systemy informacyjne tego typu zaproponowane zostały w celu poradzenia sobie z danymi w postaci pojęć powiązanych ze sobą za pomocą różnych relacji semantycznych. Specjalna uwaga została zwrócona na etap wstępnego przetwarzania danych z ogłoszeń w postaci tekstowej do postaci systemów informacyjnych zdefiniowanych nad grafami ontologicznymi jak również na kodowanie wartości atrybutów dla algorytmów klasteryzacji.
  • University of Management and Administration in Zamość, Poland
  • University of Information Technology and Management in Rzeszów, Poland
  • University of Management and Administration in Zamość, Poland