2016 | 5 | 1 | 24-35
Article title

EFFECTIVE MULTI-LABEL CLASSIFICATION METHOD WITH APPLICATIONS TO TEXT DOCUMENT CATEGORIZATION

Languages of publication
EN
Abstracts
EN
The increasing number of online document repositories has led to a growing demand for automatic categorization algorithms. In many cases, however, a text should be assigned to more than one class. This paper considers a new multi-label classification algorithm for short documents. The presented Labels Chain (LC) algorithm is a problem transformation method based on relationships between labels: it consecutively uses the resulting labels as new attributes in the subsequent classification steps. The method is validated by experiments on several real text datasets of restaurant reviews with different numbers of instances, using the kNN, Naive Bayes, SVM and C4.5 classifiers. The results show that the LC method performs well compared to problem transformation methods such as Binary Relevance and Label Powerset.
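The chaining idea described in the abstract, where each predicted label becomes an extra attribute for the next classifier, can be illustrated with a minimal sketch. This is not the authors' LC implementation: the label order, the 1-nearest-neighbour base classifier, the function names and the toy word-count data are all assumptions chosen only to keep the example self-contained.

```python
from typing import List, Sequence


def nearest_neighbor_predict(train_x: Sequence[Sequence[float]],
                             train_y: Sequence[int],
                             x: Sequence[float]) -> int:
    """1-NN: return the label of the closest training row (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def chain_fit_predict(train_x: Sequence[Sequence[float]],
                      train_labels: Sequence[Sequence[int]],
                      test_x: Sequence[Sequence[float]]) -> List[List[int]]:
    """Predict labels one at a time, feeding each label back in as a new attribute.

    Assumption: labels are processed in their given column order.
    """
    n_labels = len(train_labels[0])
    aug_train = [list(row) for row in train_x]  # features + labels seen so far
    aug_test = [list(row) for row in test_x]    # features + predictions so far
    predictions: List[List[int]] = [[] for _ in test_x]

    for j in range(n_labels):
        y_j = [labels[j] for labels in train_labels]
        preds_j = [nearest_neighbor_predict(aug_train, y_j, x) for x in aug_test]
        # Training rows are extended with the true label j ...
        for row, labels in zip(aug_train, train_labels):
            row.append(labels[j])
        # ... while test rows are extended with the predicted label j.
        for row, pred, out in zip(aug_test, preds_j, predictions):
            row.append(pred)
            out.append(pred)
    return predictions


# Hypothetical toy data: counts of [food words, service words] per review,
# with two labels per review: [about_food, about_service].
train_x = [[3, 0], [0, 3], [2, 2], [0, 0]]
train_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
test_x = [[3, 1], [1, 3]]

predicted = chain_fit_predict(train_x, train_labels, test_x)
print(predicted)
```

By contrast, Binary Relevance would train each label's classifier on the original attributes only, so no information would flow from one label to the next.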
Year
2016
Volume
5
Issue
1
Pages
24-35
Dates
published
2016
Contributors
author
  • Institute of Information Technology, Lodz University of Technology
  • Institute of Information Technology, Lodz University of Technology
References
  • Glinka K., Zakrzewska D. (2015) Effective Multi-label Classification Method for Multi-dimensional Datasets, Proceedings of the 11th International Conference FQAS 2015, Cracow, Poland, 127–138.
  • Schapire R.E., Singer Y. (2000) BoosTexter: A boosting-based system for text categorization, Machine Learning 39(2/3), 135–168.
  • Li T., Ogihara M. (2004) Content-based music similarity search and emotion detection, Proceedings of IEEE International Conference on Acoustic, Speech and Signal Processing (volume 5), Canada, 705–708.
  • Tsoumakas G., Katakis I., Vlahavas I. (2010) Mining Multi-label Data, Maimon O., Rokach L. [ed.]: Data Mining and Knowledge Discovery Handbook, Springer US, Boston, MA, 667–685.
  • Madjarov G., Kocev D., Gjorgjevikj D., Džeroski S. (2012) An extensive experimental comparison of methods for multi-label learning, Pattern Recognition 45(9), 3084–3104.
  • Sajnani H., Javanmardi S., McDonald D.W., Lopes C.V. (2011) Multi-label classification of short text: A study on wikipedia barnstars, Analyzing Microtext: Papers from the 2011 AAAI Workshop.
  • Boutell M.R., Luo J., Shen X., Brown C.M. (2004) Learning multi-label scene classification, Pattern Recognition 37(9), 1757–1771.
  • Esuli A., Fagni T., Sebastiani F. (2008) Boosting multi-label hierarchical text categorization, Information Retrieval 11(4), 287–313.
  • Comité F.D., Gilleron R., Tommasi M. (2003) Learning multi-label alternating decision tree from text and data, Lecture Notes in Computer Science, vol. 2734, Springer, Heidelberg, 35–49.
  • Lee S.-J., Jiang J.-Y. (2014) Multilabel text categorization based on fuzzy relevance clustering, IEEE Transactions on Fuzzy Systems 22(6), 1457–1471.
  • Read J., Pfahringer B., Holmes G., Frank E. (2009) Classifier Chains for Multi-label Classification, Buntine W., Grobelnik M., Mladenić D., Shawe-Taylor J. [ed.]: Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, vol. 5782, Springer, Heidelberg, 254–269.
  • Kajdanowicz T., Kazienko P. (2012) Multi-label classification using error correcting output codes, Applied Mathematics and Computer Science 22(4), 829–840.
  • http://www.yelp.com/
  • http://www.ics.uci.edu/~vpsaini/
  • Koehn P. (2010) Statistical Machine Translation, Cambridge University Press, UK.
  • Witten I.H., Frank E., Hall M.A. (2011) Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, USA.
  • http://www.cs.waikato.ac.nz/ml/weka/index.html
Identifiers
ISSN
2084-5537
YADDA identifier
bwmeta1.element.desklight-82863176-73b7-4917-ad40-bc6afe66ee7f