2014 | 3 | 4 | 229-238

Article title

Automatic Indexer for Polish Agricultural Texts

Languages of publication

EN

Abstracts

EN
Today, the majority of resources are available in digital form, and acquiring information requires searching through collections of documents. This paper describes text indexing, which can improve searching. Next, an indexing tool, the Agrotagger, which is useful for documents in the field of agriculture, is presented. Two available versions of the Agrotagger are tested and discussed. The Agrotagger is useful only for the English language, despite the fact that it uses the multilingual thesaurus Agrovoc. Because the Agrotagger is not useful for texts in Polish, it is important to create a similar tool appropriate for the Polish language. The problems that extensive inflection in languages such as Polish causes in the process of indexing are discussed. The final part of the paper presents the design and implementation of a system based on the Polish language dictionary and the Agrovoc. Additionally, some tests of the implemented system are discussed.

Year

2014

Volume

3

Issue

4

Pages

229-238

Dates

published
2014

Contributors

  • Department of Informatics, Warsaw University of Life Sciences
  • Department of Informatics, Warsaw University of Life Sciences

References

  • AgroTagger. http://aims.fao.org/agrotagger (access 19.11.2014).
  • AGROVOC, http://aims.fao.org/standards/agrovoc/about/ (access 19.11.2014).
  • Dolamic, L., Savoy, J. (2008) Stemming Approaches for East European Languages. Advances in Multilingual and Multimodal Information Retrieval, Vol. 5152, 37-44.
  • Gupta S., Manning C.D. (2011) Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers, In Proceedings of the International Joint Conference on Natural Language Processing, http://nlp.stanford.edu/pubs/gupta-manning-ijcnlp11.pdf (access 19.11.2014).
  • Jurafsky, D., Martin J. H. (2009) Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. 2nd ed. Prentice-Hall.
  • Karwowski W. (2010) Ontologies and Agricultural Information Management Standards. Information Systems in Management VI, ed. P. Jałowiecki & A. Orłowski, WULS Press, Warszawa 2010.
  • Lovins, J. (1968) Development of a Stemming Algorithm. Mechanical Translation and Computational Linguistics 11 (1-2), 11-31.
  • Manning C.D., (2011) Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics? Computational Linguistics and Intelligent Text Processing, 12th International Conference, Proceedings, Part I. Springer LNCS vol. 6608, 171-189.
  • Manning C.D., Raghavan P., Schuetze H. (2008) Introduction to Information Retrieval, Cambridge University Press.
  • Paice C., Husk G. (1990) Another Stemmer, ACM SIGIR Forum 24(3), 56-61.
  • Porter, M. (1980) An algorithm for suffix stripping. Program 14(3), 130-137.
  • Wrzeciono P., Karwowski W. (2013) Automatic Indexing and Creating Semantic Networks for Agricultural Science Papers in the Polish Language, Computer Software and Applications Conference Workshops (COMPSACW), 2013 IEEE 37th Annual, Kyoto.

Identifiers

ISSN
2084-5537

YADDA identifier

bwmeta1.element.desklight-ae63f0e5-2a27-4a01-9d1a-9eaf8fd3505a