
Lemmatization and morphological tagging are indispensable steps in Slovak corpus linguistics. In this article, we evaluate two state-of-the-art Slovak lemmatizers and morphosyntactic description (MSD) taggers, one based on MorphoDiTa and the other on spaCy. We measured accuracy on the test subset of a manually lemmatized and MSD-annotated corpus and found that the combination of lemma and tag achieved 93.5% accuracy with MorphoDiTa and 95.6% accuracy with spaCy. Most of the errors occurred in disambiguating MSD tags for homonymous uninflected parts of speech, such as particles, conjunctions, and adverbs, and in disambiguating singular masculine inanimate nominative and accusative forms. In these cases, spaCy shows a noticeable improvement over MorphoDiTa, likely because it makes better use of the context of the words.
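As an illustration of the combined lemma-and-tag accuracy mentioned in the abstract, here is a minimal Python sketch. It assumes a token counts as correct only when both its lemma and its MSD tag match the gold annotation. The abstract does not spell out the scoring procedure, so this is one plausible reading rather than the authors' evaluation code. The function name `accuracies`, the toy tokens, and the tag strings are hypothetical placeholders, not values from the actual Slovak tagset or corpus.

```python
from typing import Dict, Sequence, Tuple

# One annotated token: (lemma, msd_tag). The tag strings used below are
# made-up placeholders, not real Slovak MSD tags.
Token = Tuple[str, str]


def accuracies(gold: Sequence[Token], pred: Sequence[Token]) -> Dict[str, float]:
    """Return token-level lemma, tag, and joint (lemma + tag) accuracy."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must be aligned")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")

    n = len(gold)
    lemma_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    tag_ok = sum(g[1] == p[1] for g, p in zip(gold, pred))
    # Joint accuracy: lemma AND tag must both match for the same token.
    joint_ok = sum(g == p for g, p in zip(gold, pred))

    return {"lemma": lemma_ok / n, "tag": tag_ok / n, "joint": joint_ok / n}


# Hypothetical toy data: four aligned tokens.
gold = [("byť", "VERB"), ("dom", "NOUN-NOM"), ("a", "CONJ"), ("tam", "ADV")]
pred = [("byť", "VERB"), ("dom", "NOUN-ACC"), ("a", "PART"), ("tamto", "ADV")]

result = accuracies(gold, pred)
print(result)
```

On this toy input, three of four lemmas and two of four tags match, but only one token has both correct. This shows why the joint figure can be lower than either individual score.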