Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou)|í]cí z hlediska morfologického značkování)

Osolsobě, Klára

Article details

Journal

Časopis pro moderní filologii (Journal for Modern Philology)

2015 | 97 | 2 | 136-145

Article title

Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou)|í]cí z hlediska morfologického značkování)

Authors

Osolsobě Klára

Title variants

EN

CORPORA AS DATA SOURCES FOR THE UP-GRADING OF MORPHOLOGICAL TAGGING

Languages of publication

CS

Abstracts

EN

Adjectives ending in -oucí/-ící are regularly derived from verbs and are therefore not usually listed in Czech monolingual dictionaries. In automatic morphological analysis of Czech (at the level of the dictionary), they should be generated from verbal roots and tagged as verbal adjectives (POS tag: AG.*). Data from Czech corpora reveal a) inconsistencies in tagging and b) gaps in the dictionary. The main cause of both shortcomings is the existence of variants among the verbal forms from which the verbal adjectives are potentially derived. Text corpora are consequently a significant source of knowledge about the formation and use of adjectives in -oucí/-ící, relevant to both a) automatic morphological analysis of Czech and b) the theoretical description of Czech grammar (derivational morphology). Our goal is to present a corpus-based study of the Czech gerund, i.e. verbal adjectives in -oucí/-ící. The link between the inflectional and the word-formation variants is demonstrated using material from the SYN corpus (2.6 billion tokens of written Czech) and the large web corpus czTenTen12 (5.2 billion tokens of cleaned and deduplicated Czech text from the Internet).
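The kind of tagging inconsistency described in the abstract can be illustrated with a minimal sketch. It assumes that corpus tokens are available as (word form, tag) pairs and that verbal adjectives carry a tag matching AG.*, as stated above. The function name `find_inconsistent`, the shortened tags, and the sample tokens are hypothetical and are not taken from SYN or czTenTen12.

```python
import re

# Word forms ending in -oucí or -ící (base form only; other inflected
# endings are deliberately ignored in this sketch).
ADJ_FORM = re.compile(r"(?:ou|í)cí$")

# Verbal-adjective tags as given in the abstract: "AG" followed by anything.
VERBAL_ADJ_TAG = re.compile(r"AG.*")


def find_inconsistent(tokens):
    """Return (form, tag) pairs that end in -oucí/-ící but lack an AG.* tag."""
    return [
        (form, tag)
        for form, tag in tokens
        if ADJ_FORM.search(form.lower()) and not VERBAL_ADJ_TAG.fullmatch(tag)
    ]


# Hypothetical sample with simplified, illustrative tags.
sample = [
    ("nesoucí", "AGFS1"),
    ("píšící", "AAFS1"),
    ("vedoucí", "NNMS1"),
    ("dům", "NNIS1"),
]

print(find_inconsistent(sample))
```

The pairs returned are only candidates for manual review: a form such as "vedoucí" may legitimately be tagged as a noun, so a mismatch with AG.* is not by itself a tagging error.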

Keywords

CS

verbální adjektivum; morfologické značkování; automatická morfologická analýza; varianta; slovotvorba

EN

gerund/deverbal adjective; POS tagging; automatic morphological analysis; variant; derivational morphology

Year

2015

Volume

97

Issue

2

Pages

136-145

Contributors

author

Osolsobě Klára

osolsobe@phil.muni.cz

Ústav českého jazyka, FF MU | Arna Nováka 1, 602 00 Brno

Identifiers

YADDA identifier

bwmeta1.element.desklight-a9c9dd4c-75cd-446d-a703-07b5f9a04901
