Results found: 2

Search results

Search: in the keywords: tokenizace (tokenization)

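Homonymie mezi apelativy a proprii jako problém automatické morfologické analýzy češtiny (Homonymy between appellatives and proper names as a problem of automatic morphological analysis of Czech)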

Osolsobě K., Žižková H. Acta onomastica, 2020, vol. 61, issue 1, pp. 161-174

This paper provides a corpus-based analysis of one type of Czech proper noun (the Zubří type). We argue that adequate annotation (lemmatisation and morphological tagging) of proper nouns of the Zubří type depends on three conditions: 1) the coverage of the automatic analyser's dictionary; 2) an accurate description of the variability of inflectional forms; 3) the non-trivial disambiguation of numerous homonymous word forms. We believe that while the first two conditions can be met, adequate disambiguation goes beyond the possibilities of automatic morphological analysis.

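Změny v morfologické anotaci korpusů řady SYN: nové možnosti zkoumání české gramatiky a lexikonu (Changes in the morphological annotation of the SYN series corpora: new possibilities for investigating Czech grammar and lexicon)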

Křivan J., Šindlerová J. Slovo a slovesnost: časopis pro otázky teorie a kultury jazyka (Slovo a slovesnost: A journal for the theory of language and language cultivation), 2022, vol. 83, issue 2, pp. 122-145

This paper introduces some major conceptual enhancements to the morphological annotation of the SYN series corpora of the Czech National Corpus. Apart from minor changes in tokenization and in the positional tagset, three major conceptual changes have been applied which affect the representation of various lexical and grammatical patterns. In the paper, we present the actual impact of the changes in linguistic data and search for possibilities in three linguistic areas. First, the treatment of phonic, graphemic, and morphological variants via a two-tier lemma structure is discussed; second, a new approach to periphrastic verb forms, auxiliaries, participles and the interpretation of verbal grammatical categories through a new attribute, called verbtag, is explained; and third, a complex multi-value treatment of multiword tokens is introduced.
