Results found: 3

Search results

Search:
in the keywords: morphological tagging

Korpus českého jazyka 2. poloviny 19. století (A Corpus of the Czech Language of the Second Half of the 19th Century)

100%

Kučera K., Najbrtová K., Pivoňková K., Řehořková A., Stluka M.

Časopis pro moderní filologii (Journal for Modern Philology)

2019

vol. 101

issue 1

pp. 92-98

The paper describes the principles and structure of the one-million-word DIA1900 Corpus, built at the Institute of the Czech National Corpus (CNC) in Prague and focused on the language of Czech texts published between 1851 and 1900. The DIA1900, planned for publication by June 2020 and to be followed by the DIA1850 (a corpus built on the same principles, focusing on the first half of the 19th century), maintains both the balanced representation of the three major text types (belles lettres, journalistic texts, technical/scientific texts) and the system of morphological tagging implemented in the synchronic corpora of the CNC project, thus facilitating the diachronic comparison of two stages in the development of Czech. The paper briefly describes the structure of the morphological terminology used in the lemmatisation and tagging of the corpus, and two tools designed to help search 19th-century texts, whose orthography is inconsistent and whose language shows the phonological and morphological variation characteristic of the period: (1) a multiple select/suggest feature, which reminds the user of non-standard orthographic and phonological variants of the lemma found in the corpus before the lemma search is started, and (2) the position attribute, which informs the user of the ambiguous status of a word in the text, resulting from a misprint or misspelling, a damaged page, etc.

Hluboké učení v automatické analýze českého textu (Deep Learning in the Automatic Analysis of Czech Text)

100%

Straková J., Straka M., Hajič J., Popel M.

Slovo a slovesnost: časopis pro otázky teorie a kultury jazyka (Slovo a slovesnost: A journal for the theory of language and language cultivation)

2019

vol. 80

issue 4

pp. 306-327

Deep learning methods based on artificial neural networks have seen significant uptake in recent years and have surpassed earlier results in automatically solving tasks in many fields. Computational linguistics and its applied offshoot, natural language processing, with classic tasks such as morphological tagging, dependency parsing, named entity recognition and machine translation, are no exception. This paper provides an overview of recent advances in these tasks for the Czech language and presents completely new results in morphological tagging and named entity recognition in Czech, along with a detailed error analysis.

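Rezultativa se jmenným a složeným tvarem n-/t-ového příčestí. Výsledky z ČNK po zavedení nového morfologického značkování (Resultatives with the Short and the Full Form of the n-/t-Participle. Results from the CNC after the Introduction of the New Morphological Tagging)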

100%

Giger M.

Naše řeč (Our Speech)

2022

vol. 105

issue 2

pp. 78-87

Until recently, the full form of the n-/t-participle was tagged in the Czech National Corpus as a common adjective. A special tag was introduced only with the new corpus SYN2020. This allows research into the role of both the short and the full form of the n-/t-participle in resultatives in written Standard Czech texts. The results show that in most contexts the full form of the participle has a significantly higher frequency than the short form. The only exceptions are subject and object resultatives without a subject (Je zataženo / Je otevřeno) and possessive resultatives without an object (Mají zavřeno), both with the participle in the neuter singular form. In these cases the full forms seldom occur in actual written Czech texts. The use of the new tag in corpora other than SYN2020 will allow for better research into full forms of the n-/t-participle in Czech, not only in resultative constructions.
