2005 | 40 | 483-507

Article title

Corpora of Slavic languages

Languages of publication

PL

Abstracts

EN
The aim of this paper is to present corpora of Slavic languages. For almost every Slavic language a corpus has either been compiled or will be finished very soon, and some languages can be studied with the help of several corpora. To the authors' knowledge, the exceptions are Belarusian, Kashubian (if we agree that it is a language rather than a dialect) and Macedonian. The corpora are mostly accessible via the Internet and meet the standards set by the British National Corpus: they range in size from 30 to 100 million running words, are balanced, and are morphosyntactically annotated. Interestingly, there is no interdependence between the standing of a language and the quality of its corpus. Countries with relatively small populations (e.g. Slovenia) can afford large and sophisticated corpora, whereas, although several corpora of Russian exist, none of them meets the standards required today.

Year

2005

Volume

40

Pages

483-507

Document type

ARTICLE

Contributors

author
author
  • B. Chachulska, Instytut Języka Polskiego PAN, al. Mickiewicza 31, 31-120 Kraków, Poland

Identifiers

CEJSH db identifier
06PLAAAA01423156

YADDA identifier

bwmeta1.element.2fe4dc60-4c60-3bbe-a2b4-404968ae75ab