Results found: 4

Search results

Search:
in the keywords:  reprezentativnost
1
Nová koncepce synchronních korpusů psané češtiny (A new concept of synchronic corpora of written Czech)
EN
The paper describes SYN2015, the most recent 100-million-word corpus of contemporary written Czech. General notions of corpus representativeness and balance are discussed in this context, with a focus on the new approach to representativeness adopted for SYN2015. Unlike the previous synchronic corpora SYN2000, SYN2005 and SYN2010, which were balanced according to text reception (based on sociological surveys), the composition of SYN2015 is based on the “texts-as-products” principle, with arbitrary proportions of the individual categories within a revised text classification scheme. The paper argues in favour of this solution by highlighting three major advantages: (1) this type of composition can be kept constant in the future, ensuring corpus comparability, whereas reception changes constantly; (2) it emphasises the diverse composition of the corpus as a language sample; (3) SYN2015 serves not only as a representative sample, but also as a large pool of texts from which different subsets (subcorpora) can be drawn according to various linguist-specified criteria.
2

EN
Recently, more attention has been paid to the issues of corpus design and representativeness. These issues are especially important for general-purpose language corpora such as the spoken corpora developed within the framework of the Czech National Corpus. This text is a response to Jan Chromý’s paper “Comparison of spoken corpora from a sociolinguistic perspective” (Slovo a slovesnost 78, 2017: 145-158), in which the author compares the general-purpose spoken corpus ORAL2013 with his own dataset collected for the SAUP project. We argue that some of his claims are not justified by the findings presented in the paper and that his understanding of the concept of representativeness is rather misleading. Therefore, we aim to clarify some fundamental design decisions adopted for the compilation of ORAL2013 by responding to the specific objections raised by Chromý. We also point out some methodological and reasoning inconsistencies in his paper.
3
Překladová čeština v korpusech (Translated Czech in corpora):
EN
It is a well-established fact that corpus design, including its representativeness, has a major influence on any corpus research. The discussions usually cover the selection of text types or genres, or the choice of specific texts, while the issue of translations often remains unnoticed. The objective of this paper is to summarize approaches to translated texts in international corpus linguistics, to introduce the CNC corpus design regarding translations, and to present some examples which demonstrate that texts translated into Czech may differ from original Czech texts and thus affect the research results.
4
Korpus a reprezentativnost (Corpus and representativeness):
EN
This paper discusses the concept of representativeness in corpus linguistics. Representativeness is a concept used in empirical, quantitative science; it characterizes the relationship between a sample and its population. It is argued that the population for the standard, supposedly “representative” corpora of a whole language cannot be defined. The population can be reliably defined only for specialized corpora (e.g. corpora of newspaper texts), hence only this type of corpus can be truly statistically representative. The paper also discusses the idea that representativeness could be considered from the perspective of particular linguistic items rather than from the perspective of the whole language. The same corpus may be representative of the use of one item and, at the same time, not representative of the use of another.