To support the development of balanced corpora, the notion of the 'expectations' of future potential corpus users has been introduced (Kralik, 2001). Based on several statistical studies of such expectations, the textual structure of SYN2000, the synchronic part of the Czech National Corpus (CNC), has been proposed and realized. The present article discusses two new studies of expectations (Akter 2001 and CJ 2001) and suggests important implications for future work on the CNC. Table 1 and Table 2 reveal the stability of expectations in the categories of fiction (krásná literatura) and newspapers and magazines (noviny + časopisy). Although the daily contact between respondents and administrative texts is stable (see Table 3), the distribution of these texts is closely bound to other non-fiction topics, which is why no special attention to administrative texts is proposed. The expectations concerning newspapers and magazines are stable (Table 5), yet they changed radically between 1996 and 2001 (first and last searches, Table 6). Within the same period, an obvious rise in interest in fiction was noted (Table 6). This can be attributed to natural societal development. Thus, a strong reduction in the share of newspaper texts and a strong increase in the share of fiction texts are proposed (Table 7 + Table 8).
The joint project of the Hungarian Academy of Sciences in Budapest and the Czech Academy of Sciences in Prague, 'Computational Lexicology and Dialogue Research', has not only inspired specific approaches to new linguistic research but has also directed attention toward the history of Hungarian and Czech linguistic description. Some previously hidden parallels in lexicography, grammar and corpus projects have been discovered and discussed. In this paper, an overview of the main similarities in the phases of cultivation of these two languages reveals, among other things, the important unifying role of the European style of education and scholarly work. In addition, a brief historical outline shows Czech and Hungarian as subjects of linguistic research that occupied similar positions and solved their specific problems along parallel historical lines. This information makes it possible to place new projects in corpus linguistics in a broader historical context.