It is only in the new era of large electronic corpora that some low-frequency grammatical structures can be tested for their communicative as well as their systemic status. This also applies to the complex predicate of the type 'mit (to have) + abstract noun' (e.g. 'mit zkusenost', to have experience) in the sense of 'byt zkuseny' (to be experienced) in contemporary Czech. The expression 'v sobe' (in oneself), if attached to the predicative syntagma of the type 'mit (to have) + abstract noun', can make this complex predicate acceptable and grammatical.
This article investigates the statistical function mutual information (MI-score) as applied to various types of lexical combinations, such as multi-word proper names, multi-word terms, idioms, and systemic and textual collocations. As a basis for comparison, combinations of the following parts of speech were considered: group A - noun+noun, group B - adjective+adjective, group C - adverb+adverb, group D - adjective+adverb, group E - adjective+adverb, group F - preposition+noun and group G - noun+preposition. MI-scores with values of 15-23 highlight different types of lexical collocations: proper names in group A, terms in group B, idioms in group C, terms in group D and terms in group E. MI-scores with values of 7.0-7.1 relate mostly to systemic or textual collocations. Prepositional collocations have MI-scores in the range 17.5-7.0. The most typical collocations are systemic-textual collocations for group F and multi-word proper names for group G.
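The abstract does not state which formula was used. As a minimal sketch, assuming the common corpus-linguistic definition of the MI-score (the base-2 logarithm of observed over expected co-occurrence frequency), the measure could be computed as follows. The function name `mi_score` and all counts are hypothetical and serve only as illustration.

```python
import math


def mi_score(pair_freq, freq_x, freq_y, corpus_size):
    """Return log2 of observed vs. expected co-occurrence frequency.

    Assumed formula (not confirmed by the abstract):
        MI = log2( f(x, y) * N / (f(x) * f(y)) )
    where f(x, y) is the frequency of the word pair, f(x) and f(y) are
    the frequencies of the individual words, and N is the corpus size.
    """
    if min(pair_freq, freq_x, freq_y, corpus_size) <= 0:
        raise ValueError("all frequencies and the corpus size must be positive")
    return math.log2(pair_freq * corpus_size / (freq_x * freq_y))


# Hypothetical counts: two rare words that almost always occur together
# (as in a multi-word proper name) receive a high score.
rare_pair = mi_score(pair_freq=50, freq_x=60, freq_y=55, corpus_size=100_000_000)

# Hypothetical counts: two frequent words that co-occur only somewhat
# more often than chance receive a much lower score.
common_pair = mi_score(pair_freq=2_000, freq_x=150_000, freq_y=120_000,
                       corpus_size=100_000_000)

print(f"rare pair:   {rare_pair:.2f}")
print(f"common pair: {common_pair:.2f}")
```

Under this formulation, low-frequency words that occur almost exclusively together score highest, which is consistent with proper names and terms appearing at the top of the reported range.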
This article discusses some aspects of searching for grammatical information in corpora. It argues that any search procedure must consist of at least three principally different steps. First, a hypothesis regarding some grammatical property of the language system must be formulated in terms of an available 'tagging' menu. Second, general instructions concerning the sample size, relevant context size, etc. must be stated, and only then can the third step, i.e. the proper search and interpretation of the attested data, be taken. Examples from the Czech National Corpus are offered to show that the boundary between grammaticality and non-grammaticality of a phenomenon or category is represented by a probability scale with more than just two opposing values and that the corpus may serve as an important tool for locating the most probable (favorite) point on the scale. The issue of zero or non-zero occurrence of a phenomenon is discussed in greater detail. It is argued that if no example of a phenomenon is attested in the corpus, it does not necessarily follow that the corpus is too small and that it is necessary or significant to intervene in favor of a larger one.
The paper presents the annotation of a Slovene language corpus at the semantic level. Manual annotation was performed in two cycles with an automatically generated semantic lexicon according to the wordnet model. The analysis of the results shows that nearly all polysemous words in the corpus can be assigned a sense from our wordnet but also that the task was quite challenging; in many cases, wordnet sense distinctions are too fine-grained even for human annotators to distinguish between them. This is why annotation with more coarse-grained senses could prove to be more successful.
This paper describes how the Czech National Corpus can be used in researching what is called the 'loss of biaspectuality' process. We have tried to establish that the process manifests itself as a gradual year-by-year decrease in frequency of foreign-origin biaspectual verbs in (sub)corpora with a balanced distribution of texts. We deem this decrease to be inversely proportional to the increase in frequency of newly formed perfective correlates. Though these hypotheses have not been unambiguously verified, we have nonetheless acquired a great quantity of valuable data. We have identified four general types of verbal correlation in our database: (1) the 'akceptovat' type stands for biaspectual verbs with no identified perfective correlates in the Czech National Corpus; (2) the 'ilustrovat' type includes biaspectual verbs with infrequent perfective correlates; (3) the 'dokumentovat' type represents biaspectual verbs with moderately frequent perfective correlates (both components of the correlation can ordinarily express perfective meaning); and finally, (4) the 'likvidovat' type stands for originally biaspectual verbs with very frequent perfective correlates with a tendency to become the exclusive vehicle of the perfective meaning.
This paper has a double aim: i) to empirically establish the functions naturally expressed by any and those of one of its Spanish counterparts, specifically cualquier(a), so as to identify possible cross-linguistic transfer; ii) to illustrate a high-performing methodological procedure. A set of tools, among them an ad hoc tertium comparationis consisting of a set of cross-linguistic labels, a parallel corpus (P-ACTRES) and a reference corpus (CREA), is used to explore: i) the uses of any in context and the resulting translation environments served by cualquier(a); ii) the degree of matching between translated cualquier(a) and its non-translated usage in standard European Spanish. The corpus-based procedure basically follows Krzeszowski's contrastive model (1990) with the addition of a 'target language fit' stage (Chesterman, 2004). The analysis shows different behaviour in translated and non-translated Spanish: cualquier(a) is underused as a translation option for 'existential' any and acquires a new function, 'negative', which is not a possibility in non-translated language for the same contexts. The analysis also corroborates the usefulness and the replicability of the methodological procedure.
According to current research, the fixedness of phrasemes is understood as relative. The study investigates the processing of creative idiomatic variability in Slovak and Russian. Using the example of a single Slovak phraseme, urobiť/spraviť capa záhradníkom, in comparison with its Russian equivalent пустить козла в огород, the authors identify various types of structural-semantic modifications that occur in media text.
(Polish title: Polsko-litewskie badania konfrontatywne Zespolu Semantyki Instytutu Slawistyki Polskiej Akademii Nauk). The dissertation discusses the research area of the Lithuanian-Polish theoretical contrastive studies conducted by the Semantics Department of ISS PAS, including: [1] Lithuanian-Polish theoretical contrastive studies utilizing the interlanguage (Lithuanian-Polish contrastive grammar), [2] Terminological dictionary (definitions of the particular sections of the interlanguage used in the linguistic contrast, along with suggested terminology for Slavic languages and Lithuanian), [3] Corpus linguistics and on-line dictionaries (experimental trilingual Bulgarian-Polish-Lithuanian parallel and comparable corpora, experimental corpus of the Lithuanian local dialect of Punsk and experimental electronic Polish-Lithuanian dictionary), [4] Research on the Lithuanian local dialect of Punsk in Poland (grammatical description of the local dialect on the basis of the experimental dialectal corpus in contrast with Polish and Lithuanian).