Results found: 99

Search results

Search:
in the keywords:  corpus
EN
SYNAMET - A Microcorpus of Synesthetic Metaphors. Preliminary Premises of the Description of Metaphor in Discourse
This article describes the preliminary premises of metaphor annotation in SYNAMET, a microcorpus of synesthetic metaphors currently under development. The analysis is based on the Context-Limited Simulation Theory (CLST) put forward by D. Ritchie. According to this theory, a metaphor's vehicle may activate various types of associations between words: semantic relations, perceptual sensations, or emotional simulations. The range of potential associations evoked by the vehicle is limited by the topic, i.e. the lexical context in which the metaphor appears. The relations between the vehicle and the topic may be presented in the form of a semantic frame. To reconstruct the frames within the project, linguistic works devoted to sensory perception (vision, hearing, smell and taste) will be used. The corpus annotation will consist of the following stages: 1) metaphor identification, 2) indication of the metaphor cluster (CM), a phrase or passage of text centered around one referent, 3) isolation of the metaphorical units (MU), word forms or phrases combining lexemes that primarily belong to different perceptual frames. The outcome of the MU analysis will include: a general metaphorical scheme of the MU, the lexical items activating the frame of the MU (together with their grammatical description), a detailed metaphor scheme of the MU, and the semantic and grammatical categorization of the MU.
XX
This paper conducts a corpus-based study of the occurrence/non-occurrence, structural pattern, and forms of the premodifier in the Nigerian English noun phrase, comparing the scenarios that emerge with those of the British and Ghanaian varieties of English. These three phenomena, which are crucial to the nature of the premodifier in new varieties of English, are investigated in relation to predictors representing syntactic function, register, post-dependent syntactic weight, and animacy, showing, among other things, the extent to which structural complexity/simplicity is present in the structure of the premodifiers studied. Corpus findings indicate that premodifiers are more likely to occur (53%) than not (47%) and that simple premodifiers (i.e. the one-word premodifier structural pattern, 79%) are significantly preferred to complex premodifiers (i.e. two-word patterns at 17% and longer patterns at 4%). Relating to form, single premodifiers are most likely to be realized as adjectives. It is also found that the alternation between simple and complex premodifiers is most strongly predicted by the syntactic functions that the NP performs, as well as the syntactic weight present in the post-dependent slot. Register, which is reputed to be a very strong indicator of structural variation (Schils and De Haan 1993; Biber et al. 2007; Schilk and Schaub 2016), is outweighed by syntactic function and post-dependent weight.
EN
Multiword units are an inseparable part of learning foreign languages. Routine formulae, collocations and idioms have been researched in many studies, and foreign language didactics focuses on optimizing the acquisition of multiword units. This article presents the international project “PhraseoLab – Learning multiword units through English”, whose purpose is to share an Open Educational Resource for learners who already have adequate English skills and can use them in learning German. An important issue in designing PhraseoLab teaching materials is the selection of multiword units. Only those routine formulae, collocations and idioms that are frequent in spoken German and essential for the learner will be considered in future PhraseoLab tasks. The second part of the article reports the results of a corpus survey that examined the frequency of about 1100 idioms in spoken language using the DGD Mannheim corpus. The investigation presented in the empirical part was carried out by the author of this article and Sulikowska about 20 years after the publication of “Phraseologisches Optimum” by Hallsteinsdóttir, Sajánková and Quasthoff. The goal was to obtain newer, up-to-date results. The article closes by highlighting the 30 idioms that are most frequent in spoken German in the DGD corpus.
EN
Phonological free variation describes the phenomenon of there being more than one pronunciation for a word without any change in meaning (e.g. because, schedule, vehicle). The term also applies to words that exhibit different stress patterns (e.g. academic, resources, comparable) with no change in meaning or grammatical category. A corpus-based analysis of free variation is a useful tool for testing the validity of surveys of speakers' pronunciation preferences for certain variants. The current paper presents the results of a corpus-based pilot study of American English, in an attempt to replicate Mompéan's 2009 study of British English.
Research in Language | 2013 | vol. 11 | issue 3 | 251-276
EN
In this article, we discuss strategies for interaction in spoken discourse, focusing on ellipsis phenomena in English. The data comes from the VOICE corpus of English as a Lingua Franca, and we analyse education data in the form of seminar and workshop discussions, working group meetings, interviews and conversations. The functions ellipsis carries in the data are Intersubjectivity, where participants develop and maintain an understanding in discourse; Continuers, which are examples of back channel support; Correction, both self- and other-initiated; Repetition; and Comments, which are similar to Continuers but do not have a back channel support function. We see that the first of these, Intersubjectivity, is by far the most popular, followed by Repetitions and Comments. These results are explained as consequences of the nature of the texts themselves, as some are discussions of presentations and so can be expected to contain many Repetitions, for example. The speech event is also an important factor, as events with asymmetrical power relations like interviews do not contain so many Continuers. Our clear conclusion is that the use of ellipsis is a strong marker of interaction in spoken discourse.  
Linguistica Pragensia | 2019 | vol. 29 | issue 1 | 100-120
EN
The central topic of the paper is interpolation (clitic-verb non-adjacency) in Classical and early Modern European Portuguese (EP). In that period, the negative marker não was the only expression eligible to break the continuity of clitic-verb sequences. The aims of the study are twofold. The first is to match previous assumptions about the syntax of this linear model against corpus data. The analysis demonstrates that interpolation was also allowed outside obligatory proclisis contexts: it is attested with the negative marker não in four structural positions where enclisis and proclisis were freely interchangeable in previous stages and where enclisis is nowadays mandatory. The second aim is to account for the overrepresentation, underpinned by corpus data, of 3rd person direct object pronouns in sequences with interpolation. Interpolation is claimed to have enabled speakers to get rid of morpho-phonological ties between the o, a, os, as series and the preceding non-verbal sound material (nasal diphthongs in não, quem, etc., coercing pronouns into taking a nasal onset: quem no, não no, etc.). As a consequence, in contemporary standard EP, clitic-specific allomorphy is earmarked for enclisis.
EN
The aim of this paper is to demonstrate and question the use of corpora, databases and dictionaries in compiling a bilingual dictionary of Spanish-Croatian proverbs, discussing on the one hand the well-known incongruities between pragmatics and lexicography (Do we actually use what is in the dictionary? In the case of variants, what is the canonical entry? Can we consider proverbs with similar meanings to be synonyms?), and on the other, considering the principles of contrastive paremiography (paremiological equivalents, types of equivalence, paremiological false friends).
EN
In addition to the resources made available by the Galician and Portuguese Philology unit of the University of Salamanca, the new corpus of Spanish-Portuguese Literature of the Fundación Biblioteca Virtual Miguel de Cervantes, led by José Miguel Martínez-Torrejón (Queens College CUNY), provides researchers and university students open access to a wide catalogue of Portuguese authors who wrote their literary works in Spanish between the sixteenth and seventeenth centuries, as well as a bibliographical database concerning a generally unknown area of peninsular literature.
EN
Based on the analysis of the diachronic data in the Czech National Corpus, the paper aims to specify some of the possible functions that the particle -ť expressed in Czech written texts in the 14th–18th centuries. The study combines a quantitative and a qualitative approach. The first part presents a quantitative analysis based on a smaller corpus that is balanced in terms of time and genre. As a result of this analysis, the functions manifested by the -ť particle in a balanced corpus are specified. In the second part of the study, the dative function of the particle -ť is analysed using a much larger, yet unbalanced corpus. The third part of the paper includes a qualitative analysis based on selected texts that verifies the existence of a dative function of the -ť particle.
EN
The main purpose of this article is to reconstruct the linguistic stereotype of the macho in Mexican Spanish. The methodology applied is a qualitative analysis of various corpora: Corpus del Proyecto para el estudio sociolingüístico del español de España y de América PRESEEA, Corpus de las Sexualidades de México, Corpus Histórico del Español en México, Corpus del Español Mexicano Contemporáneo CEMC, Corpus de Referencia del Español Actual de la Real Academia Española CREA and Corpus de Español del Siglo XXI CORPES XXI. The result is a description of how the macho is expected to look, behave, and treat women and other men. However, the results also show that the word “macho” is rarely used by Mexicans, a conclusion that requires further investigation.
EN
The article is an attempt to reconstruct the field of German-language discourse studies and to analyse them critically. Owing to very strong references, in the discourse analysis models examined in the article, to Michel Foucault’s concept, they are regarded as post-Foucault. The authors present the main threads in German-language discourse studies: (1) approaches the objective of which is to formulate a theoretical-methodological basis of a post-Foucault discourse analysis (these are primarily “discipline-specific” schools of discourse analysis: linguistic and sociological, as well as the programme of the so-called critical discourse analysis); (2) “dispositive” approaches, which constitute a novelty in the debate over the discourse category and regard the dispositive category as a possibility of finding a supradiscursive “system.” The authors also reflect on the critical remarks about the various threads in the studies, including those formulated by scholars themselves. The main conclusion from the authors’ reconstruction is that there is a tendency in German-language discourse studies to understand the category of discourse quite narrowly, with regard to specific disciplines, and thus that there is a lack of an integrated and interdisciplinary model of discourse analysis.
EN
The article raises the question of whether discourse analysis can be carried out by means of corpus-based and quantitative methods. Discourse studies use various research methods, most of which are qualitative. Corpus-based and statistics-based linguistics, on the other hand, offer many tools that can be used to study discourse, from electronic concordances to calculations that make it possible to discover similarities between texts and genres, marked lexis, key words, etc. The author briefly describes these methods in the article. She also provides basic information about the structure of a specialist corpus and the selection of a representative sample of the texts that constitute a given discourse.
EN
Based on the written part of the British Component of the International Corpus of English (ICE-GB), this paper investigates the interrelationship between the length and complexity of sentential constituents and their positions in the sentence. Results show that length and complexity affect sentential constituent ordering. Within the sentence, the longest and most complex constituents tend to occur in the final position, and the relatively shorter and less complex constituents tend to be in the initial position. However, for sentential constituents in other positions, the length-complexity-position relationship appears to be random. Possible explanations for the findings are provided from different perspectives, especially from the distribution of given and new information.
Linguaculture | 2012 | vol. 2012 | issue 2 | 101-115
EN
Starting from a parallel corpus of general use texts, this article investigates what kind of regularities are discernible in the formation of the terms used in the Romanian language of information and communication technology (ICT). After a brief presentation of the corpus that supported this research, the article begins with an introduction to the distinction made between the processes of primary and secondary term formation and considers it in relation to the concepts of translation regularities and norms as theorized by Gideon Toury. Starting from a concise examination of the sentence-based turn in translation studies, the final part of the article analyzes the main strategies used in the secondary formation of Romanian ICT terms (borrowing, loan translation, hybrid formation, and translation proper) and attempts to determine which of them could be seen as regularities that ampler studies could confirm as norms in this process.
Noun distribution in natural languages

EN
Previous research on word class distribution claimed that 37% of word tokens are nouns, suggesting that there might be a certain regularity in the proportion of nouns across human languages. To explore this possibility, we examined the proportion of nouns and four other word classes within British and American English, and across seven languages, in different word-frequency bands. Results indicated that the noun proportion is about 37% or higher, and that it increases with word rarity. Among frequent words, the noun proportion increases as minor word classes decrease, whereas among rare words it remains at a stable level.
EN
Lexical Means in Communicating Emotion in Suicide Notes - on the Basis of the Polish Corpus of Suicide Notes
The Polish Corpus of Suicide Notes (PCSN) is a relatively large set of authentic suicide notes that are linguistically annotated on several levels. In order to identify features characteristic of this genre, we compared PCSN with a collected subcorpus of counterfeit suicide notes. In this paper we focus on the lexical means of expressing emotions. Our goal was to analyse ways of expressing emotions in this specific genre. Our initial list of lexical markers was based on Markowski’s list of the lexis common to different genres. The list was then expanded with the help of plWordNet 2.0, a lexico-semantic network. The expansion was based on manually selected noun and verb hypernymy branches, chosen according to their correspondence to the elements of the initial list. For words from the extended list, a quantitative analysis was performed for both authentic and fake suicide notes. We have also analysed the use of the lexical markers of emotions, feelings and emotional states, as well as emotion operators, and ways of expressing personal evaluation, affection and hate.
Linguistica Pragensia | 2021 | vol. 31 | issue 2 | 137-160
EN
Single postpositive adjectives as a minor type of noun postmodification in English are surveyed using a corpus sample to assess their retrievability, to provide an overview of the morphematic types, and their register distribution. Applying an ancillary test of positional mobility to characterize the postpositive occurrences, four broad groups are delimited. Postmodification by a-adjectives and adjectives in terminological compounds is infrequent, whereas the majority of the sample is constituted by adjectives in -able/-ible and in -ed, whose post-head position involves changes in semantic meaning, as well as occurrences which cannot be accounted for by the constraints stipulated in grammars. The frequency and patterning of available, responsible or necessary call for a re-evaluation of the function of postposition, encompassing all frequent forms and taking into account reference and other textual factors.
PL
The article describes a method for extracting Russian verbal-nominative structures from electronic texts using grammatical annotation and regular-expression syntax. The stages of the retrieval process are outlined. The list of extracted structures can subsequently be checked for the presence of reproducible units (phrasemes).
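
The general idea of combining grammatical annotation with regular expressions might be sketched roughly as follows. This is only an illustration, not the procedure from the article: the `form/TAG` token format, the tag names (VERB, NOUN, ADJ, and so on), the sample sentence, and the function name `extract_verb_noun` are all assumptions made for the example.

```python
import re

# Hypothetical annotated text: every token is written as form/TAG.
# The tag names are illustrative and are not the tagset used in the article.
TAGGED = (
    "он/PRON принимает/VERB участие/NOUN в/PREP работе/NOUN "
    "и/CONJ оказывает/VERB большую/ADJ помощь/NOUN"
)

# A verb, optionally followed by adjectives, followed by a noun.
PATTERN = re.compile(
    r"(\S+)/VERB"           # verb form
    r"(?:\s+\S+/ADJ)*"      # zero or more adjectives in between
    r"\s+(\S+)/NOUN(?!\S)"  # noun form; the tag must end the token
)

def extract_verb_noun(tagged_text):
    """Return (verb, noun) pairs for each verb + (adjectives) + noun sequence."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(tagged_text)]

candidates = extract_verb_noun(TAGGED)
```

On the sample sentence this yields the candidate pairs принимает + участие and оказывает + помощь; such a list would still need manual checking to decide which pairs are reproducible units.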