Search results

1

Víceslovné lexémy v syntaktickém kontextu

100%

Rosen A., Skoumalová H., Znamenáček J.

Studie z aplikované lingvistiky - Studies in Applied Linguistics | 2020 | vol. 11 | issue 2 | 63-84

EN

We start with the assumption that (i) a corpus represents the use of language, i.e. linguistic performance, (ii) a rule-based grammar represents language as a system, i.e. linguistic competence, and (iii) corpus annotation represents the interface between the two. To detect and diagnose mismatches between language use and the language system, we use a constraint-based grammar run as a constraint solver on texts tagged and dependency-parsed by stochastic tools. The texts also have MWEs (multi-word expressions) identified and transformed into a constituency-based format before the grammar is applied. We describe the role and results of the grammar, and its use to check texts annotated with morphosyntactic categories, syntactic structure and information about the status of relevant expressions as MWEs. The grammar also employs lexical resources such as a valency lexicon and a database of MWEs to make the checking more accurate and the annotation more informative. The results are represented as typed feature structures where MWE-related information can be shared by lexical and phrasal nodes. This allows for the annotation of MWEs as lexical units, independently of their analysis in terms of syntactic structure. Focusing on the interplay of MWEs with their syntactic context, we analyse a number of representative examples, pointing out the pros and cons of specific solutions and of the approach as a whole.

2

Inférences textuelles et constructions à verbes supports

100%

Banyś W.

Neophilologica | 2022 | vol. 34 | 1-37

EN

To understand natural language, automatic systems must be able to determine whether something is inferred in a text and, if so, how it is inferred and from what. The linguistically determined implications imposed by a predicate on its propositional arguments, appearing in a text as textual inferences, are, together with other elements such as knowledge of the world, speakers’ assumptions about language usage, situational stereotypes and implicatures, among the elements a system needs in order to be considered as understanding natural language. In this paper, Wiesław Banyś focuses on an important part of textual inferences that has received hardly any scholarly attention so far, namely the relations between a particular category of textual inferences, phrasal implicatives in the sense of Karttunen (2012), and a particular category of multi-word expressions, support verb constructions. Banyś first gives a brief introduction to the current state of the art in the description of the two general categories: inferences, including one particular type, paraphrases (sec. 1), and multi-word expressions, including one particular type, support verb constructions (sec. 2). This makes it possible to present the relations between support verbs and paraphrases (sec. 3). He then moves on to a discussion of implicatives, including phrasal implicatives (sec. 4), and sketches a description of some types of support verb constructions from the point of view of their implicative power (sec. 5). The analyses presented are to be continued and extended to all constructions of the type analysed; they are also part of a much more general project. They constitute the beginning of a systematic implementation, to be carried out on French and Polish material in correlation with English, of the combinatorics of truth values (“implicative signatures”) of predicates, whether verbal, adjectival or nominal.

FR

Pour comprendre le langage naturel, les systèmes automatiques doivent avoir la capacité et la possibilité de savoir ce qui est inféré, et si oui, comment cela est inféré, et à partir de quoi dans un texte. Les implications linguistiquement déterminées imposées par un prédicat sur ses arguments propositionnels, apparaissant dans un texte comme des inférences textuelles, sont, avec d'autres éléments, tels que la connaissance du monde, les hypothèses des locuteurs sur l'utilisation de la langue, les stéréotypes situationnels, les implicatures, etc. l'un des éléments nécessaires pour qu'un système soit considéré comme comprenant le langage naturel. Dans cet article, nous nous concentrons sur l'une des parties importantes des inférences textuelles, qui n'a pratiquement pas été étudiée, à savoir les relations entre une catégorie particulière d'inférences textuelles constituée d'implicatifs phrastiques au sens de Karttunen (2012) et une catégorie particulière d'expressions polylexicales constituée de verbes supports. Nous donnons d'abord une brève introduction à l'état actuel de la description des deux catégories générales : les inférences, dont un type particulier d'inférences : les paraphrases (sec. 1), et les expressions polylexicales avec un type particulier de ces constructions : les verbes supports (sec. 2), ce qui nous permet de présenter les relations entre les verbes supports et les paraphrases (sec. 3). Nous passons ensuite à une discussion sur les implicatifs, y compris les implicatifs phrastiques (sec. 4), au sens de Karttunen (2012) et nous présentons une esquisse de la description de certains types de constructions à verbes supports du point de vue de leur pouvoir implicatif (sec. 5). Les analyses du type présenté sont à poursuivre et doivent être étendues à toutes les constructions du type analysé et font en même temps partie d'un projet beaucoup plus général. Elles sont un début d'une implémentation systématique à faire sur les matériaux français et polonais en corrélation avec l'anglais de la combinatoire des valeurs de vérité, "signatures implicatives", des prédicats, qu'ils soient verbaux, adjectivaux ou nominaux.

3

Web crawling dla celów lingwistycznych. Wybrane aspekty gromadzenia i analizy danych tekstowych na przykładzie rosyjskojęzycznych newsów internetowych

80%

Borysowski D.

Prace Językoznawcze | 2021 | vol. XXIII/3 | 87–104

PL

Autor niniejszego artykułu zgromadził ok. 2,7 mln rosyjskojęzycznych newsów internetowych. Zasadnicze cele tego tekstu stanowią: omówienie pojęcia web crawlingu w odniesieniu do pozyskiwania internetowych danych tekstowych, omówienie kwestii strukturyzacji takich danych w nieanotowanych korpusach tekstowych, a także przedstawienie wybranych aspektów analizy danych strukturyzowanych w ten sposób. Autor rozpatruje newsy internetowe jako połączenie tekstu zasadniczego oraz identyfikujących i charakteryzujących go metadanych (wyróżnionych podczas automatycznej ich ekscerpcji ze stron internetowych). Rozdział newsów na tekst zasadniczy i metadane stwarza możliwość przeprowadzenia ich analizy z dwóch perspektyw – tekstowej oraz metainformacyjnej (dodatkowo, np. w odniesieniu do badań chronologizacyjnych, z perspektywy uwzględniającej oba te poziomy). Zarys możliwych badań lingwistycznych zgromadzonego materiału uzupełnia autor ewaluacją wybranych wielowyrazowych całostek, wydobytych z tych tekstów z wykorzystaniem delimitacyjnej funkcji cudzysłowu.

EN

The author of the article collected approximately 2.7 million Russian-language online news items. The main objectives of the article are to discuss the concept of web crawling in relation to the acquisition of online text data, to address issues related to structuring such data in unannotated text corpora, and to present selected aspects of analyzing data structured in this way. The author treats an online news item as a combination of the main text and the metadata that identify and characterize it (separated out during automatic extraction from websites). Dividing news items into main text and metadata makes it possible to analyze them from two perspectives, textual and meta-informational (and, additionally, from a perspective that takes both levels into account, for example in chronological studies). The outline of possible linguistic research on the collected material is supplemented with an evaluation of selected multi-word units extracted from these texts using the delimiting function of quotation marks.

4

Exploring Phraseology in EU Legal Discourse

71%

Hrežo V.

Language. Culture. Politics. International Journal | 2020 | vol. 1 | 29-52

EN

The aim of the present article is to showcase EU legal discourse as a unique phenomenon of supranational specialized communication and, on the basis of authentic data analysis, to identify specific lexical items, with a focus on multi-word expressions and their structure and function in the analysed text. The analysis examines the English-language version of a selected EU Directive using a mixed-method approach. The results indicate that the Directive contains a large proportion of multi-word expressions distinctive of legal language, and that these follow specific distributional patterns across the different structural and functional categories of lexical bundles. The article also gives an overview of contemporary research on institutional-legal discourse and translation.

PL

Celem niniejszego artykułu jest ukazanie dyskursu prawniczego, jaki ma miejsce w Unii Europejskiej, jako niezwykłego zjawiska międzynarodowej komunikacji specjalistycznej i, na podstawie analizy rzeczywistych danych, wyszczególnienie specyficznych jednostek leksykalnych, kładąc przy tym nacisk na wyrażenia wieloczłonowe, w trakcie badań struktury i funkcji analizowanego tekstu. Ukazana analiza obejmuje badanie wybranej anglojęzycznej Dyrektywy Unii Europejskiej, przy użyciu metody podejścia mieszanego. Wyniki przeprowadzonych badań wskazują, że analizowana pod kątem struktury i funkcji Dyrektywa unijna zawiera znaczną ilość wyrażeń wieloczłonowych charakterystycznych dla języka prawniczego, które nie zakłócają specyficznego układu syntaktycznego rozważanego z punktu widzenia występowania różnorodnych strukturalnych i funkcjonalnych kategorii leksykalnych. Ponadto, artykuł umożliwia wgląd we współczesne badania naukowe, jakie miały miejsce w sferze formalnego dyskursu prawniczego i translacji.
