
Results found: 2

Search results

EN
The article identifies three sources of corpus engineering: (a) the theoretical and descriptive achievements of structural linguistics, (b) the formal apparatus of generative theories, and (c) the development of computational tools. Over recent decades, Polish has been satisfactorily described in terms of both morphology and syntax. On that basis, two corpus search engines have recently been designed: Poliqarp, to annotate Polish text corpora, and Holmes, to disambiguate them morphologically. Nevertheless, the prospects for corpus engineering in Poland do not look promising. Unlike in neighbouring countries, few people work in computational linguistics. The article expresses the author's hope that young Polish linguists may find the job attractive, and not only intellectually.
EN
Managing large text corpora requires sophisticated computational tools. For highly inflecting languages like Polish, homonymy is a challenge that computational linguists have to face: in Polish texts, 42 words in every 100 are grammatically ambiguous. The search engine 'Holmes', designed by Michal Rudolf, works as a disambiguator rather than a tagger. It operates on texts that have already been morphologically annotated by dedicated programs. After the user types in a query, 'Holmes' examines the set of tags for each word, rejecting as many incorrect interpretations as possible. 'Holmes' uses linguistic rather than statistical methods of disambiguation. It is based on a number of rules formalizing various contextual restrictions on words. Query results are available online.
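The rule-based rejection of readings described in this abstract can be illustrated with a minimal sketch. This is not the implementation of 'Holmes': the simplified tag format, the single contextual rule, and the names `Tag`, `prep_case_agreement`, and `disambiguate` are all hypothetical and chosen only for illustration. The example assumes a word that already carries a set of candidate tags and a rule that discards noun readings whose case is not governed by the preceding preposition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical, simplified morphological tag (not the Holmes tagset)."""
    pos: str        # part of speech, e.g. "prep" or "subst"
    case: str = ""  # grammatical case; for a preposition, the case it governs


def prep_case_agreement(sentence, i):
    """Contextual rule: return the noun readings of word i to reject.

    A noun reading is rejected when the previous word is unambiguously
    a preposition and none of its readings governs the noun's case.
    """
    if i == 0:
        return set()
    _, prev_tags = sentence[i - 1]
    if not prev_tags or any(t.pos != "prep" for t in prev_tags):
        return set()  # previous word is not certainly a preposition
    governed = {t.case for t in prev_tags}
    _, tags = sentence[i]
    return {t for t in tags if t.pos == "subst" and t.case not in governed}


def disambiguate(sentence, rules):
    """Apply each rule to each word, removing the readings it rejects."""
    result = [(word, set(tags)) for word, tags in sentence]
    for i in range(len(result)):
        for rule in rules:
            word, tags = result[i]
            remaining = tags - rule(result, i)
            if remaining:  # never leave a word without any reading
                result[i] = (word, remaining)
    return result


# "do domu" ("to the house"): "do" governs the genitive, while "domu"
# is ambiguous between genitive and locative singular of "dom".
sentence = [
    ("do", {Tag("prep", "gen")}),
    ("domu", {Tag("subst", "gen"), Tag("subst", "loc")}),
]
resolved = disambiguate(sentence, [prep_case_agreement])
```

In this sketch only the genitive reading of "domu" survives. The rule removes readings instead of choosing one, so a word may remain ambiguous when no rule applies, which matches the abstract's description of rejecting "as many" interpretations "as possible".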