KORPUS SLOVENSKÉHO COPYWRITINGU
Corpus of Slovak copywriting
The Corpus of Copywriting Texts (cw-2014-all), a new specialized sub-corpus of the Slovak National Corpus with free public access, was created in 2014 at the Department of the Slovak National Corpus of the Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences. It consists of 1 648 229 tokens and 54 617 unique lemmas. The corpus contains 1 441 pages from 339 websites of commercial companies and public institutions focused on advertising and self-presentation. It is lemmatized and morphologically annotated. It comprises three sub-corpora: copywriting texts of larger commercial companies (cw-2014-v), copywriting texts of smaller commercial companies (cw-2014-m), and copywriting texts of public institutions (cw-2014-inst). The article describes the methodology of corpus creation and presents the corpus's basic quantitative and qualitative characteristics.
71 – 78