The paper presents a simple semi-automatic procedure for detecting neologisms: a trivial Python script processes a text file with the help of a Czech morphological tagger and extracts every word the tagger does not recognize as a potential neologism. Because the resulting list of candidates has to be checked by a human, the procedure is only semi-automatic. The method was applied to a set of texts that were also analyzed in a more traditional way, by the “reading and marking” technique (i.e. the current practice). A comparison of the two methods revealed that the semi-automatic procedure clearly outperforms the current practice in both speed and efficiency.
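The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the script used in the paper: the function `neologism_candidates`, the `is_known` callback, and the toy lexicon are all hypothetical names introduced here. In practice `is_known` would wrap a call to a real Czech morphological tagger, which, unlike the toy lexicon, also recognizes inflected forms.

```python
import re
from collections import Counter
from typing import Callable

# Unicode-aware token pattern: runs of letters only, so Czech
# diacritics are kept and digits/punctuation are skipped.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def neologism_candidates(text: str, is_known: Callable[[str], bool]) -> Counter:
    """Count the tokens that the recognizer rejects.

    `is_known` is a hypothetical stand-in for the morphological tagger:
    it returns True when the tagger can analyze the word form.
    """
    candidates = Counter()
    for match in TOKEN_RE.finditer(text):
        token = match.group(0)
        # Try the lowercased form too, so that sentence-initial
        # capitalization alone does not produce a candidate.
        if not is_known(token) and not is_known(token.lower()):
            candidates[token.lower()] += 1
    return candidates


# Toy lexicon standing in for the tagger (an assumption for this sketch).
TOY_LEXICON = {"vláda", "schválila", "nový", "zákon", "o"}


def toy_is_known(word: str) -> bool:
    return word in TOY_LEXICON


sample = "Vláda schválila nový zákon o kryptoměnách"
result = neologism_candidates(sample, toy_is_known)

# Candidates still need manual review: the list may also contain
# typos, proper names, or foreign words.
for word, count in result.most_common():
    print(word, count)
```

The output is only a candidate list, sorted by frequency; the human check described in the abstract remains the step that decides which items are genuine neologisms.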