EN
The linking of Estonian language proficiency to the CEFR reference levels and to different educational stages is not grounded in research but in deep-rooted perceptions. More reliable data can be obtained by comparing a native speaker's usage patterns with the morphological and lexical preferences characteristic of speakers at each proficiency level. This requires tools for automatic text processing (most of which have been developed on the basis of English) and various data analysis techniques. The article introduces Cluster Catcher, an original computer program developed at Tallinn University for finding usage patterns in written Estonian texts.