The paper presents a system for automatic content extraction from mammogram reports written in Polish. The system combines general information extraction (IE) techniques with external post-processing that structures the results. The paper characterizes this specific type of text, describes the results obtained, and briefly analyzes the advantages and disadvantages of shallow text processing.
M. Marciniak, Instytut Podstaw Informatyki PAN, ul. J. K. Ordona 21, 01-237 Warszawa, Poland