Filtering of Large Numbers of Unstructured Text Documents by the Developed Tool TEA
Jan Žižka (3) and Aleš Bourek (4)
(3) Department of Biophysics, Faculty of Medicine, Masaryk University in Brno, Joštova 10, 662 43 Brno, Czech Republic
(4) Department of Information Technologies, Faculty of Informatics, Masaryk University in Brno, Botanická 68a, 602 00 Brno, Czech Republic
Abstract
This paper describes TEA (TExt Analyzer), a text-document-filtering software tool originally developed to help physicians
select relevant items from large numbers of unstructured medical text documents obtained from available Internet services. TEA
learns which documents are interesting and relevant to an individual user, essentially by means of the naïve Bayes algorithm. In addition, TEA provides a number of further functions that can improve its
classification accuracy, allow more specific document selection for individual users, and enable users to work with dictionaries
generated from the analyzed documents. TEA's learning process is based on a set of positive and negative examples
of text documents, labeled by users interested in documents on certain, usually very specific, topics.
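The learning scheme outlined in the abstract can be illustrated with a minimal sketch of a naïve Bayes text filter trained on user-labeled positive and negative examples. This is not TEA's implementation: the multinomial word-count model, the add-one (Laplace) smoothing, the simple tokenizer, and all function names (`tokenize`, `train`, `log_score`, `is_relevant`) are assumptions made for illustration only, and the sample documents are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Fit a multinomial naive Bayes model (illustrative, not TEA's code).

    examples: list of (text, label) pairs, where label is True for a
    document the user marked as relevant and False otherwise.
    Assumes at least one example of each label is present.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def log_score(model, text, label):
    """Return log P(label) + sum of log P(word | label) over known words."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    total_words = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / total_docs)
    for word in tokenize(text):
        if word not in vocabulary:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing keeps unseen (word, label) pairs from zeroing the score.
        smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
        score += math.log(smoothed)
    return score


def is_relevant(model, text):
    """Classify a document as relevant if the positive class scores higher."""
    return log_score(model, text, True) > log_score(model, text, False)


# Invented training examples standing in for one user's labeled documents.
examples = [
    ("infertility treatment outcomes in women", True),
    ("ovarian stimulation and infertility therapy", True),
    ("football league results and scores", False),
    ("stock market prices fell today", False),
]
model = train(examples)
print(is_relevant(model, "new infertility therapy"))
print(is_relevant(model, "football scores today"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, which matters for long documents. Training one model per user mirrors the per-user labeling described above.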