Data mining is typically applied to large databases of highly structured information in order to discover new knowledge. In
businesses and institutions, the amount of information existing in repositories of text documents usually rivals or surpasses
the amount found in relational databases. Though the amount of potentially valuable knowledge contained in document collections
can be great, these collections are often difficult to analyze. Therefore, it is important to develop methods to efficiently discover knowledge
embedded in these document repositories. In this paper we describe an approach for mining knowledge from text collections
by applying data mining techniques to metadata records generated via automated text categorization. By controlling the set
of metadata fields as well as the set of assigned categories, we can customize the knowledge discovery task to address specific
questions. As an example, we apply the approach to a large collection of product reviews and evaluate the performance of the
knowledge discovery.
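
To make the two-stage idea concrete, the following is a minimal sketch, not the system evaluated in this paper. It assumes a toy keyword lookup in place of a trained text categorizer and simple pairwise co-occurrence counting in place of a full data mining algorithm. The field names, categories, keyword lists, thresholds, and sample reviews are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical metadata fields, categories, and keywords (assumptions for
# illustration). A real system would use a trained text categorizer here.
FIELDS = {
    "product": {"camera": ["camera", "lens"], "phone": ["phone", "handset"]},
    "sentiment": {"positive": ["great", "excellent"], "negative": ["poor", "broken"]},
}


def categorize(text):
    """Build one metadata record: per field, the first category with a matching keyword."""
    words = set(text.lower().split())
    record = {}
    for field, categories in FIELDS.items():
        for category, keywords in categories.items():
            if words & set(keywords):
                record[field] = category
                break
    return record


def mine_rules(records, min_support=0.25, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for pairs of field values that co-occur."""
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        items = sorted(record.items())  # (field, category) pairs in a stable order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Consider the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical sample reviews, not data from the paper.
reviews = [
    "The camera is great",
    "Excellent lens on this camera",
    "The phone arrived broken",
    "Poor handset overall",
]
records = [categorize(review) for review in reviews]
rules = mine_rules(records)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f} confidence={confidence:.2f}")
```

Changing the entries in `FIELDS` changes which questions the mining step can answer, which mirrors the customization by metadata fields and assigned categories described above.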