Automatic or semi-automatic categorization of items (e.g. documents) into a taxonomy is an important and challenging machine-learning
task. In this paper, we present a module for semi-automatic categorization of video-recorded lectures. Properly categorized
lectures give users a better browsing experience, which lets them reach the desired content more efficiently.
Our categorizer combines information found in texts associated with lectures and information extracted from various links
between lectures in a unified machine-learning framework. Taking the links into account in addition to the texts increases
classification accuracy by 12–20%.
Keywords: categorization, classification, machine learning, multi-modal data mining, multimedia, video, VideoLectures.net