With the proliferation of blogs, or weblogs, in recent years, information in the blogosphere is becoming increasingly
difficult to access and retrieve. Previous studies have focused on analyzing personal blogs, but few have looked at corporate
blogs, whose numbers are rising dramatically. In this paper, we use probabilistic techniques to detect keywords from
corporate blogs with respect to certain topics. We then demonstrate how this method can represent the blogosphere in terms of
topics with measurable keywords, making it possible to track popular conversations and topics in the blogosphere. By applying a probabilistic
approach, we can improve information retrieval in blog search and keyword detection, and provide an analytical foundation
for the future of corporate blog search and mining.
Keywords Weblog search - blog mining - probabilistic latent semantic analysis - corporate blog - business blog - web mining