Transaction logs from online search engines are valuable for two reasons: First, they provide insight into human information-seeking
behavior. Second, log data can be used to train user models, which can then be applied to improve retrieval systems. This
article presents a study of logs from PubMed®, the public gateway to the MEDLINE® database of bibliographic records from the
medical and biomedical primary literature. Unlike most previous studies on general
Web search, our work examines user activities with a highly specialized search engine. We encode user actions as string sequences
and model these sequences using n-gram language models. The models are evaluated in terms of perplexity and in a sequence
prediction task. They help us better understand how PubMed users search for information and provide a basis for improving
users’ search experience.
Keywords Search behavior - Query log analysis
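
As an illustration of the approach summarized above, the following minimal sketch trains a bigram model over action strings, computes perplexity, and predicts the next action. It is not the study's implementation. The one-letter action codes (Q for a query, R for viewing a record, N for paging to the next result page), the boundary symbols, and the add-one smoothing are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict

BOS, EOS = "^", "$"  # hypothetical session-boundary symbols

def train_bigram(sessions):
    """Count action bigrams over sessions padded with boundary symbols."""
    counts = defaultdict(Counter)
    vocab = {EOS}  # symbols that can be predicted (BOS is never predicted)
    for s in sessions:
        padded = BOS + s + EOS
        vocab.update(s)
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts, sorted(vocab)

def prob(counts, vocab, prev, cur):
    """Add-one smoothed estimate of P(cur | prev)."""
    total = sum(counts[prev].values())
    return (counts[prev][cur] + 1) / (total + len(vocab))

def perplexity(counts, vocab, sessions):
    """Per-symbol perplexity (base 2) of the model on the given sessions."""
    log_sum, n = 0.0, 0
    for s in sessions:
        padded = BOS + s + EOS
        for prev, cur in zip(padded, padded[1:]):
            log_sum += math.log2(prob(counts, vocab, prev, cur))
            n += 1
    return 2 ** (-log_sum / n)

def predict_next(counts, vocab, prev):
    """Most probable next action given the previous one."""
    return max(vocab, key=lambda a: prob(counts, vocab, prev, a))

# Toy sessions encoded with the hypothetical action codes.
sessions = ["QQR", "QRN", "QQRR"]
counts, vocab = train_bigram(sessions)
print(perplexity(counts, vocab, ["QRR"]))
print(predict_next(counts, vocab, "Q"))
```

Lower perplexity on held-out sessions indicates that the model assigns higher probability to the action sequences users actually produce. Higher-order n-grams extend the same idea by conditioning on more than one preceding action.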