Lecture Notes in Computer Science, 2007, Volume 4519/2007, 503-517, DOI: 10.1007/978-3-540-72667-8_36

What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content

Sören Auer and Jens Lehmann

Abstract

Wikis are an established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating by far the largest encyclopedia on the basis of a wiki alone. Recently, several approaches have been proposed for extending wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are even used, albeit unconsciously, by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as answering the title question). We analyze the quality of the extracted content and propose strategies for improving it with only minor modifications to the wiki systems currently in use.
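
To make the idea of extracting information from template instances more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a simplified template syntax of the form {{Name | key = value | ...}} and ignores nested templates and wiki links containing pipes. The function name extract_triples, the predicate usesTemplate, and the sample page are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical wiki markup; the page, field names and values are placeholders.
SAMPLE = """{{Infobox Town
| name = Example Town
| country = Exampleland
| population = 100000
}}"""


def extract_triples(page_title, wikitext):
    """Map the first template instance on a page to (subject, predicate, object) triples.

    Assumption: the template contains no nested templates and no '|' inside values.
    """
    match = re.search(r"\{\{(.*?)\}\}", wikitext, re.DOTALL)
    if match is None:
        return []

    # The first segment is the template name, the rest are "key = value" pairs.
    parts = [part.strip() for part in match.group(1).split("|")]
    triples = [(page_title, "usesTemplate", parts[0])]

    for part in parts[1:]:
        if "=" not in part:
            continue  # skip positional parameters in this sketch
        key, value = part.split("=", 1)
        key, value = key.strip(), value.strip()
        if key and value:
            triples.append((page_title, key, value))
    return triples


triples = extract_triples("Example_Town", SAMPLE)
for triple in triples:
    print(triple)
```

Each attribute of the template instance becomes one statement about the page, so two pages that share an attribute value can later be found with a single query over the extracted triples.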
