Longitudinal Analytics of Web Archive Data


Crowdsourced Entity Markup

The paper "Crowdsourced Entity Markup" by Lili Jiang, Yafang Wang, Johannes Hoffart, and Gerhard Weikum has been accepted at the Workshop on Crowdsourcing the Semantic Web (CrowdSem 2013), held in conjunction with ISWC 2013.

Entities such as people, places, and products exist in knowledge bases and linked data on the one hand, and in web pages, news articles, and social media on the other. Entity markup, such as Named Entity Recognition and Disambiguation (NERD), is the essential means of adding semantic value to unstructured web content, thereby enabling links between unstructured and structured data and knowledge collections. A major challenge in this endeavor lies in the dynamics of digital content about the world, with new entities emerging all the time. In this paper, we propose a crowdsourced framework for NERD that specifically addresses the challenge of emerging entities in social media. Our approach combines NERD techniques with the detection of entity alias names and with co-reference resolution in texts. We propose a crowdsourcing system based on a linking game for this combined task, and we report experimental insights and lessons learned with this approach.

CrowdSem 2013 homepage