IR-953: (2013) Dori-Hacohen, S. and Allan, J., "Detecting Controversy on the Web," Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM), San Francisco, CA, Oct. 27-Nov. 1, 2013, pp. 1845-1848.

Abstract

A useful feature to facilitate critical literacy would alert users when they are reading a controversial web page. This requires solving a binary classification problem: does a given web page discuss a controversial topic? We explore the feasibility of solving the problem by treating it as supervised k-nearest-neighbor classification. Our approach (1) maps a webpage to a set of neighboring Wikipedia articles which were labeled on a controversiality metric; (2) coalesces those labels into an estimate of the webpage’s controversiality; and finally (3) converts the estimate to a binary value using a threshold. We demonstrate the applicability of our approach by validating it on a set of webpages drawn from seed queries. We show absolute gains of 22% in F0.5 on our test set over a sentiment-based approach, highlighting that detecting controversy is more complex than simply detecting opinions.
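
The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how neighboring articles are retrieved, how their labels are coalesced, or what threshold is used, so the `WIKI_SCORES` table, the neighbor list, the `max`/`mean` aggregators, and the `0.5` threshold are all hypothetical choices made for illustration. The `f_beta` helper shows the standard F-beta formula behind the reported F0.5 measure, which weights precision more heavily than recall.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical controversy scores for Wikipedia articles (illustrative values only).
WIKI_SCORES: Dict[str, float] = {
    "Abortion": 0.9,
    "Climate change": 0.8,
    "Banana": 0.1,
    "Photosynthesis": 0.05,
}


def neighbor_scores(neighbors: Sequence[str]) -> list:
    """Step 1 (assumed already done upstream): look up labels of the
    Wikipedia articles that a webpage was mapped to."""
    return [WIKI_SCORES[title] for title in neighbors if title in WIKI_SCORES]


def is_controversial(
    neighbors: Sequence[str],
    aggregate: Callable[[Sequence[float]], float] = max,
    threshold: float = 0.5,
) -> bool:
    """Steps 2 and 3: coalesce neighbor labels into one estimate,
    then convert the estimate to a binary decision."""
    scores = neighbor_scores(neighbors)
    if not scores:
        return False  # assumption: no labeled neighbors means "not controversial"
    return aggregate(scores) >= threshold


def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Standard F-beta from confusion counts; beta < 1 favors precision."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


if __name__ == "__main__":
    page_neighbors = ["Abortion", "Banana", "Photosynthesis"]
    print(is_controversial(page_neighbors))                  # max aggregation
    print(is_controversial(page_neighbors, aggregate=mean))  # mean aggregation
    print(round(f_beta(tp=8, fp=2, fn=8), 3))
```

The choice of aggregator matters: with `max`, a single highly controversial neighbor is enough to flag the page, while `mean` requires the neighbors to be controversial on average.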
