IR-929: (2013) Croft, W. B. and Dang, V., "Term Level Search Result Diversification," Proceedings of the 36th Annual ACM SIGIR Conference (SIGIR 2013), Dublin, Ireland, July 28-August 1, 2013, pp. 603-612. [View bibtex]

Abstract

Current approaches for search result diversification have been categorized as either implicit or explicit. The implicit approach assumes each document represents its own topic, and promotes diversity by selecting documents for different topics based on the difference of their vocabulary. On the other hand, the explicit approach models the set of query topics, or aspects. While the former approach is generally less effective, the latter usually depends on a manually created description of the query aspects, the automatic construction of which has proven difficult. This paper introduces a new approach: term-level diversification. Instead of modeling the set of query aspects, which are typically represented as coherent groups of terms, our approach uses terms without the grouping. Our results on the ClueWeb collection show that the grouping of topic terms provides very little benefit to diversification compared to simply using the terms themselves. Consequently, we demonstrate that term-level diversification, with topic terms identified automatically from the search results using a simple greedy algorithm, significantly outperforms methods that attempt to create a full topic structure for diversification.

Browse the full CIIR Publications Database