Probabilistic data fusion on a large document collection

David Lillis; Fergus Toolan; Rem Collier; John Dunnion

doi:10.1007/s10462-007-9037-2

Publisher: Springer Journals
Copyright: Copyright © 2007 by Springer Science+Business Media B.V.
Subject: Computer Science; Complexity; Computer Science, general; Artificial Intelligence (incl. Robotics)
ISSN: 0269-2821
eISSN: 1573-7462
DOI: 10.1007/s10462-007-9037-2

Abstract

Data fusion is the process of combining the output of a number of Information Retrieval (IR) algorithms into a single result set, to achieve greater retrieval performance. ProbFuse is a data fusion algorithm that uses the history of the underlying IR algorithms to estimate the probability that subsequent result sets include relevant documents in particular positions. It has been shown to outperform CombMNZ, the standard data fusion algorithm against which to compare performance, in a number of previous experiments. This paper builds upon this previous work and applies probFuse to the much larger Web Track document collection from the 2004 Text REtrieval Conference. The performance of probFuse is compared against that of CombMNZ using a number of evaluation measures and is shown to achieve substantial performance improvements.
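
The two fusion ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. `comb_mnz` follows the usual definition of CombMNZ (sum of normalized scores multiplied by the number of systems that returned the document); the min-max normalization is an assumption, since the abstract does not say which normalization was used. `segment_probabilities` and `position_based_fuse` only approximate the position-based idea behind probFuse: splitting each ranking into equal segments, estimating relevance per segment from training queries, and dividing by the segment number are all assumptions made for illustration, and every function and parameter name is hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one system's scores into [0, 1] (normalization choice is an assumption)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_sets):
    """CombMNZ: summed normalized score times the number of systems returning the doc."""
    total = defaultdict(float)
    hits = defaultdict(int)
    for scores in result_sets:
        for doc, s in min_max_normalize(scores).items():
            total[doc] += s
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


def segment_probabilities(training_runs, num_segments):
    """Estimate, for one system, the fraction of relevant documents in each segment.

    training_runs is a list of (ranking, relevant_set) pairs from past queries.
    Assumes at least one training run.
    """
    probs = [0.0] * num_segments
    for ranking, relevant in training_runs:
        size = -(-len(ranking) // num_segments)  # ceiling division
        for k in range(num_segments):
            segment = ranking[k * size:(k + 1) * size]
            if segment:
                probs[k] += sum(doc in relevant for doc in segment) / len(segment)
    return [p / len(training_runs) for p in probs]


def position_based_fuse(rankings, system_probs, num_segments):
    """Score each doc by summing, over systems, the probability of its segment
    divided by the 1-based segment number (an assumed weighting)."""
    fused = defaultdict(float)
    for ranking, probs in zip(rankings, system_probs):
        size = -(-len(ranking) // num_segments)
        for pos, doc in enumerate(ranking):
            k = pos // size
            fused[doc] += probs[k] / (k + 1)
    return sorted(fused, key=lambda d: (-fused[d], d))
```

The design difference is that `comb_mnz` relies on the scores each system reports for the current query, while `position_based_fuse` ignores scores and relies only on rank positions plus per-system statistics gathered from earlier queries.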

Journal

Artificial Intelligence Review – Springer Journals

Published: Sep 14, 2007