Approximate distributed top-k queries

Boaz Patt-Shamir; Allon Shafrir

doi:10.1007/s00446-008-0055-3

Publisher: Springer Journals
Copyright: © 2008 by Springer-Verlag
Subject: Computer Science; Theory of Computation; Software Engineering/Programming and Operating Systems; Computer Systems Organization and Communication Networks; Computer Hardware; Computer Communication Networks
ISSN: 0178-2770
eISSN: 1432-0452
DOI: 10.1007/s00446-008-0055-3

Abstract

We consider a distributed system where each node keeps a local count for items (similar to elections where nodes are ballot boxes and items are candidates). A top-k query in such a system asks for the k items whose global counts, summed across all nodes in the system, are the largest. In this paper, we present a Monte Carlo algorithm that outputs, with high probability, a set of k candidates that approximates the top-k items. The algorithm is motivated by sensor networks in that it focuses on reducing the individual communication complexity. In contrast to previous algorithms, its communication complexity depends only on the global scores and not on the partition of scores among nodes. If the number of nodes is large, our algorithm dramatically reduces the communication complexity when compared with deterministic algorithms. We show that the complexity of our algorithm is close to a lower bound on the cell-probe complexity of any non-interactive top-k approximation algorithm. We show that for some natural global distributions (such as the Geometric or Zipf distributions), our algorithm needs only a polylogarithmic number of communication bits per node.
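
To make the query itself concrete, the following is a minimal Python sketch of the exact top-k definition given in the abstract, computed by naively summing every node's local counts in one place. It is not the paper's Monte Carlo algorithm; it is only the centralized baseline that such an algorithm tries to approximate with less communication. The function name `exact_top_k`, the tie-breaking rule, and the sample ballot data are assumptions made for illustration.

```python
from collections import Counter

def exact_top_k(local_counts, k):
    """Return the k items with the largest global count.

    local_counts: iterable of Counter objects, one per node.
    Ties are broken by item name (an arbitrary choice for this sketch).
    """
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # adds each node's local counts to the global tally
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]

# Hypothetical data: three ballot boxes (nodes), four candidates (items).
nodes = [
    Counter({"alice": 5, "bob": 1, "carol": 3}),
    Counter({"alice": 1, "bob": 6, "dave": 2}),
    Counter({"bob": 2, "carol": 4, "dave": 1}),
]

top2 = exact_top_k(nodes, 2)
print(top2)
```

In this sample, "alice" leads at the first node but is not among the global top two, which illustrates why the answer depends on the global counts rather than on any single node's local view.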

Journal

Distributed Computing – Springer Journals

Published: Mar 19, 2008
