Discovering General Prominent Streaks in Sequence Data

Gensheng Zhang; Xiao Jiang; Ping Luo; Min Wang; Chengkai Li

doi:10.1145/2601439

Publisher: Association for Computing Machinery
Copyright: Copyright © 2014 by ACM Inc.
ISSN: 1556-4681
DOI: 10.1145/2601439

Abstract

GENSHENG ZHANG, The University of Texas at Arlington
XIAO JIANG, Shanghai Jiao Tong University
PING LUO, HP Labs China
MIN WANG, Google Research
CHENGKAI LI, The University of Texas at Arlington

This article studies the problem of prominent streak discovery in sequence data. Given a sequence of values, a prominent streak is a long consecutive subsequence consisting of only large (small) values, such as consecutive games of outstanding performance in sports, consecutive hours of heavy network traffic, and consecutive days of frequent mentioning of a person in social media. Prominent streak discovery provides insightful data patterns for data analysis in many real-world applications and is an enabling technique for computational journalism. Given its real-world usefulness and complexity, the research on prominent streaks in sequence data opens a spectrum of challenging problems.

A baseline approach to finding prominent streaks is a quadratic algorithm that exhaustively enumerates all possible streaks and performs pairwise streak dominance comparison. For more efficient methods, we make the observation that prominent streaks are in fact skyline points in two dimensions: streak interval length and minimum value in the interval. Our solution thus hinges on the idea to separate the
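The baseline and the skyline view described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes the "large values" case and assumes that one streak dominates another when it is at least as long and has at least as large a minimum value, with one of the two strictly greater. The function names (`enumerate_streaks`, `dominates`, `prominent_streaks`) and the sample sequence are hypothetical, chosen only for illustration. The pairwise comparison is deliberately brute force and is meant for short sequences only.

```python
from typing import List, Tuple

# A streak is (left, right, length, min_value), with 0-based inclusive bounds.
Streak = Tuple[int, int, int, float]


def enumerate_streaks(values: List[float]) -> List[Streak]:
    """Enumerate every contiguous interval, a quadratic number of streaks."""
    streaks: List[Streak] = []
    n = len(values)
    for left in range(n):
        current_min = values[left]
        for right in range(left, n):
            # Maintain the running minimum as the interval grows to the right.
            current_min = min(current_min, values[right])
            streaks.append((left, right, right - left + 1, current_min))
    return streaks


def dominates(a: Streak, b: Streak) -> bool:
    """Assumed dominance: a is no worse than b in length and minimum value,
    and strictly better in at least one of the two."""
    return (a[2] >= b[2] and a[3] >= b[3]) and (a[2] > b[2] or a[3] > b[3])


def prominent_streaks(values: List[float]) -> List[Streak]:
    """Return the streaks that no other streak dominates, i.e. the skyline
    in the two dimensions (interval length, minimum value)."""
    candidates = enumerate_streaks(values)
    return [
        s for s in candidates
        if not any(dominates(t, s) for t in candidates)
    ]


if __name__ == "__main__":
    sample = [3, 1, 7, 7, 2, 5, 4, 6, 7, 3]  # illustrative sequence
    for left, right, length, min_value in prominent_streaks(sample):
        print(f"[{left}, {right}] length={length} min={min_value}")
```

For the sample sequence, five streaks survive the dominance test: the whole sequence (minimum 1), the interval [2, 9] (minimum 2), [5, 9] (minimum 3), [5, 8] (minimum 4), and [2, 3] (minimum 7). Each one trades length for a higher minimum value, which is the skyline shape the abstract refers to.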

Journal

ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery

Published: Jun 1, 2014
