Event sequences capture system and user activity over time. Prior research on sequence mining has mostly focused on discovering local patterns appearing in a sequence. While interesting, these patterns do not give a comprehensive summary of the entire event sequence. Moreover, the number of patterns discovered can be large. In this article, we take an alternative approach and build short summaries that describe an entire sequence, and discover local dependencies between event types. We formally define the summarization problem as an optimization problem that balances shortness of the summary with accuracy of the data description. We show that this problem can be solved optimally in polynomial time by using a combination of two dynamic-programming algorithms. We also explore more efficient greedy alternatives and demonstrate that they work well on large datasets. Experiments on both synthetic and real datasets illustrate that our algorithms are efficient and produce high-quality results, and reveal interesting local structures in the data.
ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery
Published: Nov 1, 2009
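
As a rough illustration of the trade-off described in the abstract, the following Python sketch segments a timeline by dynamic programming so that the total description length (a per-segment model penalty plus the bits needed to encode the events) is as small as possible. It is a simplified sketch and not the article's algorithm: it assumes each event type occurs independently at a constant rate inside a segment, it omits the second dynamic program that the abstract mentions, and the names `segment_timeline` and `bernoulli_cost` as well as the default penalty are hypothetical choices made for this example.

```python
import math


def bernoulli_cost(ones, length):
    """Bits needed to encode `length` binary slots holding `ones` events,
    using the maximum-likelihood Bernoulli rate for that span."""
    if ones == 0 or ones == length:
        return 0.0  # a constant span costs no data bits under this model
    p = ones / length
    return -(ones * math.log2(p) + (length - ones) * math.log2(1 - p))


def segment_timeline(seq, model_bits_per_segment=None):
    """Split a timeline into segments of minimum total description length.

    seq: one row per event type; each row is a list of 0/1 flags, one per
         timestamp (1 means the event type occurred at that timestamp).
    Returns (total_bits, segment_start_indices).
    """
    m, n = len(seq), len(seq[0])
    if model_bits_per_segment is None:
        # Assumed penalty: one boundary position plus one rate per event type.
        model_bits_per_segment = (m + 1) * math.log2(n)

    # prefix[t][i] = number of events of type t in timestamps [0, i)
    prefix = [[0] * (n + 1) for _ in range(m)]
    for t in range(m):
        for i in range(n):
            prefix[t][i + 1] = prefix[t][i] + seq[t][i]

    def segment_cost(i, j):
        # Model penalty plus data bits for the segment covering [i, j).
        data_bits = sum(
            bernoulli_cost(prefix[t][j] - prefix[t][i], j - i) for t in range(m)
        )
        return model_bits_per_segment + data_bits

    # best[j] = minimum bits to describe timestamps [0, j)
    best = [0.0] + [math.inf] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(i, j)
            if cost < best[j]:
                best[j], back[j] = cost, i

    # Walk the back-pointers to recover where each segment starts.
    starts = []
    j = n
    while j > 0:
        starts.append(back[j])
        j = back[j]
    starts.reverse()
    return best[n], starts


if __name__ == "__main__":
    # One event type: silent for 20 timestamps, then active for 20.
    total_bits, starts = segment_timeline([[0] * 20 + [1] * 20])
    print(total_bits, starts)
```

The sketch evaluates every candidate segment, so it takes time quadratic in the number of timestamps (times the number of event types). A larger `model_bits_per_segment` favours shorter summaries with fewer segments, while a smaller value favours a closer fit to the data.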