Statistical inference in massive data sets

Publisher: Wiley
Copyright: © 2013 John Wiley & Sons, Ltd.
ISSN: 1524-1904
eISSN: 1526-4025
DOI: 10.1002/asmb.1927

Abstract

Analysis of massive data sets is challenging owing to the limitations of computer primary memory. In this paper, we propose an approach to estimate population parameters from a massive data set. The proposed approach significantly reduces the required amount of primary memory, and the resulting estimate is as efficient as if the entire data set were analyzed simultaneously. Asymptotic properties of the resulting estimate are studied, and the asymptotic normality of the resulting estimator is established. A standard error formula for the resulting estimate is proposed and empirically tested; thus, statistical inference for the parameters of interest can be performed. The effectiveness of the proposed approach is illustrated using simulation studies and an Internet traffic data example. Copyright © 2012 John Wiley & Sons, Ltd.
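
The abstract does not spell out the estimation procedure, so the following is only an illustrative sketch of one generic way to bound memory use: read the data in blocks, estimate the parameter on each block, and average the block estimates. The block-averaging scheme, the between-block standard error, and the names `iter_blocks`, `blockwise_estimate`, and `simulated_stream` are all assumptions made for illustration and should not be read as the paper's method or its standard error formula.

```python
import math
import random
import statistics
from typing import Callable, Iterable, Iterator, List, Tuple


def iter_blocks(stream: Iterable[float], block_size: int) -> Iterator[List[float]]:
    """Yield consecutive blocks so that only one block is in memory at a time."""
    block: List[float] = []
    for x in stream:
        block.append(x)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        # A shorter final block is kept and weighted like the others,
        # which is an approximation when block sizes differ.
        yield block


def blockwise_estimate(
    stream: Iterable[float],
    block_size: int,
    estimator: Callable[[List[float]], float] = statistics.fmean,
) -> Tuple[float, float, int]:
    """Average per-block estimates (hypothetical helper, not the paper's method).

    Returns (combined estimate, standard error, number of blocks). The
    standard error uses the spread of the block estimates, assuming the
    blocks are independent and of equal size.
    """
    estimates = [estimator(block) for block in iter_blocks(stream, block_size)]
    k = len(estimates)
    if k < 2:
        raise ValueError("need at least two blocks to compute a standard error")
    combined = statistics.fmean(estimates)
    std_error = statistics.stdev(estimates) / math.sqrt(k)
    return combined, std_error, k


def simulated_stream(n: int, seed: int = 0) -> Iterator[float]:
    """Generate n draws from N(5, 2^2) lazily, standing in for a large file."""
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.gauss(5.0, 2.0)


estimate, std_error, k = blockwise_estimate(simulated_stream(100_000), block_size=1_000)
print(f"estimate={estimate:.4f}, se={std_error:.4f}, blocks={k}")
# Approximate 95% interval under asymptotic normality of the combined estimate.
print(f"95% CI: ({estimate - 1.96 * std_error:.4f}, {estimate + 1.96 * std_error:.4f})")
```

With equal-size blocks and the sample mean as the per-block estimator, the combined estimate coincides with the full-data mean, so nothing is lost relative to loading everything at once. For other estimators the agreement holds only approximately and depends on the block size.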

Journal

Applied Stochastic Models in Business and Industry, Wiley

Published: Sep 1, 2013
