An analysis of data corruption in the storage stack




Publisher
Association for Computing Machinery
Copyright
Copyright © 2008 by ACM Inc.
ISSN
1553-3077
DOI
10.1145/1416944.1416947

Abstract

Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau (University of Wisconsin-Madison); Garth R. Goodson (NetApp); Bianca Schroeder (University of Toronto)

An important threat to reliable storage of data is silent data corruption. In order to develop suitable protection mechanisms against data corruption, it is essential to understand its characteristics. In this article, we present the first large-scale study of data corruption. We analyze corruption instances recorded in production storage systems containing a total of 1.53 million disk drives, over a period of 41 months. We study three classes of corruption: checksum mismatches, identity discrepancies, and parity inconsistencies. We focus on checksum mismatches since they occur the most. We find more than 400,000 instances of checksum mismatches over the 41-month period. We find many interesting trends among these instances, including: (i) nearline disks (and their adapters) develop checksum mismatches an order of magnitude more often than enterprise-class disk drives, (ii) checksum mismatches within the same disk are not independent events and they show high spatial and temporal locality, and (iii) checksum mismatches across different disks in the same storage system are not independent. We use our observations

Journal

ACM Transactions on Storage (TOS), Association for Computing Machinery

Published: Nov 1, 2008
