Abstract

Objective: De-identified clinical data in standardized form (eg, diagnosis codes), derived from electronic medical records, are increasingly combined with research data (eg, DNA sequences) and disseminated to enable scientific investigations. This study examines whether released data can be linked with identified clinical records that are accessible via various resources to jeopardize patients' anonymity, and whether popular privacy protection methodologies can prevent such an attack.

Design: The study experimentally evaluates the re-identification risk of a de-identified sample of Vanderbilt's patient records involved in a genome-wide association study. It also measures the level of protection from re-identification, and the data utility, provided by suppression and generalization.

Measurement: Privacy protection is quantified using the probability of re-identifying a patient in a larger population through diagnosis codes. Data utility is measured at a dataset level, using the percentage of retained information, as well as its description, and at a patient level, using two metrics based on the difference between the distribution of International Classification of Diseases (ICD) version 9 codes before and after applying privacy protection.

Results: More than 96% of 2800 patients' records are shown to be uniquely identified by their diagnosis codes with respect to a population of 1.2 million patients. Generalization is shown to further reduce the percentage of de-identified records by less than 2%, and over 99% of the three-digit ICD-9 codes need to be suppressed to prevent re-identification.

Conclusions: Popular privacy protection methods are inadequate to deliver a sufficiently protected and useful result when sharing data derived from complex clinical systems. The development of alternative privacy protection models is thus required.
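The measurement described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a released record "matches" a population record when every released diagnosis code also appears in the population record, that the re-identification probability is one over the number of matching population records, and that generalization means truncating an ICD-9 code to the category before the decimal point. All function names and the toy data are hypothetical.

```python
from typing import FrozenSet, Iterable, List

Record = FrozenSet[str]


def generalize(codes: Iterable[str]) -> Record:
    """Truncate each ICD-9 code to its category, e.g. '250.01' -> '250' (assumed scheme)."""
    return frozenset(code.split(".")[0] for code in codes)


def distinguishability(record: Record, population: List[Record]) -> int:
    """Count population records that contain every code of the released record."""
    return sum(1 for candidate in population if record <= candidate)


def reidentification_probability(record: Record, population: List[Record]) -> float:
    """Assumed risk model: 1 / (number of matching population records)."""
    matches = distinguishability(record, population)
    return 1.0 / matches if matches else 0.0


def percent_unique(sample: List[Record], population: List[Record]) -> float:
    """Percentage of released records that match exactly one population record."""
    unique = sum(1 for r in sample if distinguishability(r, population) == 1)
    return 100.0 * unique / len(sample)


# Toy data (invented for illustration only).
population = [
    frozenset({"250.01", "401.1"}),
    frozenset({"250.02", "401.1"}),
    frozenset({"493.90", "401.1"}),
    frozenset({"272.4"}),
]
sample = [population[0], population[2]]

# The same data after generalizing every code to its three-digit category.
gen_population = [generalize(r) for r in population]
gen_sample = [generalize(r) for r in sample]

print(percent_unique(sample, population))
print(percent_unique(gen_sample, gen_population))
print(reidentification_probability(gen_sample[0], gen_population))
```

In this toy example, generalization makes the first released record indistinguishable from one other patient, while the second stays unique. That mirrors, on a very small scale, the kind of limited protection the abstract reports for generalization.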
Journal of the American Medical Informatics Association – Oxford University Press
Published: May 1, 2010