Building a RAPPOR with the Unknown: Privacy-Preserving Learning of Associations and Data Dictionaries

Publisher: de Gruyter
Copyright: © 2016 by the
ISSN: 2299-0984
eISSN: 2299-0984
DOI: 10.1515/popets-2016-0015

Abstract

Techniques based on randomized response enable the collection of potentially sensitive data from clients in a privacy-preserving manner with strong local differential privacy guarantees. A recent such technology, RAPPOR (12), enables estimation of the marginal frequencies of a set of strings via privacy-preserving crowdsourcing. However, this original estimation process relies on a known dictionary of possible strings; in practice, this dictionary can be extremely large and/or unknown. In this paper, we propose a novel decoding algorithm for the RAPPOR mechanism that enables the estimation of “unknown unknowns,” i.e., strings we do not know we should be estimating. To enable learning without explicit dictionary knowledge, we develop methodology for estimating the joint distribution of multiple variables collected with RAPPOR. Our contributions are not RAPPOR-specific, and can be generalized to other local differential privacy mechanisms for learning distributions of string-valued random variables.
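
For readers unfamiliar with the randomized response idea the abstract builds on, the following is a minimal sketch of the classic single-bit scheme. It is not the RAPPOR encoding and not the decoding algorithm proposed in the paper. The function names and the parameter `p_truth` (the probability that a client reports its true bit) are assumptions introduced here for illustration only.

```python
import math
import random


def randomize_bit(true_bit: int, p_truth: float, rng: random.Random) -> int:
    """Report the true bit with probability p_truth, otherwise report its flip."""
    return true_bit if rng.random() < p_truth else 1 - true_bit


def estimate_frequency(reports: list[int], p_truth: float) -> float:
    """Unbiased estimate of the fraction of clients whose true bit is 1.

    If f is the true fraction, the expected observed fraction of 1s is
    p_truth * f + (1 - p_truth) * (1 - f); solving for f gives the formula below.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


def local_dp_epsilon(p_truth: float) -> float:
    """Local differential privacy parameter of this binary scheme (p_truth > 0.5)."""
    return math.log(p_truth / (1 - p_truth))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    true_bits = [1] * 30_000 + [0] * 70_000  # hypothetical population, 30% ones
    reports = [randomize_bit(b, 0.75, rng) for b in true_bits]
    print("estimated frequency:", estimate_frequency(reports, 0.75))
    print("epsilon:", local_dp_epsilon(0.75))
```

No individual report reveals a client's true bit with certainty, yet the aggregate frequency can still be recovered by inverting the known noise process. Estimating frequencies of strings, as the abstract describes, requires substantially more machinery than this one-bit case.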

Journal

Proceedings on Privacy Enhancing Technologies, de Gruyter

Published: Jul 1, 2016
