Students and Taxes: a Privacy-Preserving Study Using Secure Computation
We describe the use of secure multi-party computation for performing a large-scale privacy-preserving statistical study on real government data. In 2015, statisticians from the Estonian Center of Applied Research (CentAR) conducted a big data study to look for correlations between working during university studies and failing to graduate in time. The study was conducted by linking the database of individual tax payments from the Estonian Tax and Customs Board and the database of higher education events from the Ministry of Education and Research. Data collection, preparation and analysis were conducted using the Sharemind secure multi-party computation system, which provided end-to-end cryptographic protection for the analysis. With ten million tax records and half a million education records in the analysis, this is the largest cryptographically private statistical study ever conducted on real data.

Keywords: privacy, statistics, secure multi-party computation, case study

DOI 10.1515/popets-2016-0019
Received 2015-11-30; revised 2016-03-01; accepted 2016-03-02.

Dan Bogdanov: Cybernetica, Estonia. E-mail: dan.bogdanov@cyber.ee
*Corresponding Author: Liina Kamm: Cybernetica, Estonia. E-mail: liina.kamm@cyber.ee
Baldur Kubo: Cybernetica, Estonia. E-mail: baldur.kubo@cyber.ee
Reimo Rebane: Cybernetica, Estonia. E-mail: reimo.rebane@cyber.ee
Ville Sokk: Cybernetica, Estonia. E-mail: ville.sokk@cyber.ee
Riivo Talviste: Cybernetica, Estonia and University of Tartu, Estonia. E-mail: riivo.talviste@cyber.ee

1 Introduction

Information and communication technology (ICT) is a growing industry where highly skilled specialists are in demand. This causes concern to both industry, where wages keep rising, and academia, which often cannot match the pay grades offered by industry. The universities in Estonia formed a hypothesis that students who work during their studies do not graduate in the allotted time. Moreover, many students quit before graduation, thus, not...