Incorporating Economic Conditions in Synthetic Microdata for Business Programs

Publisher: Oxford University Press
Copyright: Published by Oxford University Press on behalf of the American Association for Public Opinion Research 2022. This work is written by US Government employees and is in the public domain in the US.
ISSN: 2325-0984
eISSN: 2325-0992
DOI: 10.1093/jssam/smab054

Abstract

Many agencies are currently investigating whether releasing synthetic microdata could be a viable dissemination strategy for highly sensitive data, such as business data, for which disclosure avoidance regulations would otherwise prohibit the release of public use microdata. The U.S. Census Bureau has identified the Economic Census as a candidate program and has been developing synthetic data generators. The synthetic data should account for skewed and irregular distributions, satisfy predetermined edit constraints, and preserve selected privacy features. Previous research on these generators was confined to businesses that were in operation for the full year, ignoring the special features of births and deaths in the models. These generators preserve multivariate relationships and yield marginal totals that closely correspond to the published official statistics. However, these synthetic data consequently do not reflect the state of economic expansion or contraction. This missing information is a severe deficiency for the targeted data users comprising economists, policymakers, and methodologists, especially since the global pandemic of 2020. This paper introduces an approach that addresses this deficiency, producing partially synthetic data with high utility and privacy protection. We provide preliminary results using selected industry data from the 2012 Economic Census.

Journal: Journal of Survey Statistics and Methodology (Oxford University Press)

Published: Apr 9, 2022

Keywords: DP normal mixture model; Economic data; Multivariate; Synthetic data
