publication . Article . Preprint . Other literature type . 2020

Applying data synthesis for longitudinal business data across three countries

M. Jahangir Alam; Benoit Dostie; Jörg Drechsler; Lars Vilhuber
Open Access English
  • Published: 05 May 2020
  • Publisher: New York: Exeley
Abstract
Data on businesses collected by statistical agencies are challenging to protect. Many businesses have unique characteristics, and distributions of employment, sales, and profits are highly skewed. Attackers wishing to conduct identification attacks often have access to much more information than for any individual. As a consequence, most disclosure avoidance mechanisms fail to strike an acceptable balance between usefulness and confidentiality protection. Detailed aggregate statistics by geography or detailed industry classes are rare, public-use microdata on businesses are virtually inexistent, and access to confidential microdata can be burdensome. Synthetic microda...
Subjects
free text keywords: business data, confidentiality, LBD, LEAP, BHP, synthetic, synthetic data, Statistics, Probability and Uncertainty, Statistics and Probability, Economics - Econometrics, Statistics & Probability, ddc:510, lcsh:Statistics, lcsh:HA1-4737, Data science, Publication, business.industry, business, Profit (economics), Data synthesis, Microdata (HTML), Data set
Funded by
SSHRC
Project
  • Funder: Social Sciences and Humanities Research Council (SSHRC)
NSF| NCRN-MN: Cornell Census-NSF Research Node: Integrated Research Support, Training and Data Documentation
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1131848
  • Funding stream: Directorate for Social, Behavioral & Economic Sciences | Division of Social and Economic Sciences
NSF| ITR-(ECS+ASE)-(dmc+int): Info Tech Challenges for Secure Access to Confidential Social Science Data
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 0427889
  • Funding stream: Directorate for Social, Behavioral & Economic Sciences | Division of Social and Economic Sciences
NSF| Synthetic Data User Testing and Dissemination
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1042181
  • Funding stream: Directorate for Social, Behavioral & Economic Sciences | Division of Social and Economic Sciences
25 references, page 1 of 2

ABOWD, J. M. and J. I. LANE (2004). “New Approaches to Confidentiality Protection: Synthetic Data, Remote Access and Research Data Centers”. In: Privacy in Statistical Databases. Ed. by J. DOMINGO-FERRER and V. TORRA. Vol. 3050. Lecture Notes in Computer Science. Springer, pp. 282-289. DOI: 10.1007/978-3-540-22118-0. URL: http://www.springer.com/la/book/9783540221180.

ABOWD, J. M. and I. SCHMUTTE (2015). “Economic analysis and statistical disclosure limitation”. In: Brookings Papers on Economic Activity Fall 2015. URL: http://www.brookings.edu/about/projects/bpea/papers/2015/economicanalysis-statistical-disclosure-limitation.

ABOWD, J. M., B. E. STEPHENS, L. VILHUBER, F. ANDERSSON, K. L. MCKINNEY, M. ROEMER, and S. D. WOODCOCK (2009). “The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators”. In: Producer Dynamics: New Evidence from Micro Data. Ed. by T. DUNNE, J. B. JENSEN, and M. J. ROBERTS. University of Chicago Press. URL: http://www.nber.org/chapters/c0485.

ABOWD, J. M. and L. VILHUBER (2010). VirtualRDC - Synthetic Data Server. Cornell University, Labor Dynamics Institute. URL: http://www.vrdc.cornell.edu/sds/.

ALAM, M. J., B. DOSTIE, J. DRECHSLER, and L. VILHUBER (2020). Replication archive for: Applying Data Synthesis for Longitudinal Business Data across Three Countries. Code and data. Zenodo. DOI: 10.5281/zenodo.3785744.

ARELLANO, M. and S. BOND (1991). “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations”. In: Review of Economic Studies 58.2, pp. 277-297. URL: https://EconPapers.repec.org/RePEc:oup:restud:v:58:y:1991:i:2:p:277-297.

ARELLANO, M. and O. BOVER (1995). “Another look at the instrumental variable estimation of error-components models”. In: Journal of Econometrics 68.1, pp. 29-51. URL: https://EconPapers.repec.org/RePEc:eee:econom:v:68:y:1995:i:1:p:29-51.

BARTELSMAN, E., J. HALTIWANGER, and S. SCARPETTA (2009). “Measuring and Analyzing Cross-country Differences in Firm Dynamics”. In: DUNNE, T., J. B. JENSEN, and M. J. ROBERTS. Producer Dynamics: New Evidence from Micro Data. University of Chicago Press, pp. 15-76. URL: http://www.nber.org/chapters/c0480.

BENDER, S. (2009). “The RDC of the Federal Employment Agency as a part of the German RDC Movement”. In: Comparative Analysis of Enterprise Data, 2009 Conference. (Tokyo). URL: http://gcoe.ier.hit-u.ac.jp/CAED/index.html (visited on 05/05/2014).

BENEDETTO, G., J. HALTIWANGER, J. LANE, and K. MCKINNEY (2007). “Using Worker Flows in the Analysis of the Firm”. In: Journal of Business and Economic Statistics 25.3, pp. 299-313.

BLUNDELL, R. and S. BOND (1998). “Initial conditions and moment restrictions in dynamic panel data models”. In: Journal of Econometrics 87.1, pp. 115-143. URL: https://ideas.repec.org/a/eee/econom/v87y1998i1p115-143.html.

BLUNDELL, R., S. BOND, and F. WINDMEIJER (2001). “Estimation in dynamic panel data models: Improving on the performance of the standard GMM estimator”. In: Nonstationary Panels, Panel Cointegration, and Dynamic Panels. Ed. by B. H. BALTAGI, T. B. FOMBY, and R. CARTER HILL. Vol. 15. Advances in Econometrics. Emerald Group Publishing Limited, pp. 53-91. DOI: 10.1016/S0731-9053(00)15003-0. URL: https://doi.org/10.1016/S0731-9053(00)15003-0 (visited on 04/30/2020).

BUNDESAGENTUR FÜR ARBEIT (2013). Establishment History Panel (BHP). [Computer file]. Nürnberg, Germany: Research Data Centre (FDZ) of the German Fed

JARMIN, R. S. and J. MIRANDA (2002). The Longitudinal Business Database. Working Papers 02-17. Center for Economic Studies, U.S. Census Bureau. URL: https://ideas.repec.org/p/cen/wpaper/02-17.html.

KARR, A. F., C. N. KOHNEN, A. OGANIAN, J. P. REITER, and A. P. SANIL (2006). “A Framework for Evaluating the Utility of Data Altered to Protect Confidentiality”. In: The American Statistician 60.3, pp. 1-9. DOI: 10.1198/000313006X124640.

KINNEY, S. K., J. P. REITER, and J. MIRANDA (2014a). Improving The Synthetic Longitudinal Business Database. Working Papers 14-12. Center for Economic Studies, U.S. Census Bureau. URL: https://ideas.repec.org/p/cen/wpaper/14-12.html.
