Distributed Data Synthesis under Local Differential Privacy
Xiaoguang Li, Haonan Yan, Qingwen Li, Hui Li, Fenghua Li
Published 2024 in 2024 IEEE Smart World Congress (SWC)
ABSTRACT
Local Differential Privacy (LDP) has emerged as a cornerstone for safeguarding privacy in data analysis, enabling robust statistical inference without the need for a trusted intermediary. However, current LDP methods often lack scalability and applicability beyond specific tasks. In response, we propose a novel approach for locally differentially private data synthesis. Our method extracts representative marginals from user data to construct synthetic datasets that maintain statistical fidelity while adhering to differential privacy constraints. A primary challenge lies in selecting meaningful marginals and in minimizing the noise of LDP mechanisms, which must be tailored to the domain size of each marginal. To tackle this, we employ a sliding overlapping window technique to effectively capture low-degree marginals that encapsulate critical dataset attributes. Furthermore, we adaptively utilize categorical frequency oracles (CFOs) to estimate marginal frequencies based on their respective domain sizes, thereby reducing noise levels and maintaining data utility. Empirical evaluations on real-world datasets, including the Twitter and Adult datasets, underscore the efficacy of our approach in preserving data utility across a spectrum of fundamental statistical tasks.
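The abstract names two ingredients: a sliding overlapping window that selects low-degree marginals, and a per-marginal choice of categorical frequency oracle driven by domain size. The sketch below is only an illustration of how those two steps could fit together, not the authors' implementation. Everything in it is an assumption: the window width, the use of the given attribute order, the attribute names and domain sizes, and the restriction to two oracles (generalized randomized response, GRR, and optimized unary encoding, OUE). The selection rule is the standard variance comparison from the LDP frequency-estimation literature, under which GRR has lower variance than OUE when the domain size d satisfies d < 3e^ε + 2.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sliding_window_marginals(attrs: Sequence[str], width: int = 2) -> List[Tuple[str, ...]]:
    """Return overlapping windows of `width` adjacent attributes (stride 1).

    Assumption: attributes are taken in the order given; the abstract does not
    say how the paper orders them or which width it uses.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    if len(attrs) <= width:
        return [tuple(attrs)]
    return [tuple(attrs[i:i + width]) for i in range(len(attrs) - width + 1)]


def choose_oracle(domain_size: int, epsilon: float) -> str:
    """Pick the lower-variance oracle for a marginal with the given domain size.

    GRR estimation variance grows with (d - 2 + e^eps), while OUE variance is
    proportional to 4 * e^eps regardless of d, so GRR wins when d < 3e^eps + 2.
    """
    return "GRR" if domain_size < 3 * math.exp(epsilon) + 2 else "OUE"


def plan_marginals(domains: Dict[str, int], epsilon: float, width: int = 2):
    """For each windowed marginal, report its joint domain size and chosen oracle."""
    plan = []
    for marginal in sliding_window_marginals(list(domains), width):
        # The joint domain of a marginal is the product of its attribute domains.
        size = math.prod(domains[a] for a in marginal)
        plan.append((marginal, size, choose_oracle(size, epsilon)))
    return plan


# Hypothetical attribute names and domain sizes, for illustration only.
example_domains = {"sex": 2, "race": 5, "education": 16}
example_plan = plan_marginals(example_domains, epsilon=1.0)
for marginal, size, oracle in example_plan:
    print(marginal, size, oracle)
```

With ε = 1 the threshold 3e^ε + 2 is about 10.15, so in this toy setting the small (sex, race) marginal is assigned GRR while the larger (race, education) marginal is assigned OUE. This shows why a per-marginal choice can reduce noise compared with using a single oracle everywhere.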
PUBLICATION RECORD
- Publication year
2024
- Venue
2024 IEEE Smart World Congress (SWC)
- Publication date
2024-12-02
- Source metadata
Semantic Scholar