DP-FY YOUR DATA: HOW TO (AND WHY) SYNTHESIZE DIFFERENTIALLY PRIVATE SYNTHETIC DATA

ICML 2025 TUTORIAL · VANCOUVER, CANADA

Natalia Ponomareva, Sergei Vassilvitskii, Peter Kairouz, Alex Bie

Google

Vancouver Convention Center

New: The companion survey article, "How to DP-fy Your Data: A Practical Guide to Generating Synthetic Data With Differential Privacy", is available at https://arxiv.org/abs/2512.03238. It offers an extended treatment of the topics covered in this tutorial.

📹 RECORDING

6 sections, 2h15m

📄 SLIDES

PDF, 9.7MB download

💻 DEMO

Generate DP synthetic data in Colab

ABSTRACT

This tutorial focuses on differentially private (DP) synthetic data generation. Creating DP synthetic data allows for data sharing without compromising individuals' privacy, opening up possibilities for collaborative model development. This tutorial provides a comprehensive guide for generating DP synthetic data across text, image, and tabular modalities.

It covers an introduction to synthetic data and differential privacy; generation methods specific to each data type (text, image, and tabular); and practical aspects of real-world deployments, such as user-level guarantees, empirical privacy testing, and data lineage.

CONTACT

For questions or comments, please email us: nponomareva@google.com, sergeiv@google.com, kairouz@google.com, alexbie@google.com.