February 10, 2021

The promise of synthetic data

Each year, the world generates exponentially more data than the previous year.
According to the International Data Corporation, an estimated 59 zettabytes of data were “created, captured, copied, and consumed” in 2020 alone, but not all of it is available for use.

Companies and institutions are concerned with their users' privacy. To comply with privacy laws, terms of use and other carefully crafted legal agreements, they restrict access to datasets, sometimes even within their own teams. Unfortunately, Covid-19 has added an extra layer of difficulty to accessing and sharing data securely: labs and offices are closed, which prevents access to centralised data stores. So how do we solve this problem and balance the industry’s need for quality data with the public’s right to privacy? Is synthetic data the silver bullet for our growing list of struggles with the massive amounts of data we create? Does synthetic data give users the true essence of the data without the regulatory headaches, privacy concerns and worries about bias?

Nicolai Baldin, founder and CEO of Synthesized, had the pleasure of joining the Tech On Reg Podcast, hosted by Dara Tarkowski, to talk about the value synthetic data brings to the industry. The podcast explores all things at the intersection of law, technology and highly regulated industries. During the 30-minute session they cover the following issues:

  • Why tackle this problem?
  • Is synthetic data still based on real data?
  • The vulnerability of anonymised data
  • The value of synthetic data beyond preserving data privacy