How the European Commission Used Synthesized To Protect Sensitive Data

Overview

Financial regulators across the EU possess datasets that could unlock significant innovation in Europe's financial sector. However, this information is confidential and cannot be shared with third parties. The Commission needed a way to enable data sharing that would drive innovation without compromising privacy or regulatory compliance.

Test Data Challenges

Traditional point tools and Test Data Management (TDM) methods of protecting sensitive data (such as removing names, adding noise, or applying encryption) all shared a critical flaw: they could be reversed by anyone who knew the technique used.

The Commission needed a solution that would:

  • Keep all real data completely secure within the national supervisors' premises
  • Provide realistic datasets that organizations could actually use for testing and analysis
  • Deliver unbreakable privacy protection
  • Work simply without requiring specialized technical expertise
  • Ensure full compliance with GDPR and other data protection regulations

The Solution

The European Commission selected Synthesized through a public tender, with rigorous testing conducted by the Joint Research Centre (JRC). Synthesized uses AI to create entirely new datasets that statistically mirror the original data without containing any real information. The original observations simply don't exist in the generated data, making it impossible to reverse-engineer or identify real records.

Rigorous testing across three diverse financial datasets confirmed that the solution delivered both strong privacy protection and analytical accuracy. When synthetic data was used in banking loss simulations, results closely matched those from real data. Security testing showed that attempts to identify the original records largely failed: fewer than 10% of records were identified.

"As opposed to traditional anonymization techniques, the synthesis process makes any potential attack virtually impossible. This is because instead of keeping an anonymized version of the original data, the synthesis strips it to a limited set of parameters, which is then used to build up a completely new dataset with similar statistical properties. This makes any identification of individual original observations impossible, because they are simply no longer there."

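The idea in the quote above, reducing the data to a limited set of parameters and then building a new dataset from those parameters alone, can be illustrated with a deliberately simple sketch. This is not Synthesized's AI model: it swaps in a plain two-column Gaussian fit as a stand-in, and the column meanings (loan amount, annual income), the made-up records, and the function names `fit_parameters` and `synthesize` are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def fit_parameters(rows):
    """Reduce two-column data to five numbers: means, standard deviations, correlation."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}

def synthesize(params, n, rng):
    """Draw n new rows from a bivariate normal defined only by the fitted parameters."""
    rho = params["rho"]
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mx"] + params["sx"] * z1
        # Mix z1 and z2 so the new columns carry the fitted correlation.
        y = params["my"] + params["sy"] * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        out.append((x, y))
    return out

# Hypothetical "confidential" records: (loan_amount, annual_income), invented here.
rng = random.Random(0)
real = []
for _ in range(2000):
    income = rng.gauss(50_000, 12_000)
    loan = 0.3 * income + rng.gauss(0, 3_000)
    real.append((loan, income))

params = fit_parameters(real)               # the only thing kept from the real data
synthetic = synthesize(params, 2000, rng)   # built from the parameters, not the rows
synthetic_params = fit_parameters(synthetic)
overlap = set(real) & set(synthetic)        # original rows that reappear verbatim
```

Comparing `params` with `synthetic_params` shows how closely the statistical properties carry over, while `overlap` counts original rows that were copied into the new dataset. In this toy setup the generator never sees the individual rows, only the five fitted numbers.
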
Results

Unlocked Innovation: National supervisors can now participate in the Data Hub without exposing confidential information, removing barriers to fintech innovation across Europe.

Uncompromising Privacy: Independent validation found that breaking the synthetic data's privacy protection is "virtually impossible", while the data remains useful for testing and AI model training.

Operational Efficiency: Synthesized was easy to use, generated datasets in minutes using simple commands, and required no specialized infrastructure.

Stronger Collaboration: The Data Hub now enables meaningful partnerships between financial regulators and innovative firms.

Conclusion

Synthesized enabled the European Commission to build a Data Hub that drives innovation while maintaining the highest data protection standards. This breakthrough allows supervisors to support fintech advancement and helps innovative firms access the data they need to scale across Europe, all without compromising individual privacy.

The solution directly supports the Commission's goal of building trustworthy data-sharing systems that accelerate innovation in Europe's financial sector.

Read the European Commission's independent whitepaper

Synthetic data in the Data Hub of the Digital Finance Platform