How to eliminate the risk of data leakage from your non-production environments

Warning: this is probably happening in your business right now...


Data leakage is a known risk in most organisations today. It's defined as the unauthorised disclosure of sensitive data to an external partner or agency.


When we hear the term data leakage, we often think of its ugly partner: the data breach. Data breaches make headline news on a near-daily basis. Examples include the Information Commissioner’s Office (ICO) fining British Airways (BA) £20m for failing to protect the personal and financial details of more than 400,000 of its customers, and the £18.4m fine the ICO handed Marriott International Inc. for failing to keep millions of customers’ personal data secure.


The risk of heavy fines, the immense reputational damage, and even the possibility of executives going to jail are every bit as real with data leaks. 


This is not a story of hacker trickery or corporate theft. Rather, it is the story of innocent employees making simple, common, and unintentional mistakes: the developer tackling a mischievous bug who posts a code snippet to Stack Overflow that also happens to contain sensitive customer details; the email containing copies of sensitive production data, meant for the QA team in Singapore, that a test team accidentally sends to the wrong address; the cloud security config change that goes wrong and exposes the development file repository to the world. These, and plenty more, are the common patterns exposing every company and institution to the risk of data leakage.


But why does data leakage really happen and what can be done, I hear you cry?


Risk from data leakage exists because every large development initiative needs, and eventually receives, approval to use a copy of production data. Yep, the holy grail: ever-so-sensitive production data that must be kept secure at all times, released from our hardened and secure production environments into non-production environments.


“But we’re safe!” cry some data security teams. “We use techniques like anonymisation and tokenisation to protect the privacy of data before it’s shared.”


But these approaches are often applied in an ad hoc manner and have been shown to fall short of delivering real data privacy protection. Not true, you say? Have a read of this recent article by some wise folks at Imperial College London.
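To make the weakness concrete, here is a minimal sketch of a linkage attack on an ad hoc anonymised extract. Everything in it is hypothetical and made up for illustration: the records, the field names, and the choice of postcode, birth year and gender as quasi-identifiers. It shows how stripping names alone can leave records re-identifiable when the remaining fields can be matched against another dataset.

```python
# Illustrative only: every record and field name here is invented.

# An "anonymised" test extract: names removed, other fields kept.
anonymised_rows = [
    {"postcode": "SW1A 1AA", "birth_year": 1975, "gender": "F", "balance": 48210},
    {"postcode": "M1 2AB", "birth_year": 1988, "gender": "M", "balance": 1530},
    {"postcode": "EH1 3CD", "birth_year": 1990, "gender": "F", "balance": 920},
]

# A separate dataset an outsider could plausibly hold (hypothetical).
public_rows = [
    {"name": "Alice Example", "postcode": "SW1A 1AA", "birth_year": 1975, "gender": "F"},
    {"name": "Bob Example", "postcode": "M1 2AB", "birth_year": 1988, "gender": "M"},
]

# Assumed quasi-identifiers: fields that are harmless alone but identifying together.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def key(row):
    """Build the combined quasi-identifier key for a record."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def link(anonymised, public):
    """Map each named person to an anonymised row when the key match is unique."""
    by_key = {}
    for row in anonymised:
        by_key.setdefault(key(row), []).append(row)

    matches = {}
    for person in public:
        candidates = by_key.get(key(person), [])
        if len(candidates) == 1:  # a unique match means the person is re-identified
            matches[person["name"]] = candidates[0]
    return matches


reidentified = link(anonymised_rows, public_rows)
for name, row in reidentified.items():
    print(name, "->", row["balance"])
```

In this toy example, both named individuals are matched back to their "anonymised" balances, even though no name ever appeared in the test extract.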


However, there is no need to worry: help is at hand. Synthesized has developed a powerful and unique solution to this problem. The Synthesized DataOps Platform generates AI-powered intelligent data, at any volume, that looks and performs exactly like the original data but consists of completely new data points that did not exist before. Synthesized data contains no 1:1 linkage with the original data, meaning it cannot be reverse-engineered back to the original. It is designed to meet the most demanding data privacy policies and regulations, like GDPR, HIPAA and CCPA, while providing the highest possible degree of utility and performance.
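As a rough illustration of what "no 1:1 linkage" means in practice, here is a minimal sanity check a team might run on any synthetic dataset: confirm that no synthetic record is an exact copy of an original one. This is a generic sketch, not the Synthesized platform's API or method, and the records and the `exact_matches` helper are hypothetical. Passing this check is necessary but nowhere near sufficient for privacy, since a near-copy could still identify someone.

```python
# Illustrative only: records and helper names are invented for this sketch.

original_rows = [
    {"age": 45, "city": "London", "balance": 48210},
    {"age": 32, "city": "Leeds", "balance": 1530},
]

synthetic_rows = [
    {"age": 44, "city": "London", "balance": 47100},
    {"age": 33, "city": "Leeds", "balance": 1620},
]


def as_tuple(row):
    """Turn a record into a hashable, order-independent form."""
    return tuple(sorted(row.items()))


def exact_matches(original, synthetic):
    """Return the synthetic records that are exact copies of an original record."""
    originals = {as_tuple(row) for row in original}
    return [row for row in synthetic if as_tuple(row) in originals]


copies = exact_matches(original_rows, synthetic_rows)
print("Exact copies found:", len(copies))
```

If the check ever reports a copy, that record should be treated as production data and handled accordingly.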


No direct access to live production data is required. A common configuration pattern sees Synthesized using data from a data warehouse or database that holds a copy of data from a production system (e.g. an end-of-day copy). It only takes hours to deploy and comes with a promise that the risk of data leakage is eliminated.


Synthesized can also generate intelligent data scenarios at any volume and is easy to scale up or down based on your requirements. You can easily rebalance and augment data to create data for any test scenario, including edge cases where original data may not even exist.


Our powerful automation capabilities mean Synthesized delivers impressive cost savings by reducing the manual effort required to create secure data by over 90%.


Our FinTech consulting partners Nextwave are helping us deliver the Synthesized solution to financial services institutions across Europe. Phil Kent, Partner at Nextwave, offers this advice:


“As a former banking CIO, one of my biggest concerns that kept me awake at night was fear of production data leaking from test and development environments and the franchise damage that would follow. Even with USB port disabling, email data leakage traps and anonymisation techniques the number of leakage vectors grows exponentially over time. Synthesized changes the game completely and makes sure all of the development and test activities can continue with production-like data without ever letting that data out of its secure production confines. I would challenge any risk acceptance of a production copy knowing the capabilities and effectiveness of this platform.”


We really have mastered and solved the problem of data leakage.

Drop us a line if you’ve experienced such challenges. We’re standing by and ready to help.


You may find these resources useful:

Enable best data practices with AI-powered data assets

BBC Digital Planet talk on data sharing in a privacy-preserving manner

