In 2018 alone, data breaches exposed 5 billion records, and data sharing with unauthorised third parties was one of the most common causes.
To counter this risk, most companies share data using anonymisation techniques, which mask sensitive information such as income, date of birth and address. But these techniques trade data quality for data privacy, with estimates suggesting that anonymised data retains only 70% of the prediction accuracy of the original data. As hacking methods become more sophisticated, these techniques are starting to look limited.
In most data-driven projects, it is not necessary to use original data and put consumers’ privacy and sensitive information at risk. Synthesized data is privacy-compliant by design yet retains more than 95% of the prediction accuracy.
In less than two years, the team of Cambridge-educated PhDs, machine-learning engineers and technologists has created a product ready for enterprise use and gained significant traction in the UK and Europe. Insurance companies, banks and financial services firms are likely to be the ‘power users’ of this technology, given the volume and nature of the sensitive and private data they hold. Large and complex organisational structures mean it can take up to four months to get hold of data for any given project. Challenged by new fintech and insurtech offerings, large insurers and banks are looking for new technology that can help them compete with nimble startups.
Synthesized was selected to be part of the Google for Startups 2019 Residency programme, which spotlights startups using machine learning technology to make a positive social impact.
Marta Krupinska, Head of Google for Startups UK said: “At Google for Startups we are committed to supporting companies who are using tech for good. Synthesized is an excellent example of a company doing just this; unleashing data’s full potential whilst also protecting people’s privacy. We are excited to watch their journey as one of the 2019 Residency cohort and are looking forward to continuing to work alongside them as they grow.”
Nicolai notes: “We want data to be an asset for companies without compromising the trust that consumers have placed in them. The future of data is the preservation of knowledge and insight, and we are helping companies do that safely.”
In a move that will revolutionise data-sharing security, Synthesized released in April its deep-learning-driven system for generating synthetic data that mimics the structure and trends of original datasets without disclosing information about any individual.