The FCA Digital sandbox, announced in May, will "provide enhanced support to innovative firms tackling challenges caused by the coronavirus (Covid-19) pandemic." A key focus is to enable financial services firms, along with a network of public and private bodies, to collaborate and readily share data to identify perpetrators of fraud.
Access to data is the single biggest requirement for any data-driven initiative. Financial firms have plenty of data, but in a heavily regulated industry like banking, making data assets available and safe to share is anything but simple.
GDPR is the regulatory kingpin, front and centre of mind for any industry working with data. Great care and diligence are required to avoid falling foul of its strictures, especially with maximum GDPR fines of €20 million or 4% of the organisation’s total worldwide annual turnover, whichever is higher. And while most financial services companies have invested heavily in GDPR compliance, there was still the challenge of how to guarantee data privacy while also meeting the requirement to collaborate.
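To make the scale of that exposure concrete, here is a minimal sketch of the upper-tier cap arithmetic. It assumes the rule in GDPR Article 83(5), where the ceiling is the higher of the two figures; the function name and the turnover values are hypothetical, chosen only for illustration.

```python
def max_gdpr_fine(annual_turnover_eur: int) -> float:
    """Upper-tier GDPR fine ceiling: the higher of EUR 20 million
    or 4% of total worldwide annual turnover (hypothetical helper)."""
    four_percent = annual_turnover_eur * 4 / 100
    return max(20_000_000, four_percent)


# Hypothetical firms, for illustration only.
small_firm_cap = max_gdpr_fine(100_000_000)      # 4% is EUR 4M, so the EUR 20M floor applies
large_bank_cap = max_gdpr_fine(2_000_000_000)    # 4% is EUR 80M, which exceeds EUR 20M

print(f"Small firm ceiling: EUR {small_firm_cap:,.0f}")
print(f"Large bank ceiling: EUR {large_bank_cap:,.0f}")
```

For a large bank the percentage term dominates, which is why the ceiling scales with the size of the organisation rather than stopping at €20 million.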
Data is a foundational feature of the FCA Digital sandbox. In this case it was a combination of transactional banking data, SME lending data and customer accounts, meaning it was full of personal information including account numbers, emails, login details and so forth. While the data was obfuscated at source using hashing, that is not a 100% guarantee of data privacy, because hashed data still has 1:1 mappings to real entities and therefore still carries a degree of risk. Indeed, in general terms hashing alone is not sufficient to convert “personal data” into unidentifiable data, so it does not fully satisfy GDPR requirements.
So how do you take sensitive banking data, in this case 5M real payment transactions, and remove all possibility of identifying personal information while preserving full utility of all the underlying patterns and knowledge the data contains?
Synthesized was founded in 2017 by Dr Nicolai Baldin during his transition from academia to working with public bodies in the UK. While pursuing his PhD in Statistics and Machine Learning at the University of Cambridge, he identified the significant gap that prevents all companies from innovating with data at speed, and created the Synthesized platform to bridge this gap.
The Synthesized data platform forms a “deep understanding” of the information represented in data. The platform harnesses this knowledge to build a sophisticated AI model that powers the fast generation of new datasets, all with the same utility and performance characteristics as the original but with one key difference: none of the original data remains, meaning Synthesized datasets are 100% GDPR (HIPAA, CCPA) compliant, fully satisfying the most demanding data privacy and security requirements.
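The general idea of synthetic data (learn a model from the original, then sample entirely new records from the model) can be illustrated with a deliberately simple toy. This is not the Synthesized platform's method or API: it assumes a single numeric column modelled as a normal distribution, and the function names and amounts are hypothetical.

```python
import random
import statistics


def fit_model(amounts: list[float]) -> tuple[float, float]:
    """'Learn' a very simple model: the mean and standard deviation."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def sample_synthetic(model: tuple[float, float], n: int, seed: int = 0) -> list[float]:
    """Draw n new values from the fitted normal distribution.
    No original row is copied; only the two fitted parameters are used."""
    mean, stdev = model
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [round(rng.gauss(mean, stdev), 2) for _ in range(n)]


# Hypothetical payment amounts, for illustration only.
original_amounts = [12.50, 48.00, 51.25, 47.80, 15.10, 49.95, 13.75, 50.40]

model = fit_model(original_amounts)
synthetic_amounts = sample_synthetic(model, n=1000)

# The synthetic sample tracks the original's summary statistics.
print(round(model[0], 2), round(statistics.mean(synthetic_amounts), 2))
```

A single normal distribution loses most of the structure in real transaction data (for example the two clusters in the amounts above, or correlations between columns) and can even produce negative amounts. A production-grade generator has to model far richer patterns, which is where the utility claims above come in.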
With the power of Synthesized, the problem of securely sharing and collaborating on data was solved, enabling the FCA and participating banks to start investigating and tackling fraud during the pandemic.
The FCA should be applauded for its early decision to tackle the fundamental blocker of all data-driven innovation initiatives: the lack of timely access to high-quality data assets.
The message to fraudsters is simple: the hiding game is over!