November 19, 2020

Synthesized is Partnering with the Financial Regulator in the UK to Build Better Fraud Detection Models

The Challenge - Building a Better Fraud Model

Synthesized, following a successful collaboration with the FCA and a leading fraud prevention vendor, is excited to have contributed to an initiative aimed at detecting and preventing the fraud and scams exacerbated by the Covid pandemic.

The objective of the collaboration was to build a better transactional bank fraud model using synthesized data. Synthesized’s cutting-edge AI data synthesis technology was uniquely able to transform the original fraud data set. The result is a safe-to-share synthetic data set for collaborative use by participants in the Digital Sandbox Pilot, jointly launched by the FCA and the City of London Corporation.

An Urgent Yet Complex Problem

Transactional bank fraud is a notoriously difficult and complex problem to address. Collaboration, even within the same company or government body, is often slow or impossible because the underlying records are so tightly controlled. Given the urgency of this growing problem, it is imperative to bring faster, safer and more collaborative data projects to market quickly.

Why Synthesized?

Synthesized is well equipped to handle the leading fraud prevention vendor’s high-volume, complex bank transaction data, transforming it in real time into an accurate, useful and highly representative synthetic data asset for use by participants in the Digital Sandbox Pilot. To help ensure the quality of the output, Derek Snow, a Research Associate from The Alan Turing Institute, independently assisted the FCA in assessing the data being produced. Among other things, Derek evaluated and tested the feature correlations, joint distributions and predictability of the data. Collaborating with the leading fraud prevention vendor on the original dataset, Synthesized successfully leveraged the platform’s generative adversarial modelling, differential privacy and other techniques. The original dataset contained 5 million rows and 724 columns representing real bank digital payments.
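To make the idea of checking feature correlations concrete, here is a minimal sketch of one such comparison: compute the pairwise Pearson correlations in an original table and in a synthetic one, then report the largest gap between them. This is an illustration only, not the evaluation method used in the project. The toy columns (amount, fee, hour), the helper names and the use of a second random sample as a stand-in for synthesizer output are all assumptions made for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every (column_a, column_b) pair to its Pearson correlation."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


def max_correlation_gap(original, synthetic):
    """Largest absolute difference between matching correlation entries."""
    orig_corr = correlation_matrix(original)
    synth_corr = correlation_matrix(synthetic)
    return max(abs(orig_corr[key] - synth_corr[key]) for key in orig_corr)


def make_toy_table(seed, rows=2000):
    """Hypothetical payments table: fee depends on amount, hour is independent."""
    rng = random.Random(seed)
    amount = [rng.gauss(100.0, 30.0) for _ in range(rows)]
    fee = [0.02 * a + rng.gauss(0.0, 0.5) for a in amount]
    hour = [rng.uniform(0, 24) for _ in range(rows)]
    return {"amount": amount, "fee": fee, "hour": hour}


original = make_toy_table(seed=1)
# Stand-in for synthetic data: a fresh sample from the same toy process.
synthetic = make_toy_table(seed=2)

gap = max_correlation_gap(original, synthetic)
print(f"largest absolute correlation gap: {gap:.3f}")
```

A small gap suggests the synthetic table preserves the pairwise linear relationships of the original. It says nothing about higher-order structure, which is why joint distributions and predictability would need to be checked separately.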

Synthesized automatically derived the ‘deep’ statistical properties of the original dataset in order to create a highly representative version for collaborative purposes. The enhanced data project produced by Synthesized has two key advantages:

  • Cyber Safety by default - the synthesis engine creates an entirely new data project that, while representative of the original, cannot be traced back to it. It is therefore not vulnerable to linkage attacks and is safe for use by participants in the Digital Sandbox Pilot.
  • High Quality Data - as vetted by a researcher from the Turing Institute, the new data project is highly accurate and retains the same properties as the original. This means all of the fraud solutions and tactics tested will be relevant to the original, and to real-world transactional fraud.

Simon Swan, the lead machine learning (ML) engineer from Synthesized for this project, noted, “Datasets of this size and complexity offer interesting and insightful challenges for Synthesized’s core engine. It is always exciting to see just how powerful our platform is when we are able to successfully recreate new, highly detailed data at such scale, as was the case with this collaboration.”

Nicolai Baldin, CEO, agrees, saying that “Synthesized was delighted to be involved as development partner on this game-changing fraud initiative. It is further validation of both our product leadership and unique data capabilities in the data privacy & secure collaboration space, and our depth of in-house expertise employing the latest ML techniques within our platform.”

For more information on the project, please visit the FCA Digital Sandbox.

For more information on the Synthesized Data Platform, please request a product demo.