03.02.2025
Humans seem innately wired to detect even the slightest imperfections in human likeness, a phenomenon known as the ‘uncanny valley.’ Simply put, we trust our own perception and intuition to guide us in deciphering what is real and what is not.
This is why early deepfakes—glitchy and riddled with errors like mismatched lip-syncs or jerky movements—were easily dismissed as harmless novelties.
Much has changed since then.
Today, advances in synthetic media, especially generative AI, have made it cheaper, easier, and faster to create or manipulate digital content. For as little as $1.33, anyone can create a convincing deepfake.
Pair that with the speed, reach, and sheer scale of social media, and the stage is set for bad actors to exploit these tools to spread misinformation, disinformation, and malinformation.
Take British engineering giant Arup, which was scammed out of $25 million after fraudsters used a digitally cloned video of a senior manager to authorise a financial transfer in Hong Kong. Or “Anne,” a woman in France who lost her entire life savings—$855,000—to a romance scam involving an AI-generated Brad Pitt.
In such a world of ‘counterfeit people’, as philosopher Daniel Dennett put it, who can we trust online? Some warn that we are heading towards a future where shared reality no longer exists, and societal confusion runs rampant over which information sources are reliable.
So, how do we uphold the integrity of digital content? And are we adequately equipped to confront the rise of malicious AI-generated fakes?
In response, regulations, detection tools, and other countermeasures have been introduced:
The EU’s Digital Services Act (DSA) introduces measures to combat malicious content, designating certain entities as ‘trusted flaggers’. These flaggers are responsible for identifying potentially illegal content and notifying online platforms. Once flagged, platforms must act swiftly to remove any objectively unlawful material.
In the US, several states have enacted laws targeting the misuse of deepfakes, particularly in cases of non-consensual pornography and election interference.
China has implemented strict rules requiring deepfake content to be clearly labelled, so that users can distinguish between real and synthetic media.
Meanwhile, Singapore has banned the use of deepfakes during election periods to prevent interference and manipulation of public opinion.
A context-based approach evaluates deepfakes in light of how and why they are being used. This framework helps regulators, platforms, and fact-checkers identify and address the most urgent threats and allocate resources effectively. It turns on three questions (a rough scoring sketch follows after them):

What kind of harm could the deepfake cause? Examples include the kinds of cases above: financial fraud, romance scams, non-consensual pornography, and election interference.

Who is behind the deepfake, and what are their goals?

How damaging could the deepfake be? Assessing the potential damage helps prioritise responses.
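To make the framework concrete, here is a minimal sketch of how such a triage could be scored in code. The harm categories, weights, and reach buckets are illustrative assumptions for this sketch, not an established standard or any regulator's actual methodology.

```python
from dataclasses import dataclass

# Illustrative weights; assumptions for this sketch, not an established standard.
HARM_WEIGHTS = {
    "financial_fraud": 3,
    "non_consensual_pornography": 3,
    "election_interference": 3,
    "satire_or_parody": 0,
}

INTENT_WEIGHTS = {
    "malicious": 2,   # e.g. fraudsters, influence operations
    "unknown": 1,
    "benign": 0,      # e.g. declared entertainment
}

@dataclass
class DeepfakeReport:
    harm_type: str
    actor_intent: str
    potential_reach: int  # e.g. audience size of the account posting it

def triage_score(report: DeepfakeReport) -> int:
    """Combine the three framework questions into one priority score."""
    harm = HARM_WEIGHTS.get(report.harm_type, 1)
    intent = INTENT_WEIGHTS.get(report.actor_intent, 1)
    # Potential damage scales with reach; bucket it coarsely.
    if report.potential_reach > 100_000:
        reach = 2
    elif report.potential_reach > 1_000:
        reach = 1
    else:
        reach = 0
    return harm + intent + reach

report = DeepfakeReport("financial_fraud", "malicious", potential_reach=250_000)
print(triage_score(report))  # 7 -> flag for urgent review under these assumed weights
```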
As the threat of deepfakes continues to intensify, so do efforts to develop new detection methods as a defensive strategy.
Deepfake detectors play a crucial role in the “trusted flagging” of harmful content. These tools analyse images, audio, and video for signs of manipulation—like lighting inconsistencies or unnatural facial movements—that may go unnoticed by the human eye.
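For a sense of how such tools are wired up, here is a minimal sketch of the common inference pattern: a frame passes through a binary real-versus-fake classifier that outputs a manipulation score. The ResNet backbone below is an untrained placeholder standing in for an actual trained detector, and the file name is hypothetical; only the pipeline shape is the point.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Untrained placeholder backbone; a real deployment would load a detector
# trained on labelled real/fake media. Only the inference pattern matters here.
model = models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # logits: [real, fake]
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def fake_probability(image_path: str) -> float:
    """Return the model's probability that a frame has been manipulated."""
    frame = Image.open(image_path).convert("RGB")
    batch = preprocess(frame).unsqueeze(0)  # shape: (1, 3, 224, 224)
    with torch.no_grad():
        logits = model(batch)
    return torch.softmax(logits, dim=1)[0, 1].item()

print(f"P(fake) = {fake_probability('suspect_frame.jpg'):.2f}")  # hypothetical file
```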
While detection tools evolve, bad actors are simultaneously refining their models, meaning deepfake attacks are evolving in lockstep with technologies designed to detect them. This necessitates continuous and rigorous testing at scale to ensure deepfake detector efficacy.
At Resaro, we offer assurance services and tools for ensuring content integrity. Check out our article, ‘The Generalisability Gap - Evaluating Deepfake Detectors Across Domains’, to learn how we used the DeepAction Dataset to assess the generalisability of open-source deepfake detectors across datasets and generation methods.
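For intuition on what such an assessment involves, here is a minimal sketch of a cross-dataset evaluation harness: one detector is scored on several held-out datasets and the per-dataset AUCs are compared. The detector function and dataset names are placeholders, not Resaro's tooling or the DeepAction Dataset itself.

```python
import random
from sklearn.metrics import roc_auc_score

def detector_score(sample: dict) -> float:
    """Placeholder for a trained detector's P(fake); swap in a real model here."""
    return random.random()  # random scores, so every AUC below will hover near 0.5

# Hypothetical held-out splits: lists of {"media": ..., "label": 0 = real, 1 = fake}.
datasets = {
    "same_domain_test": [{"media": None, "label": random.randint(0, 1)} for _ in range(200)],
    "unseen_generator": [{"media": None, "label": random.randint(0, 1)} for _ in range(200)],
    "unseen_domain":    [{"media": None, "label": random.randint(0, 1)} for _ in range(200)],
}

# With a real detector, a drop in AUC on the unseen splits relative to the
# same-domain split is the generalisability gap this harness would expose.
for name, samples in datasets.items():
    labels = [s["label"] for s in samples]
    scores = [detector_score(s) for s in samples]
    print(f"{name}: AUC = {roc_auc_score(labels, scores):.3f}")
```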
Even with robust detection tools, deepfakes will inevitably have a brief life online before being flagged and removed. Technology alone cannot solve the problem.
Public education is crucial. People need to understand the dangers of deepfakes and know how to protect themselves in a world where what they see and hear online may not be real.
Protect your digital identity and be cautious of red flags such as urgent money requests or sudden changes in someone's tone, language, or style of communication.
When you consume a piece of content, verify before you trust it: question the sources, confirm the facts, and make sure everything adds up.
Stay sharp by following trusted news sources, cybersecurity blogs, and AI experts. The Political Deepfakes Incidents Database is one effort tracking and exposing these threats.
Perhaps in a twist of irony, as deepfakes spread, so too will AI-generated warnings—one artificial voice calling out another.