Ferrari targeted by a deepfake: a close call

Ferrari narrowly escapes a multi-million-euro fraud

In the summer of 2024, a senior Ferrari executive received a series of WhatsApp messages, followed by a phone call, from someone presenting himself as Benedetto Vigna, CEO of the Maranello-based automaker. The voice was convincing: same intonation, same accent, same communication style. The “CEO” mentioned a confidential acquisition requiring an urgent fund transfer and asked the executive to make the necessary arrangements immediately, stressing the strictly confidential nature of the operation.

The fraud could have succeeded. What saved Ferrari, according to the account published by Bloomberg News and picked up by Reuters, was a simple question. The executive, made suspicious by certain unusual details, asked a personal question about a recent conversation, concerning a book, between himself and the real Benedetto Vigna. Only the genuine CEO could have known the answer. The impersonator, unable to reply, hung up. The transfer never happened. This quick check spared Ferrari a fraud that could have cost it millions.

The technology behind the attack: how voice cloning works

From text to sound: speech synthesis architectures

AI-powered voice cloning relies on text-to-speech (TTS) models trained on recordings of a target voice. Progress in this field since 2020 has been spectacular. Microsoft’s VALL-E model (introduced in January 2023) can clone a voice from a sample of just three seconds of audio, reportedly with enough fidelity to deceive many human listeners.

The process involves two main steps. First, the system extracts a vocal “embedding”, a multidimensional mathematical representation of the target voice’s acoustic characteristics (fundamental frequency, formants, timbre, prosody). Second, this embedding is used to generate new utterances in the target voice from text input. The most recent models make the result even more convincing.
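The idea of an embedding can be illustrated with a toy comparison. The sketch below assumes that each voice has already been reduced to a vector of numbers and measures how close two voices are with cosine similarity, a common way of comparing such vectors. The four-dimensional vectors and their values are invented for illustration; real embeddings have far more dimensions and come from a trained model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embeddings (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
target_voice = [0.90, 0.10, 0.40, 0.20]
cloned_voice = [0.85, 0.15, 0.38, 0.22]
other_speaker = [0.10, 0.90, 0.20, 0.70]

print(f"target vs clone: {cosine_similarity(target_voice, cloned_voice):.3f}")
print(f"target vs other: {cosine_similarity(target_voice, other_speaker):.3f}")
```

A successful clone sits very close to the target in this space, which is why a familiar-sounding voice alone is weak evidence of identity.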

The accessibility of tools: a worrying democratisation

Many platforms offer voice cloning services to the general public, with entry-level subscriptions priced at around $5 to $22 per month. Some platforms have been mired in controversy after users employed them to create voice deepfakes of American political figures.

The material needed to build a vocal profile of a target is often freely available. For the CEO of a listed company like Benedetto Vigna, television interviews, speeches at public events, and investor presentations are all available online and sufficient to train a convincing cloning model.

Voice Conversion: real-time cloning

Another technique, known as Voice Conversion (VC), transforms an attacker’s voice in real time to give it the acoustic characteristics of the target voice. The fraudster speaks normally into their microphone; VC software processes the audio stream in a few tens of milliseconds and sends the transformed audio to the recipient.
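To see where a figure of a few tens of milliseconds can come from, here is a rough latency estimate for frame-based audio processing. It is a sketch under stated assumptions: the sample rate, frame size, lookahead and inference time are illustrative values, not measurements of any real product, and network delay is left out.

```python
def frame_duration_ms(frame_samples: int, sample_rate_hz: int) -> float:
    """Length of one audio frame in milliseconds."""
    return 1000.0 * frame_samples / sample_rate_hz


def added_latency_ms(frame_samples: int, sample_rate_hz: int,
                     inference_ms: float, lookahead_frames: int = 0) -> float:
    """Delay added by frame-based processing.

    The software waits for one full frame (plus any lookahead frames)
    before running the model, then spends inference_ms computing the output.
    """
    buffered_frames = 1 + lookahead_frames
    return buffered_frames * frame_duration_ms(frame_samples, sample_rate_hz) + inference_ms


# Illustrative values only.
latency = added_latency_ms(frame_samples=320, sample_rate_hz=16_000,
                           inference_ms=10.0, lookahead_frames=1)
print(f"{latency:.0f} ms")  # 2 frames of 20 ms + 10 ms inference = 50 ms
```

A delay of this order is hard to notice in a phone conversation, which is what makes real-time conversion usable in a live call.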

Similar cases: Ferrari is not an isolated incident

Hong Kong, February 2024: $25 million lost

The most widely reported case of deepfake fraud in a corporate setting came to light in February 2024 in Hong Kong. An employee of a multinational, later identified as the British engineering group Arup, was invited to a video conference call with what he believed to be his CFO and several colleagues. According to Hong Kong police, every other participant on the call was a deepfake. The employee carried out 15 transfers to 5 different bank accounts, totalling HK$200 million (approximately US$25 million).

The 2019 precedent: the first documented voice deepfake fraud

In 2019, The Wall Street Journal documented the first known case of voice deepfake fraud in a corporate context. The managing director of a subsidiary transferred $243,000 at the request of a fake “CEO” of the parent company, whose voice had been cloned with sufficient fidelity to deceive him.

Corporate security protocols: best practices

Out-of-band verification

The first line of defence against voice cloning fraud is out-of-band verification: any urgent request involving financial transfers, even if it appears to come from a recognisable senior executive, must be confirmed through a different communication channel from the one used for the initial request. If the request comes by phone, verification should be done by email to a known address, via a secure corporate messaging application, or by a second call to a number registered in official contacts, and never to a number provided by the caller.
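The rule can be written down as a simple check. The sketch below is a minimal illustration, not a reference implementation: the directory, the role name, the contact details and the `Verification` fields are all hypothetical. It accepts a verification only if the employee initiated it and it went to a contact registered in advance for that channel.

```python
from dataclasses import dataclass

# Hypothetical directory of contact details recorded before any incident.
OFFICIAL_CONTACTS = {
    "ceo": {
        "phone": "+00 0000 0001",
        "email": "ceo@example.com",
        "corporate_chat": "ceo.handle",
    },
}


@dataclass(frozen=True)
class Verification:
    channel: str                 # "phone", "email" or "corporate_chat"
    contact_used: str            # number, address or handle actually contacted
    initiated_by_employee: bool  # False if it was a reply to the incoming request


def is_valid_out_of_band_check(requester_role: str, verification: Verification) -> bool:
    """True only if the employee reached out to a pre-registered contact."""
    registered = OFFICIAL_CONTACTS.get(requester_role, {})
    return (
        verification.initiated_by_employee
        and registered.get(verification.channel) == verification.contact_used
    )


# Calling back the number supplied by the caller is rejected.
callback = Verification("phone", "+00 0000 9999", initiated_by_employee=True)
print(is_valid_out_of_band_check("ceo", callback))  # False

# Writing to the registered email address is accepted.
email = Verification("email", "ceo@example.com", initiated_by_employee=True)
print(is_valid_out_of_band_check("ceo", email))  # True
```

The design choice is that the caller never supplies the contact details: they always come from the directory.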

Shared verification passwords

The technique used by Ferrari’s executive, asking a question that only the real CEO could answer, is a form of “human password.” Companies have begun implementing pre-agreed “safe words” between close colleagues: a word or phrase agreed in advance, which either party can ask the other to say in order to confirm their identity during an unusual communication. Simple, effective, and requiring no special technology.
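A safe word works with no tooling at all. For teams that nonetheless want to record one in an internal tool, the following sketch shows one way to avoid storing the phrase in plain text. It uses Python's standard `hashlib.pbkdf2_hmac` and `hmac.compare_digest`; the function names, the normalisation rules and the iteration count are assumptions made for this example.

```python
import hashlib
import hmac
import os
import unicodedata


def _normalise(phrase: str) -> bytes:
    """Ignore case and spacing differences so a spoken phrase can be typed loosely."""
    text = unicodedata.normalize("NFKC", phrase).casefold()
    return " ".join(text.split()).encode("utf-8")


def enrol_safe_word(phrase: str, salt: bytes) -> bytes:
    """Derive a digest to store in place of the phrase itself."""
    return hashlib.pbkdf2_hmac("sha256", _normalise(phrase), salt, 100_000)


def check_safe_word(spoken: str, salt: bytes, stored: bytes) -> bool:
    """Compare in constant time to avoid leaking how close a guess was."""
    return hmac.compare_digest(enrol_safe_word(spoken, salt), stored)


# Agreed in person, in advance; the phrase here is only an example.
salt = os.urandom(16)
stored = enrol_safe_word("Blue  Heron ", salt)

print(check_safe_word("blue heron", salt, stored))  # True
print(check_safe_word("red heron", salt, stored))   # False
```

In practice the phrase should be changed after every use, since anyone who overhears it once can replay it.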

Certifying official content as a preventive protocol

A particularly effective approach involves certifying official corporate communications at the time of their production, using a tamper-proof certification solution. This is precisely what Certiphy.io enables: a cryptographic certification infrastructure that turns the authenticity of content into a fact verifiable by independent third parties, without relying on any centralised intermediary.
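The general principle behind content certification can be shown with a cryptographic fingerprint. The sketch below is not a description of how Certiphy.io works: it is a deliberately simplified, hypothetical in-memory registry that records the SHA-256 digest of a piece of content when it is published and later checks whether a received copy matches. A real system would add digital signatures, trusted timestamps and storage that third parties can audit.

```python
import hashlib
from typing import Dict, Optional


def fingerprint(content: bytes) -> str:
    """SHA-256 digest of the content, as a hexadecimal string."""
    return hashlib.sha256(content).hexdigest()


class AuthenticityRegistry:
    """Hypothetical in-memory registry mapping fingerprints to labels."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def certify(self, content: bytes, label: str) -> str:
        """Record the content's fingerprint at publication time."""
        digest = fingerprint(content)
        self._entries[digest] = label
        return digest

    def verify(self, content: bytes) -> Optional[str]:
        """Return the label if this exact content was certified, else None."""
        return self._entries.get(fingerprint(content))


registry = AuthenticityRegistry()
official = b"Official statement: no acquisition is under way."
registry.certify(official, "press-release-001")

print(registry.verify(official))                 # press-release-001
print(registry.verify(official + b" (edited)"))  # None
```

Changing even a single byte of the content produces a different digest, so an altered or fabricated message fails the check.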

Conclusion: authenticity as a strategic asset

The Ferrari incident is not a minor technology news story. It is a wake-up call for the entire business world. In an environment where a CEO’s voice can be cloned from public interviews and where generation tools are accessible for a few dozen euros per month, trust in communications can no longer rest on perceptual familiarity alone. Authenticity is becoming a strategic asset that companies must actively manage.

Find our analyses on deepfake threats and available protection solutions in our news section. Does your company want to protect its communications against impersonation? Discover how Certiphy.io can establish an inviolable authenticity registry for your official content.