China’s leading researchers and policymakers are sounding a familiar alarm: generative AI that can swap, fabricate and animate faces is moving rapidly from crude image edits to realistic three‑dimensional reconstructions, and regulation and defensive technologies must catch up.
Dong Jing, a researcher at the Institute of Automation of the Chinese Academy of Sciences and an IEEE Asia‑Pacific executive committee member, describes the technical shift as a move “from photo retouching to sculpting” — from two‑dimensional pixel tinkering to 3D models that reconstruct a face’s bone, muscle and lighting relationships and can be driven to produce lifelike expressions and movements.
The technical advance brings clear benefits: greater realism, stability across viewpoints, and far more fine‑grained control over identity, expression, pose and illumination. Those same attributes, Dong warns, also make synthetic content more convincing and therefore more dangerous as a vehicle for fraud, disinformation and privacy invasion.
Dong’s group runs a dual programme: one team develops generative techniques that can “make people”, while another builds detection and multimedia forensics tools. She argues the two should be pitted against each other internally, in a continuous attack-and-defend loop that hardens models and exposes vulnerabilities, and says her lab practices exactly that in order to understand both sides of the problem.
On the defensive side, detection remains feasible but technically challenging. Dong explains that even the most realistic synthetic images and videos leave subtle physical, geometric and temporal traces — inconsistent lighting, microstructure distortions or timing glitches — that algorithms can learn to spot. But these are weak signals, and as generative models close the gap, detectors face a persistent tug‑of‑war.
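To make the weak-signal idea concrete, here is a minimal, illustrative sketch in Python: it scores a clip by how unstable its high-frequency image content is from frame to frame, a toy stand-in for the temporal glitches Dong describes. Every function name and the decision threshold are assumptions for illustration; this is not Dong’s lab’s method, and real forensic detectors are trained models rather than hand-set rules.

```python
# Illustrative sketch only: a toy "weak signal" detector that flags
# temporal instability in fine image texture, one of the artifact
# classes Dong mentions. The threshold is an arbitrary placeholder.
import numpy as np

def high_freq_energy(frame: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy above a radial frequency cutoff."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(frame))) ** 2
    h, w = frame.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Normalised radial distance from the centre of the spectrum.
    radius = np.hypot((yy - h / 2) / h, (xx - w / 2) / w)
    total = spectrum.sum() + 1e-12
    return spectrum[radius > cutoff].sum() / total

def temporal_instability(frames: list[np.ndarray]) -> float:
    """Variance of the high-frequency statistic across frames.
    Synthetic videos often show less stable fine texture over time."""
    energies = np.array([high_freq_energy(f) for f in frames])
    return float(np.var(energies))

def looks_synthetic(frames: list[np.ndarray], threshold: float = 1e-4) -> bool:
    # Placeholder decision rule; a deployed system would learn this
    # boundary from labelled real and synthetic footage.
    return temporal_instability(frames) > threshold
```

In practice, hand-crafted statistics like this serve as features feeding a learned classifier rather than as decision rules on their own, which is part of why the tug-of-war Dong describes never fully resolves.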
Practical detection is further complicated by the diversity of real‑world content: short, rapidly cut clips, noisy platforms with heavy compression, and the ease with which bad actors can iterate new generation techniques. Dong says detection often lags briefly behind novel generators, but so far defenders still hold a marginal advantage; over the long run she expects a shift from pure attack/defence to active provenance and compliance systems.
Her prescription is technical and institutional. At the source, commercial generative models should embed immutable provenance: digital watermarks, model fingerprints and generation logs that record who generated a file, when and with which model. During distribution, platforms should adopt unified verification APIs and flag AI‑generated content automatically. And crucially, legal frameworks must define responsibilities so provenance can serve as part of an evidentiary chain.
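As a rough sketch of what source-level provenance could look like in code, the Python below ties a file hash to a model fingerprint, a generator identity and a timestamp, then signs the record. The field names and the HMAC-based signing are assumptions for illustration; a production scheme would use asymmetric signatures under a standardised manifest format rather than a shared key.

```python
# Illustrative sketch of a provenance record as Dong describes it:
# who generated a file, when, and with which model. Field names and
# HMAC signing are assumptions for demonstration purposes only.
import hashlib, hmac, json, time

def make_provenance_record(content: bytes, model_id: str,
                           generator_id: str, signing_key: bytes) -> dict:
    record = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "model_fingerprint": model_id,     # e.g. a hash of model weights
        "generated_by": generator_id,      # account or service identity
        "generated_at": int(time.time()),  # Unix timestamp
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(signing_key, payload,
                                   hashlib.sha256).hexdigest()
    return record

def verify_provenance(record: dict, content: bytes, signing_key: bytes) -> bool:
    """Approximates what a platform's verification API might check on upload."""
    claimed = dict(record)
    signature = claimed.pop("signature", "")
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(signature, expected)
            and claimed["content_sha256"] == hashlib.sha256(content).hexdigest())
```

A signed record of this kind is also the sort of artefact that could enter an evidentiary chain, provided, as Dong argues, that the law assigns responsibility for producing and preserving it.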
Dong also discussed practical privacy measures and the limits of biometric security. Technologies such as adversarial makeup (tiny, visually subtle patterns that disrupt automated face recognition) are maturing as research tools for privacy protection, but they are not yet robust across lighting conditions, viewing angles and capture devices, and they raise regulatory and security trade-offs. She forecasts that sensitive applications such as banking will move to multi-modal authentication, combining face, fingerprint, voice, device and behavioural signals, rather than relying on any single biometric.
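A minimal sketch of such score-level fusion, with invented weights and an invented threshold, shows why spoofing a single factor is not enough to pass:

```python
# Illustrative sketch of the multi-modal authentication Dong forecasts:
# several weak signals fused into one decision instead of a single
# biometric. The weights and threshold are invented for the example.
FACTOR_WEIGHTS = {
    "face": 0.35, "fingerprint": 0.25, "voice": 0.15,
    "device": 0.15, "behaviour": 0.10,
}

def authenticate(scores: dict[str, float], threshold: float = 0.75) -> bool:
    """scores: per-factor match confidences in [0, 1]. Missing factors
    contribute nothing, so a single spoofed biometric (e.g. a
    deepfaked face) cannot clear the threshold on its own."""
    fused = sum(FACTOR_WEIGHTS[k] * v for k, v in scores.items()
                if k in FACTOR_WEIGHTS)
    return fused >= threshold

# A perfect face match alone fails; several factors together pass.
assert not authenticate({"face": 1.0})
assert authenticate({"face": 0.9, "fingerprint": 0.95,
                     "device": 1.0, "voice": 0.8})
```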
The interview arrives as Chinese policymakers, including a CPPCC member who warned about “deepfake‑driven fake information everywhere,” press for tougher rules. Dong’s views map onto a broader policy impulse in Beijing: combine technical standards, platform accountability and legal enforcement rather than rely exclusively on after‑the‑fact takedowns.
Finally, Dong raised a human dimension that matters to the industry’s future. She argues that women in AI bring crucial strengths, including attention to detail, collaboration and ethical sensitivity, and encourages more female researchers to enter the field with long-term commitment rather than short bursts of interest.
For international audiences, the takeaway is twofold. First, generative AI’s technical trajectory makes synthetic people more believable and harder to police, elevating the stakes for elections, financial fraud and personal privacy worldwide. Second, responses that tie provenance to platforms and law, and that combine detection with design‑level safeguards, are likely to become the global template for managing synthetic media’s risks.
