What Distillation and Model Weight Theft Means for the Future of U.S. AI Innovation & Policy
Last week, ARI held a panel with technical and policy experts on AI model piracy – the theft of AI intellectual property (IP) – and specifically on distillation, a technique by which the performance and capabilities of advanced models can be replicated in smaller models at lower cost.
Distillation has gotten a lot of attention ever since DeepSeek’s breakthrough earlier this year wiped roughly $1 trillion from the S&P 500. While it is still unconfirmed whether Chinese AI startup DeepSeek actually used distillation on U.S. developers’ proprietary AI models, some suggest it is likely.
So how should policymakers and the private sector react to this new threat? Our panel of experts weighed in with a few big ideas on how to protect the U.S.’s edge in AI development. Here are our top three takeaways.
Background: What is distillation and how might DeepSeek have used it?
First, some quick background. Knowledge distillation is a machine learning technique whereby select knowledge of a large “teacher” model is transferred to a more compact “student” model. This can be integral for developing real-world applications of AI – large teacher models can be impractical due to operational costs, hardware limitations, and potentially slow inference times, among other things.
Broadly, there are two ways to achieve distillation. White-box distillation requires complete access to a teacher model’s parameters – the numerical values that collectively encode its learned knowledge and capabilities – enabling direct knowledge transfer by aligning the student model’s internal representations with those of the teacher. Black-box distillation uses only the teacher model’s inputs and outputs – for example, collecting input-output pairs through an interface such as an API, and then using those responses to fine-tune a student model. Importantly, when a teacher model’s parameters are inaccessible, black-box distillation is the only option. Major AI developers such as OpenAI may use both white-box and black-box distillation in the development of their commercial AI applications.
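To make the white-box case concrete, below is a minimal sketch of classic logit-matching distillation in PyTorch, assuming full access to the teacher’s pre-softmax outputs; the models, dataloader, and optimizer are placeholders, not any developer’s actual pipeline.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss: push the student's output distribution toward
    the teacher's temperature-softened distribution."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student, scaled by T^2 so the
    # gradient magnitude stays consistent across temperature settings
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature ** 2

# Hypothetical training step: the teacher stays frozen, only the student learns.
# for batch in dataloader:                      # placeholder dataloader
#     with torch.no_grad():
#         teacher_logits = teacher(batch)       # requires white-box access
#     loss = distillation_loss(student(batch), teacher_logits)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```

Representation-alignment variants go further and also match intermediate hidden states, which likewise requires white-box access.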
If DeepSeek distilled from proprietary U.S. AI models to develop its breakthrough R1 model, it would presumably have done so through black-box distillation – DeepSeek would not have had access to those teacher models’ parameters. In late 2024, Microsoft security researchers reportedly detected data exfiltration patterns through OpenAI developer API accounts that may be linked to DeepSeek. If true, this would mean that DeepSeek systematically queried OpenAI’s models via their developer API, collecting large volumes of outputs to serve as training data for R1.
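A black-box pipeline, by contrast, needs nothing more than query access. The sketch below shows the general shape of such data collection – the endpoint URL, API key, response format, and file names are hypothetical placeholders, and this illustrates the technique in general, not a reconstruction of anything DeepSeek actually did.

```python
import json
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical endpoint
API_KEY = "sk-placeholder"                               # hypothetical key

def query_teacher(prompt: str) -> str:
    """Send one prompt to the black-box teacher model and return its reply."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "teacher-model",
              "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Collect input-output pairs to serve as fine-tuning data for a student model.
with open("prompts.txt") as f, open("distill_data.jsonl", "w") as out:
    for line in f:
        prompt = line.strip()
        if not prompt:
            continue
        record = {"prompt": prompt, "completion": query_teacher(prompt)}
        out.write(json.dumps(record) + "\n")
```

Done at scale – millions of queries rather than a handful – this is exactly the kind of traffic pattern that API-level monitoring is meant to catch.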
Recommendation 1: Export Controls & Remote Access Controls
Contrary to what some have suggested, DeepSeek’s success does not show that the U.S. export controls regime on AI chips has failed or is necessarily misguided. DeepSeek’s own CEO Liang Wenfeng has stated that, “The problem [DeepSeek is] facing has never been funding, but the export control on advanced chips.”
As RAND Information Scientist Lennart Heim noted on our panel, U.S. export controls on AI chips were never expected to show their impact immediately upon implementation. Semiconductor technologies operate on extended development and deployment timelines, and some regulatory effects manifest gradually through industry adaptation cycles rather than as immediate disruptions. Given that the U.S. export controls regime on advanced AI chips was only robustly implemented in October 2023, its most substantial effects on China’s AI development efforts may only now be starting to materialize.
What’s more, the shelf life of semiconductors used for AI training is around 3 to 4 years, meaning that chips acquired by Chinese AI developers prior to the October 2023 export controls update could still be serviceable for another few years.
The true impact of the U.S. export controls regime on AI chips may only be observed in the next phases of AI development. These will continue to involve the prevailing paradigm of AI chips optimized for model training – such chips are the current focus of U.S. export control efforts. But there will also be a growing focus on chips optimized for inference compute, which are needed to deploy AI capabilities at scale. Leading-edge AI chip designers – such as Nvidia, Groq, AMD, Intel, and Cerebras – are increasingly developing specialized chips for inference, such as Nvidia’s new Blackwell series. These inference-specialized chips will likely be at the center of future U.S. export control efforts.
The implication for policymakers is that there are compelling reasons to maintain the U.S.’s export controls regime on AI chips, and also to ensure that the Department of Commerce’s Bureau of Industry and Security can effectively enforce these export controls. Moreover, as the focus of the global AI race shifts to include access and control over AI models themselves, policymakers should consider optionality-enabling legislation such as the ENFORCE Act and the Remote Access Security Act.
Recommendation 2: (Cyber)security, Know-Your-Customer, & Developer Transparency
The specter of distillation and the success of DeepSeek have also highlighted the importance of cybersecurity, controlled access, and transparency in cutting-edge AI development. As CNAS Senior Fellow Janet Egan pointed out on our panel, one way for AI developers to address the risk of AI IP exfiltration is through robust Know-Your-Customer (KYC) practices, which facilitate usage monitoring and traceability. Additionally, given the geopolitical stakes of the global AI race, it is vital that AI developers and the U.S. Intelligence Community collaborate to defend against foreign espionage and the theft of American AI IP. In the long run, it has been argued, AI labs could become targets of foreign state actors seeking to obtain proprietary model parameters and other valuable AI IP, necessitating public-private partnerships to establish enhanced security protections.
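As a simple illustration of what KYC-enabled usage monitoring might look like in practice, the sketch below flags accounts whose daily query volume spikes far above their own historical baseline – one crude signal of bulk output harvesting. The threshold, log format, and account names are assumptions, not any provider’s actual detection logic.

```python
from statistics import mean, stdev

def flag_exfiltration_suspects(daily_counts, z_threshold=4.0, min_history=7):
    """daily_counts maps account_id -> chronological list of queries per day.
    Flags accounts whose latest day is an extreme outlier vs. their baseline."""
    suspects = []
    for account, counts in daily_counts.items():
        if len(counts) <= min_history:
            continue  # not enough history to establish a baseline
        baseline, latest = counts[:-1], counts[-1]
        mu, sigma = mean(baseline), stdev(baseline)
        sigma = sigma or 1.0  # avoid divide-by-zero on perfectly flat usage
        if (latest - mu) / sigma > z_threshold:
            suspects.append((account, latest, round(mu, 1)))
    return suspects

# Hypothetical logs: "acct_42" jumps from ~100 queries/day to 50,000
logs = {"acct_42": [95, 110, 102, 98, 105, 99, 101, 50_000]}
print(flag_exfiltration_suspects(logs))  # [('acct_42', 50000, 101.4)]
```

Real deployments would combine many such signals – query volume, content patterns, account provenance – but even this baseline check shows why identity-linked usage records matter.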
Policymakers can help address these issues through several avenues. They can encourage or mandate that AI developers, compute providers, and hyperscaler cloud service providers institute KYC practices, similar to what the Biden administration’s Executive Order 14110 sought to do. Policymakers can also push for transparency disclosures from leading AI developers about their security policies and protection methods for sensitive AI IP, which could motivate robust cybersecurity practices, as noted on our panel by AI safety researcher Stephen Casper.
In a similar vein, policymakers could leverage the federal government’s cybersecurity and intelligence ecosystems – such as the Cybersecurity and Infrastructure Security Agency’s Joint Cyber Defense Collaborative – to facilitate public-private security partnerships with AI developers.
Lastly, given the national security risks posed by foreign state cyber threats, policymakers can also work to ensure that security-by-design practices are built into planned critical domestic AI infrastructure.
Recommendation 3: Permitting Reform, Infrastructure Expansion, & Manufacturing Capability
While distillation and AI IP theft are serious issues in their own right, they also remind us that in the long run, whether the U.S. sustains its edge in AI development will strongly depend on its ability to build out AI infrastructure at speed. A U.S. AI victory will likely require streamlined data center build-outs, massively expanded energy production, and domestic semiconductor manufacturing capability. For instance, it is estimated that by around 2027, U.S. AI developers will require roughly 50 GW of additional power capacity for their development and deployment efforts – at an average household draw of about 1.25 kW, that is the demand of roughly 40 million U.S. homes.
Policymakers have a critical role to play in facilitating U.S. AI infrastructure expansion. Permitting, especially for energy infrastructure, remains a critical bottleneck, but it can be addressed through legislation such as the Energy Permitting Reform Act of 2024. The CHIPS & Science Act has laid the groundwork for bringing semiconductor manufacturing back to the U.S., and policymakers can look to further sustain this effort. Finally, policymakers can explore how to incentivize domestic AI data center development.
The Bottom Line
Ultimately, the challenges of distillation and AI IP theft, brought to light by DeepSeek’s debut, are a wake-up call for both policymakers and industry leaders. To safeguard its technological edge, the U.S. must act decisively through a combination of export controls, enhanced cybersecurity measures, and accelerated infrastructure development. As global competition for AI development intensifies, maintaining an edge will require not just defensive strategies but also proactive investments in innovation and global leadership on issues like standards development.
With pirates taking to the high seas of AI development, U.S. policymakers and private sector leaders must chart a bold course for AI policy or risk seeing the U.S. lead hijacked by adversaries.