In a developing story that has captured the attention of the global artificial intelligence community, OpenAI — the Silicon Valley-based creator of ChatGPT and some of the world’s most advanced AI models — has accused the Chinese AI startup DeepSeek of using a controversial technique known as distillation to replicate and train on capabilities originally developed by U.S. AI research labs. This accusation was laid out in a memo sent to the U.S. House Select Committee on Strategic Competition between the United States and the Chinese Communist Party, according to a report.
What Is the Core of the Dispute?
At the center of this dispute is the accusation that DeepSeek developed methods to extract output from OpenAI’s proprietary models and then used those outputs to help train its own competing models. According to OpenAI’s memo, employees associated with DeepSeek:
- Accessed OpenAI's AI models, sometimes through obfuscated third-party networks that hid their actual origin.
- Programmatically collected outputs from those models.
- Used these outputs in ways that OpenAI claims are consistent with distillation training, a technique where a "student" model learns from the outputs of a more powerful "teacher" model.
Distillation itself isn't inherently unethical or illegal. It's a widely used process in the AI industry that allows smaller or more efficient models to learn from larger ones. But OpenAI stresses that using outputs from its models to train competing AI systems violates its terms of service, and may represent an unfair exploitation of technology that U.S. firms have spent billions of dollars developing.
OpenAI argues this kind of approach is essentially “free-riding” on innovation created by American AI labs, and could potentially weaken safeguards and competitive advantages held by U.S. firms.
What Does “Distillation” Really Mean?

To understand the accusation, it helps to know what distillation actually is. In AI research, knowledge distillation is a method where a smaller “student” model learns to mimic the behavior of a larger “teacher” model by training on its outputs rather than on the original training data. This can dramatically speed up the process of building capable models without having access to expensive or proprietary training data.
In the simplest terms:
- The teacher model is usually a large, high-performance AI system (like OpenAI's GPT-class models).
- The student model is trained to match the teacher's responses to a set of queries, effectively learning its reasoning patterns and general behavior.
- The result is a new model that can perform similarly to the teacher model but with less computational cost or smaller size.
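The teacher-student idea above can be sketched in a toy Python example. Everything here is illustrative and not drawn from either company's systems: the "teacher" is a hand-written sigmoid standing in for a large model queried through an API, and the "student" is a one-parameter model fitted to the teacher's soft outputs.

```python
# Toy sketch of knowledge distillation: a small "student" learns to
# mimic a "teacher" purely from the teacher's outputs, never from the
# teacher's own training data. All names and values are illustrative.
import math

def teacher(x):
    """Stand-in for a large model: returns a soft probability for input x."""
    return 1.0 / (1.0 + math.exp(-3.0 * x))  # a sigmoid with slope 3

# 1. Programmatically collect the teacher's outputs on a set of queries.
queries = [i / 10.0 for i in range(-20, 21)]
soft_labels = [teacher(x) for x in queries]

# 2. Train a smaller student (a one-parameter sigmoid) to match those
#    outputs by minimizing cross-entropy against the teacher's soft labels.
w = 0.0       # the student's single learnable weight
lr = 0.5      # learning rate
for _ in range(2000):
    grad = 0.0
    for x, t in zip(queries, soft_labels):
        p = 1.0 / (1.0 + math.exp(-w * x))
        grad += (p - t) * x  # d(cross-entropy)/dw for a sigmoid student
    w -= lr * grad / len(queries)

# 3. The student now approximates the teacher's behavior: w converges
#    toward the teacher's slope of 3.0, recovered from outputs alone.
print(round(w, 2))
```

The key point the sketch makes concrete: the student never sees how the teacher was built, only what it answers, which is why distillation is cheap for the imitator and why model providers try to restrict it contractually.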
While distillation is widely used in academic research and within open-source ecosystems, OpenAI’s terms of service specifically prohibit using its outputs to train competing models — something that’s central to the dispute with DeepSeek.
OpenAI’s View: A Threat to Innovation and Security
According to the memo seen by Reuters, OpenAI told lawmakers that DeepSeek’s activities seemed to involve “ongoing efforts to free-ride on the capabilities developed by OpenAI and other U.S. frontier labs.” The company reportedly believes that individuals affiliated with DeepSeek circumvented access restrictions and repurposed outputs from U.S.-based AI models for distillation.
OpenAI also warned that Chinese AI models — including those developed by DeepSeek — might be prioritizing rapid growth and market impact over the careful, safety-first development approach championed by U.S. AI firms. This has raised concerns among policymakers and industry experts about the potential for unsafe AI deployment and intellectual property issues.
The concern isn’t just about business competition. If DeepSeek’s models are being trained through outputs generated from proprietary systems, then national security, intellectual property rights, and the future direction of global AI development are all on the line. U.S. officials have indicated they are looking at the broader implications, which could include further regulatory or legislative action.
DeepSeek’s Rise and Market Impact

DeepSeek, based in Hangzhou, China, has gained global attention for its series of AI models — including the much-discussed DeepSeek-V3 and DeepSeek-R1 models — which have been praised by some for high performance and low cost compared to their Western counterparts. These models have been publicly available and widely deployed, contributing to concerns in Washington that China could accelerate its AI capabilities even under restrictions on technology exports and chip sales.
Indeed, DeepSeek’s rapid rise last year sent ripples through global tech markets, influencing share prices among major AI hardware suppliers and fueling discussions about the pace of innovation outside the U.S. While firms like Nvidia have remained dominant suppliers of AI computing hardware, DeepSeek’s advances raised questions about whether advanced AI development could thrive without top-tier resources.
Industry Reactions: Divided Opinions
As with many disputes in the AI space, voices from the industry have offered mixed opinions.
Some analysts and advisors, like U.S. AI and crypto czar David Sacks, have publicly backed OpenAI’s claims, stating that there is “substantial evidence” that DeepSeek used distillation in ways that violate terms of service. Others, however, argue that using output from publicly accessible APIs for research and training — even by competitors — is not altogether unusual in fast-moving fields like AI.
Still others point out that AI firms themselves have relied on publicly available information — including content crawled from the internet — to train their initial generations of models. This raises complex questions about where the line should be drawn between legitimate training practices and infringement.
Legal and Ethical Considerations
The situation shines a spotlight on broader legal and ethical debates in AI:
- Intellectual Property Rights: Should companies be allowed to train models using outputs generated by other proprietary systems?
- Fair Competition: Does deploying models trained through distillation give companies an unfair advantage, especially when terms of service prohibit such uses?
- National Security: Are certain AI capabilities so strategically important that they should be protected by export controls or legislation?
At the moment, these questions remain unsettled. OpenAI's memo to lawmakers is just one step in what could become a much larger conversation in the U.S. Congress and international regulatory forums.
Looking Ahead
The dispute between OpenAI and DeepSeek highlights the complex, interconnected nature of modern AI development. While distillation itself is a powerful and common technique, its use — or misuse — in training competitive models without proper authorization raises fundamental concerns for creators, competitors, users, and policymakers alike.
Whether this issue results in legal action, new regulatory frameworks, or simply closer scrutiny of AI development practices, it has already underscored one thing: as AI continues to grow in significance, the rules governing innovation, competition, and ethics will have to evolve with it.
Dr. Mohammad Arif, who holds a PhD from a Chinese university, is an Islamabad-based senior analyst who frequently expresses his views on China-related issues.
