
DeepSeek's $5M AI Model Matches OpenAI: When Export Controls Backfire

How U.S. export controls accidentally created their strongest AI competitor, and why this might be good news for Big Tech

In partnership with

Thank you for your support in running ads; these let me continue to deliver high-quality analysis for free, in perpetuity.

Stay up-to-date with AI

The Rundown is the most trusted AI newsletter in the world, with 1,000,000+ readers and exclusive interviews with AI leaders like Mark Zuckerberg, Demis Hassabis, Mustafa Suleyman, and more.

Their expert research team spends all day learning what’s new in AI and talking with industry experts, then distills the most important developments into one free email every morning.

Plus, complete the quiz after signing up and they’ll recommend the best AI tools, guides, and courses – tailored to your needs.

In a development that has caught the attention of the AI industry, Chinese AI company DeepSeek has launched R1, a new AI model that matches the performance of leading U.S. models while supposedly costing only $5M to train.

The Breakthrough

Whilst relatively unknown in the US & Europe, DeepSeek is far from a small startup: it is actually one of the leading frontier labs, with over 100 full-time researchers.

Since their start in 2023, they've documented their progress through 16 published papers, beginning with retraining the Llama model. 

Whilst some experts, like Dylan Patel (SemiAnalysis) & Alexandr Wang (Scale AI), have suggested that DeepSeek illegally bypassed US export controls and has a 50,000-GPU cluster, it is actually possible for the $5M figure to be accurate… sort of.

This $5M figure refers specifically to GPU costs for one training run. Their total infrastructure investment, however, is estimated at $500 million, including 10,000 A100s and 2,000-3,000 H800 chips, plus more again in salaries and other training runs.
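To see how the headline number and the $500M estimate can both be true, here's a back-of-envelope sketch. The GPU-hour count follows the commonly cited figure from DeepSeek's V3 technical report; the $2/GPU-hour rental rate is an assumption for illustration, not an audited cost.

```python
# Back-of-envelope for the headline "$5M" figure: it covers GPU rental
# for a single training run, not the full stack.
gpu_hours = 2.788e6        # H800 GPU-hours reported for one training run
rate_per_gpu_hour = 2.00   # assumed $/GPU-hour rental rate

run_cost = gpu_hours * rate_per_gpu_hour
print(f"One training run: ${run_cost / 1e6:.1f}M")  # ~$5.6M

total_infrastructure = 500e6  # estimated total spend discussed above
print(f"Run cost as share of total: {run_cost / total_infrastructure:.1%}")  # ~1.1%
```

In other words, the quoted cost is roughly 1% of the estimated overall investment, which is why both claims can circulate at once.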

While this is still substantial, it's notably less than what their U.S. counterparts spend, particularly when you consider that some Meta, Anthropic & OpenAI teams have compensation budgets exceeding DeepSeek's entire training costs.

How They Did It: Innovation Through Constraint

The story of DeepSeek's success begins with U.S. export controls that prevent Chinese companies from accessing the latest generation of AI chips. Rather than hindering progress, these restrictions became a catalyst for innovation. Limited to older-generation hardware, DeepSeek's team was forced to develop novel solutions to overcome bandwidth limitations and other constraints that came with using less advanced chips.
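To make the "bandwidth limitations" point concrete: a standard trick in this family is overlapping gradient communication with backward-pass compute, so a slower interconnect spends less time on the critical path. DeepSeek's published work goes much further (their V3 report describes a custom pipeline schedule), but here is a minimal, generic sketch using PyTorch's async collectives; the tensor sizes, the gloo backend, and the simulated compute are illustrative assumptions, not DeepSeek's actual setup.

```python
# A toy sketch of compute/communication overlap.
# Run with: torchrun --nproc_per_node=2 overlap.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="gloo")  # CPU-friendly backend for the demo
    grad = torch.randn(1_000_000)            # pretend gradient from layer i

    # Kick off the all-reduce without blocking.
    work = dist.all_reduce(grad, op=dist.ReduceOp.SUM, async_op=True)

    # While layer i's gradient is in flight, keep computing the backward
    # pass for layer i-1 (simulated here with a matmul loop).
    x = torch.randn(512, 512)
    for _ in range(10):
        x = x @ x.T / 512

    work.wait()                    # block only when the result is needed
    grad /= dist.get_world_size()  # average the summed gradients
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```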

This speaks to a broader issue in U.S. tech development over the past 15 years. While U.S. companies have often approached problems by throwing hundreds of billions of dollars at them, resource constraints forced DeepSeek to think more creatively. They fundamentally rethought how AI models are built, developing more efficient methods for both training and running their models.

Through careful reward modeling and reinforcement learning, they managed to create models that could reason effectively without relying on enormous supervised datasets.
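For a concrete picture, here is a minimal sketch in the spirit of the R1 paper's rule-based rewards (an accuracy check plus a format check) and GRPO's group-relative advantages, which sidestep a separate learned critic model. The regex, reward weights, and toy completions are illustrative assumptions, not DeepSeek's actual implementation.

```python
import re
import statistics

def reward(completion: str, reference_answer: str) -> float:
    r = 0.0
    # Format reward: did the model wrap its reasoning in <think> tags?
    if re.search(r"<think>.*?</think>", completion, flags=re.DOTALL):
        r += 0.5
    # Accuracy reward: does the final answer match the reference?
    final = completion.split("</think>")[-1].strip()
    if final == reference_answer:
        r += 1.0
    return r

def group_advantages(rewards: list[float]) -> list[float]:
    # GRPO: normalize each reward against its own sampled group,
    # rather than against a learned value function.
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against zero std
    return [(r - mu) / sigma for r in rewards]

# Usage: sample several completions for one prompt, score, normalize.
completions = ["<think>2+2=4</think>4", "<think>hmm</think>5", "4"]
rs = [reward(c, "4") for c in completions]
print(group_advantages(rs))  # positive for better-than-group completions
```

Because the rewards are cheap, verifiable rules rather than human labels, this style of training scales without the enormous supervised datasets mentioned above.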

Market Implications

The impact of DeepSeek's breakthrough extends far beyond technical achievements. The AI landscape is shifting, with DeepSeek becoming the third Chinese app to top the App Store in recent months, alongside RedNote and TikTok. This has prompted swift responses from competitors: OpenAI's Sam Altman has increased availability of reasoning models like o3-mini to maintain user engagement, prioritizing access over short-term revenue. Plus tier users will now get 100 o3-mini queries per day, with plans to bring the Operator model to the Plus tier as soon as possible.

For the broader tech industry, this development presents an intriguing paradox. While some internet personalities suggest this spells trouble for major tech companies, the reality may be quite different. The ability to develop leading AI models through efficient engineering rather than massive spending could actually benefit companies like Microsoft, Google, and Meta, who are currently contemplating capex investments of $50-80B per year.

The implications for Nvidia are particularly noteworthy. While their near-term outlook remains secure due to existing contract agreements for the next two years and ongoing hyperscaler projects like Project Stargate, the long-term picture is more complex. The shift toward more efficient AI development could eventually impact their high-margin GPU business model, especially if companies can achieve high performance with fewer specialized chips. Nvidia’s share price was down 5% on the news, but is now only down 3%. 

Challenges and Considerations

While DeepSeek's model is available with open weights, it's important to note it's not fully open-source. There's limited access to training and data processing code, and information about the training data remains scarce. While their papers provide more detail than most frontier labs publish, key details about which factors matter most are still unclear.
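In practice, "open weights" means anyone can pull the checkpoints and run them locally, for example via Hugging Face transformers. A minimal sketch, assuming one of the published R1 distillations as the model ID:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# One of the published R1 distillations on Hugging Face; treat the
# exact ID as an assumption and check the deepseek-ai org for others.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tok("What is 7 * 8?", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))

# What you do NOT get: the training and data-processing code or the
# dataset itself, which is why "open weights" is weaker than "open source".
```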

Looking Ahead

DeepSeek's achievement suggests a new phase in AI development where efficient engineering may matter more than raw computing power. Their success is likely to spark a new wave of companies focusing on specialized training and reinforcement learning, with a stronger emphasis on efficient pipelines than on sheer scale.

Given the reproducible nature of their innovations, we can expect major reproductions from companies like OpenAI and Anthropic within the next 3-6 months. The focus of capital investment may shift from training to inference infrastructure, with companies rushing to establish data centers and secure subsidized energy sources worldwide.

This development may also serve as a healthy correction for the AI market, allowing investors to price in competition and commoditization more accurately. The battle in AI usage is evolving from training to inference, potentially reshaping the industry's infrastructure needs and investment patterns.
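A rough way to see why inference becomes the battleground: using the standard approximations of ~6·N·D FLOPs to train a model and ~2·N FLOPs per generated token to serve it, total serving compute overtakes training once roughly 3·D tokens have been generated. The parameter and token counts below echo figures from DeepSeek's V3 report but are used purely as illustrative assumptions.

```python
# When does serving overtake training in total FLOPs?
N = 37e9     # active parameters per token (MoE models activate a subset)
D = 14.8e12  # training tokens

train_flops = 6 * N * D
break_even_tokens = 3 * D  # from 2*N*T == 6*N*D  =>  T == 3*D

print(f"Training: {train_flops:.2e} FLOPs")
print(f"Serving overtakes training after ~{break_even_tokens:.2e} generated tokens")
```

At ChatGPT-scale usage, that break-even point is a matter of time, which is why inference capacity, not training clusters, may absorb the next wave of capital.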

DeepSeek's breakthrough demonstrates that the race to advanced AI is becoming more equitable, with innovation potentially coming from unexpected places and approaches that prioritize efficiency over brute force computing power. This could lead to a more diverse and competitive AI landscape, ultimately benefiting the entire industry through increased innovation and efficiency.