DeepSeek is a Chinese AI research company that has garnered significant attention for its cost-efficient, open-source models. Initial reports claimed that its newest model, DeepSeek R1, was trained for only $5.6M—a fraction of the $100M+ usually required for large-scale AI models. However, that figure deserves scrutiny.
The $5.6M figure most likely reflects only the final training run. It does not include the full costs—R&D, failed training runs, infrastructure, and the teams behind them. AI models aren't trained in a single shot, and the cost of building a competitive AI company is far higher.
Inference is the actual cost driver. Training is a one-time expense, but running these models (inference) at scale makes AI expensive. That's where DeepSeek moves the needle.
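As a purely hypothetical illustration of the scale involved: a model trained once for $100M that then serves 100 million requests per day at $0.01 per request would incur $1M per day in inference costs, surpassing the entire training bill in a little over three months.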
Not every model costs $100M+. As Anthropic's CEO Dario Amodei has pointed out, its Claude 3.5 Sonnet model was trained for tens of millions of dollars, not hundreds of millions. The idea that every competitive AI model requires a $100M+ budget is outdated—efficiency gains and improved techniques have made high-quality models cheaper and faster to develop.
This puts DeepSeek's cost efficiency in context: it is real, but it may not be the anomaly some headlines suggest—it's part of a broader trend of AI models becoming more affordable to train over time.
Did DeepSeek use OpenAI's models to train R1?
Public reports point to strong evidence that DeepSeek used distillation techniques—essentially training its model on outputs from OpenAI's models—which would potentially violate OpenAI's terms of service.
Distillation is common but legally murky. Many AI companies fine-tune their models using outputs from more advanced models, but when done at scale, this can become a gray area in terms of intellectual property rights.
OpenAI prohibits this. OpenAI's terms explicitly forbid using their models to train competing AI models. However, enforcement is difficult, especially across jurisdictions.
This raises questions for AI governance. As more companies build models this way, legal battles could emerge around what constitutes fair use in AI training.
What makes DeepSeek different?
DeepSeek's breakthrough is in efficiency. It uses several well-known techniques to cut costs (a minimal illustrative sketch of the first technique follows the list):
Mixture of Experts (MoE): Instead of running the entire model for every task, MoE selectively activates different parts, reducing computational overhead while maintaining performance.
FP8 (8-bit floating point processing): A lower-precision format that reduces memory and compute requirements without significantly impacting performance.
Multi-Token Prediction: Predicting multiple tokens at once speeds up inference but introduces a slight trade-off in accuracy.
Multi-Head Latent Attention: Compresses the attention memory cache into a smaller latent representation, optimizing efficiency while preserving model quality.
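To make the Mixture of Experts idea concrete, below is a minimal sketch of top-k expert routing in Python with NumPy. This is not DeepSeek's implementation; the hidden size, expert count, and top-k value are hypothetical, chosen only to illustrate how a router activates a small subset of experts per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only; not DeepSeek's configuration.
D_MODEL = 64     # hidden dimension per token
N_EXPERTS = 8    # total experts in the layer
TOP_K = 2        # experts activated per token

# Each "expert" is a single linear map here; real MoE layers use small MLPs.
experts = [rng.normal(scale=0.02, size=(D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
router = rng.normal(scale=0.02, size=(D_MODEL, N_EXPERTS))

def moe_layer(x):
    """Route a single token vector x through only its top-k experts."""
    logits = x @ router                      # score all experts for this token
    top = np.argsort(logits)[-TOP_K:]        # indices of the k highest scores
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts
    # Only TOP_K of the N_EXPERTS weight matrices are ever multiplied,
    # which is where the compute saving comes from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
out = moe_layer(token)
print(f"activated {TOP_K} of {N_EXPERTS} experts; output shape = {out.shape}")
```

The design point the sketch illustrates: per-token compute scales with the number of activated experts (here, 2), not the total number of experts (here, 8), so a model can grow its parameter count without a proportional increase in inference cost.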
Does DeepSeek threaten OpenAI or Anthropic?
Initial indications suggest DeepSeek will have an impact on OpenAI and Anthropic but does not pose a substantial threat to either.
OpenAI and Anthropic compete in frontier research, safety, and infrastructure, not just efficiency.
Efficiency is essential, but proprietary data, safety frameworks, and scaling ability matter more for long-term dominance.
This might actually push OpenAI and Anthropic to adopt similar optimizations, making their own models cheaper to run.
Will this hurt Nvidia?
On the DeepSeek news, Nvidia suffered the largest single-day market-cap loss in history on January 27th. The fear was that DeepSeek's efficiency improvements would lead to lower demand for AI chips, but history suggests the opposite.
Jevons Paradox: When technology becomes more efficient, usage doesn't decline—it explodes. Making AI cheaper means more companies will deploy AI, increasing demand for compute.
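A stylized example of how this can play out: if efficiency gains cut the compute needed per query in half, but cheaper AI leads to three times as many queries being run, total compute demand still rises by 50% (3 × 0.5 = 1.5× the original).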
Edge computing: If AI models become small and cheap enough to run locally instead of in the cloud, we could see more on-device AI applications, further driving demand for specialized chips.
Nvidia's moat is still strong. Even if inference costs drop, AI model training still requires massive computational power—and Nvidia owns that market.
What does this mean for U.S.-China AI competition?
DeepSeek is impressive but may not immediately change the fundamental dynamics of AI geopolitics.
Export controls are still a bottleneck. As Anthropic's CEO Dario Amodei pointed out, China's AI progress is still constrained by access to high-end chips.
Regulatory pressure could increase. DeepSeek's progress might push policymakers to tighten export controls, but it doesn't represent a fundamental shift in the AI arms race.
What are the unanswered questions?
Will OpenAI and Anthropic adopt FP8 and MoE, or do these techniques introduce trade-offs we don't fully understand yet?
Is DeepSeek R1's strong benchmark performance indicative of real-world generalization, or is it overfitting?
Will cheaper inference push AI toward local deployment, reducing reliance on cloud-based models?
Could these efficiency gains lead to AI becoming a commodity faster, or will companies with proprietary data still maintain an edge?
Will OpenAI pursue legal action against DeepSeek, setting a precedent for how companies handle unauthorized model distillation?
Bottom Line: Should we be excited or concerned?
DeepSeek is an important step forward in AI efficiency, but it's not yet a game-changer for OpenAI, Anthropic, or the broader AI market.
Efficiency breakthroughs like this are inevitable and essential as AI matures. This isn't a disruption—it's an evolution.
It will make AI more accessible, but it doesn't fundamentally shift who's leading the AI race.
The real test is how quickly the major players adopt these optimizations and whether they introduce any downsides.
For now, DeepSeek and open-source models in general are worth watching closely, but it is too early to determine their full geopolitical, economic, or financial impact.
MARKET COMMENTARY
Any opinions, assumptions, assessments, statements or the like (collectively, "Statements") regarding market condition, future events or which are forward-looking, including Statements about investment processes, investment objectives, goals or risk management techniques, constitute only the subjective views, beliefs, outlooks, forecasts, projections, estimations or intentions of Allocate Management, should not be relied on, are subject to change due to a variety of factors, including fluctuating market conditions and economic factors, and involve inherent risks and uncertainties, both general and specific, many of which cannot be predicted or quantified and are beyond Allocate's control. Allocate undertakes no responsibility or obligation to revise or update such Statements. Statements expressed herein may not be shared by all personnel of Allocate.