Lidea Feed

China's DeepSeek-V3 AI model could challenge OpenAI's dominance: here is what makes it special

News Update January 03, 2025 05:24 AM

DeepSeek-V3: A few days ago, OpenAI unveiled its new o3 model, whose striking benchmark results revived the debate over how close the field is to Artificial General Intelligence (AGI). Around the same time, DeepSeek-V3, a model from the Chinese AI lab DeepSeek, also became a topic of discussion. The model is reported to outperform OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on several benchmarks, and it is drawing attention for its low cost and high efficiency.

What is DeepSeek-V3?

DeepSeek-V3 is an open-source AI model that was reportedly trained at a cost of only about $5.5 million. By comparison, GPT-4o is said to have cost about $100 million to create. It is based on a Mixture-of-Experts (MoE) architecture, in which multiple specialized sub-networks ("experts") work together. The model has 671 billion parameters in total, but only 37 billion are active for any given token, which makes it efficient and cost-effective.
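To make the idea of "only some experts are active" concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration of the MoE idea, not DeepSeek-V3's actual router: the softmax gating, the toy experts, the scores, and the names `route_top_k` and `moe_layer` are all assumptions made for this example. Only the 671 billion and 37 billion figures come from the article.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so that they sum to 1 over the selected experts.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_scores, k=2):
    """Combine the outputs of the selected experts only.

    Experts that are not selected are never evaluated, which is where
    the compute saving comes from.
    """
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
scores = [0.1, 2.0, 0.3, 1.5]  # made-up router scores for one token

selected = route_top_k(scores, k=2)
output = moe_layer(10.0, experts, scores, k=2)

# Figures from the article: 37 billion of 671 billion parameters are active.
active_fraction = 37 / 671
print(selected, output, f"{active_fraction:.1%} of parameters active")
```

With the article's figures, roughly 5.5 percent of the model's parameters take part in any single forward step, even though all 671 billion must be stored.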

The model was trained on 14.8 trillion tokens of high-quality data and uses techniques such as Multi-Head Latent Attention (MLA) and auxiliary-loss-free load balancing. Trained on NVIDIA H800 chips, it delivers strong results despite limited resources.
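The article does not explain auxiliary-loss-free load balancing. The general idea, as commonly described, is to give each expert a bias that affects only which experts are selected, and to nudge that bias down when an expert is overloaded and up when it is underloaded, instead of adding a balancing term to the training loss. The sketch below is a toy simulation under that assumption. The function names, the step size `gamma`, the base scores, and the noise model are all hypothetical and are not taken from DeepSeek's code.

```python
import random

def pick_experts(scores, bias, k):
    """Select the k experts with the highest (score + bias)."""
    ranked = sorted(range(len(scores)),
                    key=lambda i: scores[i] + bias[i], reverse=True)
    return ranked[:k]

def update_bias(bias, load, gamma=0.1):
    """Lower the bias of overloaded experts, raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, count in zip(bias, load):
        if count > mean_load:
            new_bias.append(b - gamma)
        elif count < mean_load:
            new_bias.append(b + gamma)
        else:
            new_bias.append(b)
    return new_bias

def simulate(use_bias, steps=200, batch=64, seed=0):
    """Return each expert's share of routed tokens over the whole run."""
    rng = random.Random(seed)
    base = [1.0, 0.5, 0.0, -0.5]  # made-up scores that favor expert 0
    bias = [0.0] * len(base)
    total = [0] * len(base)
    for _ in range(steps):
        load = [0] * len(base)
        for _ in range(batch):
            scores = [b + rng.uniform(-0.5, 0.5) for b in base]
            for i in pick_experts(scores, bias, k=1):
                load[i] += 1
        total = [t + c for t, c in zip(total, load)]
        if use_bias:
            bias = update_bias(bias, load, gamma=0.05)
    return [t / sum(total) for t in total]

print("without bias:", simulate(use_bias=False))
print("with bias:   ", simulate(use_bias=True))
```

In this toy setup the run without bias sends most tokens to a single expert, while the run with bias updates spreads them much more evenly.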

Features (DeepSeek-V3)

Long-context understanding: The model can process up to 128,000 tokens in a single context, which makes it well suited to areas such as legal documents and academic research.
Multi-token prediction: It can predict multiple tokens at a time, which is reported to increase its generation speed by up to 1.8 times.
Open access: Because DeepSeek-V3 is open-source, it is accessible even to small and medium-sized developers, which helps them compete with larger companies.
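The 1.8 times figure for multi-token prediction can be read through a simple back-of-the-envelope model. Suppose the model drafts one extra token per decoding step and that draft is accepted with some probability. The expected number of tokens per step is then one plus that probability. The acceptance rate of 0.8 used below is a hypothetical value chosen to match the article's figure, the function name is invented for this sketch, and the model ignores the cost of verifying drafts.

```python
def expected_tokens_per_step(acceptance_rates):
    """Expected tokens emitted per decoding step.

    The first token is always produced. Each extra drafted token counts
    only if it and every draft before it were accepted.
    """
    expected = 1.0
    keep_prob = 1.0
    for p in acceptance_rates:
        keep_prob *= p
        expected += keep_prob
    return expected

# One extra drafted token with an assumed 80% acceptance rate.
speedup = expected_tokens_per_step([0.8])
print(f"about {speedup:.1f} tokens per step")
```

Under this simplified model, drafting a second extra token helps less than the first, because it is kept only when the first draft was also accepted.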

Performance

In benchmark tests, DeepSeek-V3 outperformed competitors on MATH-500 (mathematics) and LiveCodeBench (coding). Its performance was particularly strong on Chinese-language tasks. However, its real-time inference capability and its performance on English factual tasks are said to need improvement.

A new twist in the AI race

Competition between the United States and China in the AI field is intensifying. US sanctions have limited China's access to advanced NVIDIA AI chips, but DeepSeek-V3 suggests that the impact of these sanctions can be mitigated. The model shows that high-performance AI models can now be built even without a large budget.

Because it is open-source, DeepSeek-V3 not only democratizes AI research but also presents a major challenge to closed-source models.

DeepSeek-V3 has proven that new heights can be reached in AI even with limited resources. This model is not only challenging the dominance of giants like OpenAI, but is also writing a new chapter in the world of AI.