January 31, 2025 - DeepSeek’s R1 model sent a shockwave through the world because it performs nearly as well as, or better than, the current top AI models, such as OpenAI’s o1-1217 and o1-mini, on reasoning tasks like math problems, coding, and knowledge-based questions.
What makes this impressive is that it reaches high performance at a cost in the millions of dollars, where comparable systems are said to cost billions.
DeepSeek also requires less human feedback to check answers during training. This points to a way for AI to learn beyond what humans can check, unlike other AI training approaches that depend heavily on human labor.
While the world is still absorbing the surprise of DeepSeek’s R1, Alibaba (BABA) has introduced another AI contender, Qwen 2.5, which is claimed to do even better in some ways. Here is how Alibaba’s Qwen 2.5 is claimed to be better than DeepSeek’s R1:
Technical Superiority:
In a reasoning test using Arena-Hard, Qwen 2.5-Max achieved 89.4% accuracy, a higher result than DeepSeek R1. On other benchmarks covering coding and scientific reasoning, Qwen 2.5 also scores higher.
Cost Efficiency:
DeepSeek’s operating costs are low, but Qwen 2.5-Max’s are lower: it is priced at only $0.38 per million input tokens, making it affordable for small businesses and startups.
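To make the per-token price concrete, here is a minimal sketch of the arithmetic in Python. Only the $0.38 per million input tokens figure comes from the article; the function name and the example workload (2,000 requests a day at 1,500 input tokens each) are hypothetical, and the article quotes no output-token price, so output costs are left out.

```python
# Price quoted in the article, in US dollars per one million input tokens.
PRICE_PER_MILLION_INPUT_TOKENS = 0.38


def input_cost_usd(input_tokens: int,
                   price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Estimate the input-token cost in US dollars (hypothetical helper)."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * price_per_million


# Hypothetical workload, not from the article:
# 2,000 requests per day, 1,500 input tokens each, over 30 days.
monthly_tokens = 2_000 * 1_500 * 30

print(f"{monthly_tokens:,} input tokens cost about "
      f"${input_cost_usd(monthly_tokens):.2f}")
```

Under these assumptions the workload comes to 90 million input tokens a month, or roughly $34 in input charges at the quoted price.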
Multimodal Capabilities:
Qwen 2.5-Max can analyze lengthy documents and videos in a single pass, processing text, images, audio, and video in 29 languages, including Mandarin, Arabic, and Hindi.
Qwen 2.5-Max is positioned for enterprise clients. Despite these strengths, Qwen 2.5-Max underperforms other AI models in creative writing tasks.