Chinese company Alibaba has unveiled a new artificial intelligence model with reasoning capabilities, named Qwen with Questions (QwQ), as a competitor to OpenAI’s o1 model.
Alibaba’s model has 32.5 billion parameters and can process requests of up to 32,000 tokens. Like other large reasoning models (LRMs), QwQ spends extra computation at inference time to check the answers it intends to give the user and to correct its mistakes.
As a result, it is better suited to tasks that require logical reasoning and planning, such as math and coding.
Performance of the QwQ reasoning model in mathematics and coding
According to the company’s tests, QwQ beats o1-preview on the AIME and MATH benchmarks, which evaluate a model’s ability to solve mathematical problems. It also outperformed o1-mini on GPQA, a benchmark for scientific reasoning. In coding, however, the LiveCodeBench benchmark showed o1 performing better, although QwQ still outperformed other models such as GPT-4o and Claude 3.5 Sonnet.
Alibaba’s model is currently in preview, which suggests that a better-performing version may be released in the future. In the statement announcing the model, the company describes its performance as follows:
“Through our deep explorations and countless experiments, we discovered something very tangible: when we take the time to think, question, and reflect, the model’s understanding of mathematics and programming blossoms like a flower in the sun… This process of careful reflection and introspection leads to significant improvements in solving complex problems.”
Alibaba has not published a paper describing the data or process it used to train the model. However, because QwQ is an open-source model (unlike o1), its “thinking process” is not hidden, and you can read its output to see how the model reasons when solving problems.
The company also noted that QwQ has limitations, such as sometimes mixing languages or getting stuck in reasoning loops. You can try the preview version right now on Hugging Face.
RCO NEWS