Meta has taken a major step toward open, multimodal AI by introducing a new set of models in the Llama 4 family. These models, including Llama 4 Scout, Llama 4 Maverick, and the Llama 4 Behemoth preview, offer a distinct and powerful AI experience for developers, businesses, and everyday users.
Scout and Maverick: two flagship models with a Mixture-of-Experts architecture
Llama 4 Scout is a model with 17 billion active parameters and 16 experts that runs on a single H100 GPU while remaining powerful. In benchmark performance it surpasses models such as Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1.
Llama 4 Maverick, on the other hand, has the same number of active parameters but 128 experts, and delivers performance beyond even GPT-4o and Gemini 2.0 Flash. Designed for applications such as AI assistants and advanced conversation, it earned an ELO score of 1417 on the LMArena leaderboard.
Behemoth: a two-trillion-parameter teacher model
Llama 4 Behemoth, which is still in the training phase, has been introduced as one of the most advanced language models in the world, with 288 billion active parameters, 16 experts, and a total of nearly two trillion parameters. It outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM benchmarks and serves as a teacher model for Scout and Maverick.
Innovation in training and architecture
The new Llama 4 models use a Mixture of Experts (MoE) architecture, in which each token activates only a fraction of the model's parameters. This approach increases efficiency and reduces processing cost.
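The per-token routing idea can be illustrated with a minimal sketch. This is not Meta's implementation; the router weights, expert matrices, and top-k value are placeholders chosen for illustration:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Illustrative MoE layer: each token runs through only its
    top_k experts instead of every expert.

    x:       (seq_len, d_model) token activations
    gate_w:  (d_model, n_experts) router weights
    experts: list of (d_model, d_model) expert weight matrices
    """
    logits = x @ gate_w                                   # (seq_len, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)            # softmax over experts

    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(probs[t])[-top_k:]               # chosen experts
        weights = probs[t, top] / probs[t, top].sum()     # renormalize
        for w, e in zip(weights, top):
            out[t] += w * (x[t] @ experts[e])             # only top_k experts run
    return out
```

Because only `top_k` of the experts execute per token, compute per token stays roughly constant even as the total parameter count grows, which is the efficiency gain described above.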
The models are also natively multimodal, processing text, images, and video seamlessly. The use of an early-fusion structure makes it possible to integrate textual and visual information from the very beginning.
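At its simplest, early fusion means the text and image tokens enter one shared sequence before the first transformer layer, so every layer attends over both modalities. A minimal sketch of that idea, with shapes chosen only for illustration:

```python
import numpy as np

def early_fusion_sequence(text_emb, image_emb):
    """Concatenate text and image token embeddings into one sequence.
    A single transformer backbone then attends over both modalities
    from its first layer -- the 'early fusion' idea, as opposed to
    merging modality-specific encoders late in the network."""
    assert text_emb.shape[1] == image_emb.shape[1], "embedding dims must match"
    return np.concatenate([text_emb, image_emb], axis=0)
```

In a real model the image embeddings would come from a vision encoder projected into the text embedding space; here both are simply assumed to already share one dimension.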
Advanced training on extensive, high-quality data
Llama 4 was trained on more than 30 trillion tokens of data, including diverse text, images, and video. Low-precision FP8 training, online reinforcement learning, and Direct Preference Optimization (DPO) were also used in the final training stages.
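DPO replaces a separate reward model with a direct loss over preference pairs. A minimal sketch of the per-pair DPO loss, with the log-probabilities and the `beta` value as placeholder inputs:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the trained policy (pi_*) and a frozen reference
    model (ref_*). Minimizing the loss pushes the policy to prefer
    the chosen response more than the reference does.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; it falls as the policy widens its preference for the chosen response.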
To maintain a balance between reasoning, dialogue, and handling multimodal inputs, the post-training process involved pruning easy examples and focusing on more challenging prompts. This approach improved the model's accuracy, especially in coding, mathematics, and reasoning.
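The easy-data pruning step can be sketched as a simple filter. The `score_fn` here is a hypothetical scorer returning the model's confidence on an example; the threshold is a placeholder:

```python
def filter_hard_examples(examples, score_fn, easy_threshold=0.9):
    """Drop training examples the current model already handles
    confidently (score above the threshold), keeping only the
    harder prompts -- a sketch of the easy-data pruning idea.
    score_fn is a hypothetical scorer returning a value in [0, 1]."""
    return [ex for ex in examples if score_fn(ex) <= easy_threshold]
```

Concentrating post-training compute on examples the model still gets wrong is what sharpens performance on coding, math, and reasoning.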
Scout's special capabilities
With a 10-million-token context window, Scout opens new horizons for large-scale information processing, such as summarizing multiple documents, analyzing extensive user activity, and reviewing complex codebases. The model also performs well on video benchmarks and can answer visual questions accurately.
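Working with a window that large usually means packing many whole documents into a single prompt. A rough sketch of that packing step, where the word-based token estimate is an assumption standing in for a real tokenizer:

```python
def pack_documents(docs, budget_tokens, tokens_per_word=1.3):
    """Greedily pack whole documents into one long-context prompt
    until an approximate token budget is reached.

    Token counts are estimated from word counts (tokens_per_word is
    a rough heuristic); a real tokenizer would be used in practice.
    Returns the packed prompt and the estimated tokens used.
    """
    packed, used = [], 0
    for doc in docs:
        est = int(len(doc.split()) * tokens_per_word)
        if used + est > budget_tokens:
            break                      # next document would overflow the window
        packed.append(doc)
        used += est
    return "\n\n".join(packed), used
```

With a multi-million-token budget, whole corpora of reports or an entire repository's source files can fit in one pass, which is what makes the multi-document summarization use case practical.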
Protection, safety, and threat mitigation
Meta has emphasized the importance of user security and protection alongside the development of the new models. To provide a safe and reliable environment for developers and users, the company has introduced tools such as Llama Guard, which filters unsafe inputs and outputs, and the CyberSecEval assessment suite, which covers language attacks such as prompt injection and jailbreaks.
The future of Llama: the beginning of a new path
With Llama 4, Meta intends to provide an open, extensible ecosystem for the next generation of artificial intelligence. The company believes intelligent models should be able to interact naturally with humans, perform general-purpose actions, and solve novel problems. The LlamaCon event, which takes place on April 29, is set to further outline this ecosystem.
RCO NEWS