According to a report published in July 2025 by The New York Times, Meta plans to halt the Llama 4 Behemoth project, its largest and most advanced artificial intelligence model. The report states that a group of senior executives in the new Meta Superintelligence lab is now focused on developing a closed-source model. This change is a significant departure from Meta's traditional approach of releasing open-source models.
The Behemoth model, part of the Llama 4 family introduced in April, is the largest model in the series, with two trillion parameters, and was described by Meta as one of the most advanced artificial intelligence models in the world. However, although training of the model has been completed, its release has been delayed due to poor performance in internal evaluations. Following the announcement of the Superintelligence lab last month, the teams responsible for developing Behemoth stopped running experiments on the model.
The Wall Street Journal had earlier quoted sources as saying that Meta delayed the model's release, raising concerns about the direction of the company's multi-billion-dollar investment in artificial intelligence. Meta engineers and researchers expressed concern that the model's performance did not live up to the claims made about its capabilities. Behemoth was expected to be released late in the year, but recent developments suggest that will not happen in the near future.
Technical challenges of the Behemoth model
According to the research firm SemiAnalysis, one of the main causes of Behemoth's problems was the use of chunked attention to improve memory efficiency. In the standard attention mechanism, each token can attend to all previous tokens and so captures the full context of the text. In chunked attention, by contrast, tokens are divided into fixed-size blocks, and each token can attend only to tokens inside its own block. "Implementing chunked attention in Behemoth, with the aim of increasing efficiency, created blind spots at block boundaries," the report says. This limitation hurt the model's ability to follow chains of reasoning, especially when the logic extends across blocks. Meta also lacked proper infrastructure for long-context evaluation and failed to identify the method's shortcomings in the early stages.
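The boundary blind spot described above can be made concrete with a small attention mask. The following NumPy sketch illustrates the general idea only; it is not Meta's actual implementation, and `block_size` here is an arbitrary illustrative value:

```python
import numpy as np

def chunked_attention_mask(seq_len, block_size):
    # Causal chunked attention: token i may attend to token j only if
    # j <= i AND both tokens fall in the same fixed-size block.
    pos = np.arange(seq_len)
    same_block = (pos[:, None] // block_size) == (pos[None, :] // block_size)
    causal = pos[:, None] >= pos[None, :]
    return same_block & causal

mask = chunked_attention_mask(seq_len=8, block_size=4)
# Token 4 opens the second block, so it cannot see tokens 0-3 at all:
# this is the "blind spot" at the block boundary the report describes.
```

With full causal attention, `mask[4, 3]` would be true; under chunking it is false, so any reasoning chain that spans the boundary between tokens 3 and 4 is invisible to the model.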
In addition, an abrupt change to the routing method of the Mixture-of-Experts architecture in the middle of training destabilized the model's expert networks and reduced its overall performance.
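The report does not describe Behemoth's router in detail. As background, Mixture-of-Experts layers typically route each token to a small number of experts via top-k gating; the sketch below shows that general pattern (an illustration with made-up values, not Meta's router):

```python
import numpy as np

def topk_route(logits, k=2):
    # For each token, pick the k highest-scoring experts and
    # softmax-normalize their gate weights. Changing this routing
    # rule mid-training invalidates what each expert has learned.
    idx = np.argsort(logits, axis=-1)[:, -k:]           # top-k expert ids
    picked = np.take_along_axis(logits, idx, axis=-1)   # their scores
    gates = np.exp(picked - picked.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)
    return idx, gates

# One token scored against four hypothetical experts:
expert_ids, weights = topk_route(np.array([[0.1, 2.0, -1.0, 0.5]]), k=2)
```

Because each expert specializes around the tokens the router sends it, swapping the routing rule mid-run scrambles that assignment, which is consistent with the instability the report describes.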
Training data problems
Another key challenge involved the training data. Midway through Behemoth's training, Meta switched from public datasets such as Common Crawl to an internal web crawler. This change, though in theory an improvement, backfired because duplicate data could not be cleaned at scale. SemiAnalysis reports that Meta's data-management processes had not been fully tested at this scale.
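Web-scale deduplication pipelines usually begin with exact-match content hashing before moving to fuzzier techniques. The sketch below shows only that first, simplest layer; it is a generic illustration, not a description of Meta's pipeline:

```python
import hashlib

def dedup_exact(docs):
    # Drop byte-identical documents by hashing their contents.
    # Real crawl pipelines add near-duplicate detection (e.g. MinHash)
    # on top of this, which is where scale problems tend to appear.
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```

Exact hashing is cheap per document, but switching data sources mid-run means rerunning and revalidating the whole pipeline on a new distribution, which is reportedly where Meta's untested processes fell short.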
Organizational challenges and leadership
SemiAnalysis also points to organizational problems, including the lack of decisive leadership to set the optimal path for model development and the existence of competing research directions. These issues, along with the lack of scalable evaluation infrastructure, limited Meta's ability to turn initial research into full-scale training. The Wall Street Journal's report also notes senior executives' dissatisfaction with the performance of the Llama 4 team and suggests the possibility of management changes in Meta AI.
Strategic shift toward a closed model
Meta's possible decision to halt Behemoth and focus on developing a closed model reflects a strategic change at the company. Meta, long admired for releasing open-source models, is now under pressure to compete with companies such as OpenAI, Google, and Anthropic. Mark Zuckerberg, Meta's CEO, is trying to strengthen the company's position in the artificial intelligence race by establishing the Superintelligence lab, led by former Scale AI CEO Alexandr Wang, and recruiting talent from OpenAI. Zuckerberg has also announced that Meta will invest hundreds of billions of dollars in artificial intelligence infrastructure, including roughly 1.3 million graphics processing units (GPUs) by the end of 2025.
Challenges with Behemoth's training data
According to SemiAnalysis, the Llama 4 Behemoth model faced significant obstacles with its training data. In the early stages of development, Meta used public sources such as Common Crawl to supply training data. Midway through the process, however, Meta switched to its internal web crawler, which potentially offers higher-quality data. This change had adverse results.
"Meta had problems cleaning and deduplicating the new data stream," the report says. "These processes had not been tested at large scale before." This inefficiency in data management degraded the quality of model training.
In addition, Meta faced challenges in converting research experiments into full-scale training runs. Competition between the various research directions and the lack of decisive leadership to set the optimal development path added to the project's complexity. "Some architectural choices were baked into the design without careful assessment," the report says. "This led to poor scaling behavior and inefficient management of the development process."