According to a report published on July 5, 2025 by The New York Times, Meta plans to halt the Llama 4 Behemoth project, its largest and most advanced artificial intelligence model. The report states that a group of senior executives in the new Meta Superintelligence lab is now focused on developing a closed-source model. This change marks a significant departure from Meta's traditional approach of releasing open-source models.
The Behemoth model, part of the Llama 4 family introduced in April, is the largest model in the series, with two trillion parameters, and was described by Meta as one of the most advanced artificial intelligence models in the world. However, although training of the model has been completed, its release has been delayed because of poor performance in internal evaluations. Following the announcement of the Superintelligence lab last month, the teams responsible for developing Behemoth stopped experiments on the model.
The Wall Street Journal had earlier quoted sources as saying that Meta delayed the model's release, raising concerns about the direction of the company's multi-billion-dollar investment in artificial intelligence. Meta engineers and researchers expressed concern that the model's performance did not match the claims made about its capabilities. Behemoth was expected to be released late this year, but recent developments show that this will not happen in the near future.
Technical challenges of the Behemoth model
One of the main reasons for Behemoth's problems was the use of chunked attention to improve memory efficiency, according to the research firm SemiAnalysis. In the standard attention mechanism, each token has access to all previous tokens and can use the full context of the text. With chunked attention, tokens are divided into fixed-size blocks, and each token has access only to the tokens inside its own block. "Implementing the chunked attention technique in Behemoth, with the aim of increasing efficiency, created blind spots at block boundaries," the report says. This restriction undermined the model's ability to follow chains of reasoning, especially when the logic extends across blocks. Meta also lacked proper infrastructure for long-context evaluation and failed to identify the method's weaknesses in the early stages.
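To make the difference concrete, the minimal sketch below compares a standard causal attention mask with a chunked one for a short sequence. It is an illustration, not Meta's actual implementation; the block size, sequence length, and function names are chosen only for demonstration.

```python
# Minimal sketch contrasting a standard causal attention mask with a
# chunked-attention mask, where each token can only attend to tokens
# inside its own fixed-size block. Not Meta's implementation.
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    # Token i may attend to every token j <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def chunked_causal_mask(seq_len: int, block_size: int) -> np.ndarray:
    # Token i may attend to token j only if j <= i AND both tokens
    # fall inside the same fixed-size block.
    idx = np.arange(seq_len)
    same_block = (idx[:, None] // block_size) == (idx[None, :] // block_size)
    return causal_mask(seq_len) & same_block

if __name__ == "__main__":
    full = causal_mask(8)
    chunked = chunked_causal_mask(8, block_size=4)
    # Token 5 sits in the second block: under chunking it loses access to
    # tokens 0-3, illustrating the "blind spot" at block boundaries.
    print("token 5 sees (full causal):", np.where(full[5])[0])
    print("token 5 sees (chunked):    ", np.where(chunked[5])[0])
```

A token just past a block boundary can no longer see the tokens immediately before it, which is why reasoning chains that span blocks are the failure mode described in the report.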
In addition, an abrupt change to the routing method of the "Mixture of Experts" architecture in the middle of training destabilized the model's specialized expert networks and reduced its overall performance.
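As a rough illustration of what "routing" means in a Mixture of Experts model, the sketch below implements a generic top-k softmax router, not Meta's specific design: a learned gate scores each token against each expert network, and the token is sent only to its highest-scoring experts. Changing this rule mid-training changes which experts see which tokens, which is the kind of disruption the report describes.

```python
# Minimal sketch of top-k Mixture-of-Experts routing (generic, illustrative).
import numpy as np

def top_k_route(gate_logits: np.ndarray, k: int = 2):
    # gate_logits: (num_tokens, num_experts) scores produced by the router.
    top_experts = np.argsort(-gate_logits, axis=-1)[:, :k]   # chosen experts per token
    probs = np.exp(gate_logits - gate_logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)                # softmax gate weights
    top_weights = np.take_along_axis(probs, top_experts, axis=-1)
    return top_experts, top_weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(4, 8))          # 4 tokens, 8 experts
    experts, weights = top_k_route(logits, k=2)
    print(experts)   # which experts each token is dispatched to
    print(weights)   # how strongly each chosen expert is weighted
```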
Training data problems
Another key challenge involved the training data. In the middle of Behemoth's training, Meta switched from public data such as Common Crawl to an internal web crawler. Although this change was expected in theory to improve data quality, it backfired because duplicate data could not be removed at scale. SemiAnalysis reports that Meta's processes for managing data at this scale had not been fully tested.
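As a simplified illustration of the deduplication step involved, and not Meta's actual pipeline, the sketch below removes exact duplicates from crawled documents by hashing normalized text. Production pipelines for web-scale data typically also rely on fuzzy methods such as MinHash, which this sketch does not cover.

```python
# Minimal sketch: exact-duplicate removal of crawled documents by hashing
# whitespace-normalized, lowercased text. Illustrative only.
import hashlib

def dedupe(documents):
    seen = set()
    unique = []
    for doc in documents:
        normalized = " ".join(doc.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

if __name__ == "__main__":
    docs = [
        "Meta trains Llama 4 Behemoth.",
        "Meta  trains Llama 4  Behemoth.",   # same text, different spacing
        "Behemoth has two trillion parameters.",
    ]
    print(dedupe(docs))  # the first two documents collapse into one entry
```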
Organizational and leadership challenges
SemiAnalysis also points to organizational problems, including the absence of decisive leadership to set the optimal path for the model's development and the existence of conflicting research directions. These issues, along with the lack of scalable evaluation infrastructure, limited Meta's ability to turn initial research into full-scale training. The Wall Street Journal's report also points to senior executives' dissatisfaction with the Llama 4 team's performance and suggests the possibility of management changes at Meta AI.
A strategic shift toward a closed model
Meta's possible decision to halt Behemoth and focus on developing a closed model reflects a strategic change at the company. Meta, long praised for releasing open-source models, is now under pressure to compete with companies such as OpenAI, Google, and Anthropic. Mark Zuckerberg, Meta's CEO, is trying to strengthen the company's position in the artificial intelligence race by establishing the Superintelligence lab, led by former Scale AI CEO Alexandr Wang, and by attracting talent from OpenAI. Zuckerberg has also announced that Meta will invest hundreds of billions of dollars in artificial intelligence infrastructure, including more than a million graphics processing units (GPUs) by the end of 2025.
Challenges with the Behemoth model's training data
According to the research firm SemiAnalysis, the Llama 4 Behemoth model faced significant obstacles related to its training data. In the early stages of development, Meta used public sources such as Common Crawl to supply training data. In the middle of the process, however, Meta switched to its in-house web crawler, which in principle offers higher-quality data. In practice, this change had adverse results.
"Meta had problems cleaning and deduplicating the new data stream," the report says. "These processes had not been tested at large scale before." This inefficiency in data management degraded the quality of the model's training.
In addition, Meta faced challenges in converting research experiments into full-scale training runs. Competition between different research directions and the absence of decisive leadership to set the optimal development path added to the project's complexity. "Some architectural choices were included in the design without careful assessment," the report says. "This led to poor scaling behavior and inefficient management of the development process."