A new study shows that the length of the tasks artificial intelligence (AI) can complete, such as multi-hour projects, doubles roughly every seven months, and that AI may be able to carry out software development on its own by one year.
The study, conducted by Model Evaluation & Threat Research (METR), measures this ability with a new “time to work” criterion, which rates each task by how long it takes a human to complete.
The results show that AI succeeds almost 100% of the time on short tasks (those that take humans less than four minutes), but its success rate falls below 10% on tasks that take more than four hours, although newer models perform better than older ones.
The study examined tasks ranging from simple Wikipedia searches to complex projects such as writing CUDA kernels or fixing bugs in PyTorch, testing models such as Sonnet 3.7, GPT-2, and Claude Opus.
The HCAST and RE-Bench test suites, which include software tasks and research projects in machine learning and cybersecurity, show that AI is making progress on complex, multi-step tasks.
Researchers believe this trend could lead to the complete automation of some jobs by year 2, but challenges remain, such as staying coordinated over long tasks and coping with the complexities of the real world.
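To make the seven-month doubling concrete, the arithmetic behind such projections can be sketched in a few lines of Python. This is only an illustration, not the researchers' own method: the function names are hypothetical, the 60-minute starting point is an assumed value chosen for easy arithmetic, and the sketch simply assumes the trend continues unchanged.

```python
import math

# Doubling period reported in the study, in months.
DOUBLING_MONTHS = 7.0


def projected_horizon(start_minutes, months_elapsed, doubling_months=DOUBLING_MONTHS):
    """Task length (in minutes) an AI could handle after months_elapsed,
    assuming it doubles every doubling_months."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)


def months_until(start_minutes, target_minutes, doubling_months=DOUBLING_MONTHS):
    """Months needed to grow from start_minutes to target_minutes
    under the same doubling assumption."""
    return doubling_months * math.log2(target_minutes / start_minutes)


# Hypothetical starting point: tasks that take a human 60 minutes.
start = 60.0

print(projected_horizon(start, 14))  # two doublings: 240.0 minutes
print(months_until(start, 480.0))    # three doublings: 21.0 months
```

Under these assumptions, a model that handles one-hour tasks today would reach four-hour tasks in 14 months and eight-hour tasks in 21 months. A slowdown or speed-up in the trend would change these figures.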
Experts, including Eleanor Watson of Singularity University, consider this criterion valuable because it measures how well AI can sustain coherent performance over time.
Watson predicts that by year 3, general-purpose AI will emerge that can manage a variety of tasks over long periods, such as travel planning or financial management.
This progress could raise business productivity, reduce costs, and let people focus on creative work, but it also raises concerns about the replacement of human jobs and the resulting social impact.
RCO NEWS




