Many experts in the field of artificial intelligence, such as Ilya Sutskever, a co-founder of OpenAI, say that training AI with the old methods has reached its peak and that more powerful models can no longer be developed with them. Now researchers at Google DeepMind say the outputs of “reasoning” models like o1 can serve as a new source of AI training data.
According to a Business Insider report, nearly all the useful data available on the Internet has already been used to train artificial intelligence models. This process, known as pre-training, produced many of AI’s recent achievements, including ChatGPT, but progress has recently lost momentum, and experts say the pre-training era is nearing its end.

With big tech companies investing trillions of dollars in the technology, a slowdown in the progress of AI models could be daunting, but researchers say there is a new way to train and develop them.
DeepMind researchers’ new method for training artificial intelligence
New models such as OpenAI’s o1 and o3 use a new technique to respond to user requests, called test-time (or inference-time) compute.
In this method, the AI divides your request into smaller parts and turns each one into a new prompt. Each stage requires executing a new request, which is known in AI as the inference stage. This creates a chain of reasoning in which each part of the problem is solved in turn: the model does not move to the next step until it has solved the current part, and so can finally provide a better final answer.
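The loop described above can be sketched in a few lines. This is a toy illustration, not OpenAI's actual mechanism: the function names and the simple semicolon-based decomposition are stand-ins for what a real reasoning model does internally.

```python
# Toy sketch of test-time (inference-time) compute. The decomposition and the
# inference stub below are illustrative, not how o1 actually works internally.

def run_inference(prompt: str, context: list) -> str:
    """Stand-in for one model inference call; a real system would query an LLM,
    conditioning on the intermediate results gathered so far in `context`."""
    return f"answer({prompt})"

def solve_with_reasoning_chain(request: str) -> str:
    # 1. Split the request into smaller sub-prompts (a real model decides
    #    the decomposition itself; we fake it with a semicolon split).
    sub_prompts = [part.strip() for part in request.split(";")]
    chain = []  # the chain of reasoning: one solved piece per step
    for sub in sub_prompts:
        # 2. Each sub-prompt triggers a fresh inference pass; we only advance
        #    to the next step once the current part has been solved.
        chain.append(run_inference(sub, chain))
    # 3. The final answer is assembled from the full chain.
    return " -> ".join(chain)

print(solve_with_reasoning_chain("factor 91; check primality of factors"))
```

The key design point is that compute is spent at answer time, one inference call per reasoning step, rather than only at training time.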

According to published benchmarks, the new models often produce better outputs than their predecessors, especially on math questions. The researchers say these high-quality outputs can become new training data; in other words, this large body of new information can be fed into the training process of other AI models to create an iterative self-improvement loop.
For example, if the o1 model’s outputs are better than GPT-4’s, those outputs can be used to train future AI models. Or say o1 scores 90% on a particular AI benchmark: you could aggregate its responses, feed them to GPT-4, and that model could also reach a 90% score.
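The self-improvement loop the paragraph describes can be sketched as: generate answers with the stronger reasoning model, keep only the high-quality ones, and use them to fine-tune another model. Everything here is a hypothetical stand-in: the model names, the scoring function, and the 0.8 quality threshold are assumptions for illustration, not a real pipeline.

```python
# Hedged sketch of the synthetic-data loop: a stronger "reasoning" model's
# outputs are filtered by a quality score and reused as training examples.
# All names, scores, and thresholds below are illustrative stand-ins.

def reasoning_model_output(question: str) -> dict:
    """Stand-in for an o1-style model; returns an answer plus a quality score
    (a real pipeline might score answers against a benchmark)."""
    score = 0.9 if "math" in question else 0.6
    return {"question": question, "answer": f"solution({question})", "score": score}

def build_synthetic_dataset(questions, min_score=0.8):
    """Keep only high-quality outputs as (prompt, target) training pairs."""
    dataset = []
    for q in questions:
        out = reasoning_model_output(q)
        if out["score"] >= min_score:
            dataset.append((out["question"], out["answer"]))
    return dataset

def fine_tune(base_model_name: str, dataset) -> str:
    """Stand-in for a fine-tuning step; a real pipeline would train on `dataset`."""
    return f"{base_model_name}+distilled[{len(dataset)} examples]"

questions = ["math: factor 91", "history: fall of Rome", "math: integrate x^2"]
data = build_synthetic_dataset(questions)
print(fine_tune("gpt-4-like", data))
```

Because the improved model can itself generate the next round of synthetic data, repeating these three steps yields the iterative self-improvement loop the researchers describe.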
Indeed, some companies already appear to be using this method to develop their models. The researchers believe this synthetic data is of higher quality than what is available on the Internet.
