In a pioneering study conducted by researchers at the University of California San Diego, OpenAI's GPT-4.5 language model convinced participants that it was human in 73% of cases. It is a remarkable result: artificial intelligence has reached an unprecedented level of imitating human behavior.
Devised in 1950 by Alan Turing, the Turing test is the classic criterion for measuring a machine's ability to imitate human intelligence. In this updated version of the test, participants conversed with a human and an AI model at the same time and then had to decide which was which.
Interestingly, when GPT-4.5 was asked to adopt a particular persona (such as a young person interested in internet culture), it was judged to be human even more often than the real humans were. By contrast, the conventional GPT-4o, which lacked this persona capability, succeeded in only 21% of cases.
Cameron Jones, a senior researcher on the study, believes these results show that large language models can fully substitute for humans in short interactions without being recognized. Although technically significant, the finding carries serious warnings about social consequences, including the possibility of job displacement and abuse in social engineering attacks.
However, many experts believe that the Turing test, despite its popularity, is not a complete criterion for measuring real intelligence. Google engineer François Chollet emphasizes that the test is more of a thought experiment than a practical criterion for measuring machine intelligence.
With the increasing advancement of artificial intelligence technologies, it seems that the scientific community needs to develop more comprehensive and rigorous criteria for evaluating the cognitive capabilities of machines. This study not only demonstrates the remarkable abilities of new language models, but also raises important questions about the nature of human and machine intelligence and their relationship in the future.




