Researchers’ new research shows that large language models mainly change their behavior when examining. When answering questions designed to measure personality traits, these models usually give answers that are socially desirable to make them look lovely.
Johannes Eichstaedt, an assistant professor at Stanford University, who led the study, says his group was interested in examining psychology techniques after realizing that large language models were cruel after long -term conversations. The results of this study are published in PNAS.
Chats give responses that are socially desirable
His research team and his research team tested the 5 personality traits commonly used in psychology on the Claude 3, Gpt-4 and LLAMA 3 models. These five properties are experience or imagination, conscientiousness, extroversion, agreement, and psychiatry.
The researchers found that when they are said to be large language models, personality tests adjust their answers. When this is not clearly told, these models also respond to more extroversion and greater agreement and less psychoanalysis.
This behavior of large language models is exactly the same as the behavior of some humans who change their answers to make it more loving, but it seems that this behavior is more severe in artificial intelligence models. Adesh Salcha, a data scientist in Stanford, says that the percentage of extroversion in artificial intelligence models sometimes reaches from 2 % to 5 %.
The fact that artificial intelligence models can change their behavior when they find that they are under personality testing, have consequences for artificial intelligence safety; Because it proves that these models can take a dual behavior.
RCO NEWS



