We’ve all experienced that creeping doubt that the writing we’re reading might have been produced by artificial intelligence, but this is surprisingly difficult to determine with any certainty.
Last year, many people claimed that certain words were telltale signs of AI-generated text, but the evidence for this claim is scant, and as language models have grown more sophisticated, those words have become even less reliable as indicators.
However, Wikipedia editors seem to have found a very accurate way of identifying AI-generated writing, and the group’s guide, Signs of AI writing, is the best resource available on the subject.
Since 2023, Wikipedia editors have been working to better manage AI-generated articles through an effort called Project AI Cleanup. With millions of edits per day, there is a huge amount of data to sift through, and in the long-standing tradition of Wikipedia editors, the output of these efforts is a guide that is both rich in detail and evidence-based.
At the outset, the guide confirms what we already knew: automated detection tools are pretty much useless. Instead, it focuses on writing habits and turns of phrase that are rare on Wikipedia but common across the Internet (and thus common in the models’ training data).
According to the guide, AI-generated text spends a lot of time emphasizing the importance of a topic, usually in vague, general terms. AI models also devote a large part of the text to detailing minor media appearances in order to make the subject seem more important and prominent, an approach you would expect from a personal biography rather than from an independent source.
The guide also makes a very interesting point about sentence-final clauses, where models usually make vague claims about the subject. Grammar enthusiasts know this structure as the present participle. The pattern is a bit difficult to recognize, but once you know it, you will see it everywhere.
AI models also tend to use vague, promotional language of the kind that is very common across the Internet. These models describe landscapes as always beautiful, views as always breathtaking, and generally everything as clean and modern. As the Wikipedia editors put it, such prose is more reminiscent of a TV commercial script than of an encyclopedia.
It is worth reading the guide in its entirety. Until now, it was thought that the prose style of language models evolves so rapidly that no fixed, reliable features could be pinned down to identify it. But the features presented in this guide are deeply rooted in how AI models are trained and deployed. Although they can be disguised, they will be very difficult to eliminate completely. And if people become more adept at recognizing AI writing, this could have far-reaching and interesting implications.




