We’ve all experienced that creeping doubt that the writing we’re reading might have been produced by artificial intelligence, but this is surprisingly difficult to determine definitively.
Last year, many people claimed that certain words were telltale signs of AI-generated text, but the evidence for this claim is scant, and as language models have grown more sophisticated, those words have become less reliable as indicators.
However, Wikipedia editors seem to have found a very accurate way of identifying AI-generated writing, and the group’s guide, Signs of AI writing, is the best resource on the subject.
Since 2023, Wikipedia editors have worked to better manage AI-generated articles through an effort called WikiProject AI Cleanup. With millions of edits per day, there is a huge amount of data to sift through, and in the long-standing tradition of Wikipedia editors, the output of these efforts is a guide that is both rich in detail and evidence-based.
At the outset, this guide confirms what we already knew: automated tools are pretty much useless in this area. Instead, the guide focuses on writing patterns and structures that are rare in Wikipedia but common across the Internet (and thus common in the models’ training data).
According to the guide, AI-generated texts spend a lot of time highlighting the importance of a topic, usually in vague, general terms. AI models also devote a large part of the text to detailing minor media appearances in order to make the subject seem more important and prominent, an approach one would expect from a personal biography rather than from an independent source.
The guide also makes a very interesting point about the clauses that models attach to the ends of sentences, where they usually make vague claims about the subject. Grammar enthusiasts know this structure as the present participle. The pattern is a bit difficult to recognize, but once you know it, you will see it everywhere.
AI models also tend to use vague advertising language of the kind that is very common throughout the internet. These models describe landscapes as always beautiful, views as always breathtaking, and generally everything as clean and modern. As the editors of Wikipedia say, such prose is more reminiscent of the script of a TV commercial than of an encyclopedia.
It is worth reading this guide in its entirety. Until now, it was thought that the prose style of language models evolves so rapidly that no fixed and reliable features could be pinned down to identify it. But the writing features presented in this guide are deeply rooted in how AI models are trained and deployed. Although they can be disguised, they will be very difficult to get rid of completely. And if people become more adept at recognizing AI writing, this could have far-reaching and interesting implications.
RCO NEWS




