
Machine-learning system identifies social media disinformation campaigns
Researchers have developed a machine-learning system which they say can detect social media posts involved in coordinated political influence campaigns, such as those undertaken by Russia during the 2016 US presidential election.
The team, whose findings were published by the American Association for the Advancement of Science (AAAS), said their system works regardless of the platform and relies only on the content of the posts.
Their findings show that content-based features such as a post's word count, webpage links, and posting time can act like a digital fingerprint for such influence campaigns, which could help social media companies, users, or investigators prevent the spread of misinformation and election interference.
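As a rough illustration of the idea (not the researchers' actual pipeline), content-based features of this kind could be pulled from a post along the following lines; the function name and feature set here are hypothetical:

```python
import re
from datetime import datetime

def extract_content_features(text: str, posted_at: datetime) -> dict:
    """Illustrative content-based features: word count, link count, posting hour."""
    words = text.split()
    urls = re.findall(r"https?://\S+", text)
    return {
        "word_count": len(words),
        "url_count": len(urls),
        "hour_of_day": posted_at.hour,
    }

# Example usage with a made-up post
features = extract_content_features(
    "Breaking: read the full story here https://example.com/article",
    datetime(2016, 10, 3, 14, 30),
)
print(features)  # {'word_count': 7, 'url_count': 1, 'hour_of_day': 14}
```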
The system could make it easier to identify accounts spreading false information. Just this week, for example, Twitter suspended 7,000 accounts linked to the far-right QAnon conspiracy theory, and last year Facebook deleted accounts from Iran, Russia and other countries, some of which were engaged in “coordinated inauthentic behaviour”.
Previous attempts to detect coordinated disinformation efforts have focused on simpler approaches, such as detecting bots or comparing the follower/friendship networks of posters.
However, these approaches are often foiled by posts from human agents or from new accounts, and are often platform-specific.
The researchers hypothesised that large, online political influence operations use a relatively small number of human agents to post large amounts of content quickly, which would tend to make these posts similar in topic, word count, linked articles, and other features.
To test this, they created a machine-learning system trained on datasets of early activity from Russian, Chinese, and Venezuelan influence campaigns on Twitter and Reddit.
They found the system could reliably identify those campaigns' subsequent posts and distinguish them from regular posts by normal users.
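A minimal sketch of this train-early, test-later setup is shown below. It assumes posts have already been reduced to numeric content features and labelled (1 = influence-campaign post, 0 = ordinary user post); the data here is randomly generated and the model choice is an assumption, not the study's actual method:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Hypothetical feature matrices: rows are posts, columns are content features
# such as word count, link count and posting hour.
X_early, y_early = rng.normal(size=(1000, 3)), rng.integers(0, 2, 1000)
X_later, y_later = rng.normal(size=(500, 3)), rng.integers(0, 2, 500)

# Train on a campaign's early activity...
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_early, y_early)

# ...then check how well it flags the campaign's subsequent posts.
print(classification_report(y_later, clf.predict(X_later)))
```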
The system was less reliable when it was trained on older data and when the campaign in question was more sophisticated, indicating that such a system would not be a comprehensive solution.
The team said that while widespread use of such machine-learning systems could drive bad actors to change their approach to avoid detection, doing so could also force them to adopt tactics that are more costly or less effective.