So why are people arguing about the percentage of bot accounts on Twitter?
As the creators of Botometer, a widely used bot detection tool, our group at the Indiana University Observatory on Social Media has been investigating inauthentic social media accounts and manipulation for over a decade. We brought the concept of the “social bot” to the foreground and first estimated their prevalence on Twitter in 2017.
Based on our knowledge and experience, we believe that estimating the percentage of bots on Twitter has become a very difficult task, and a discussion about the accuracy of the estimate might miss the point. Here’s why.
What exactly is a bot?
To measure the prevalence of problematic accounts on Twitter, clear definitions are necessary. Common terms such as “fake accounts,” “spam accounts” and “bots” are used interchangeably, but they have different meanings.
Fake accounts are those that impersonate people. Spammers are accounts that mass-produce unsolicited content, typically advertising. Bots, on the other hand, are accounts controlled in part by software; they can post content or perform simple interactions, such as retweeting, automatically.
These types of accounts often overlap. For instance, you can create a bot that impersonates a human to post spam automatically. Such an account is simultaneously a bot, a spammer and a fake. But not every fake account is a bot or a spammer, and vice versa. Coming up with an estimate without a clear definition only yields misleading results.
Defining and distinguishing account types can also inform the right interventions.
Fake and spam accounts pollute the online environment and violate platform policy. Malicious bots are used to spread misinformation, inflate popularity, exacerbate conflict through negative and inflammatory content, manipulate opinions, influence elections, conduct financial fraud and disrupt communication.
That said, some bots provide useful services, such as sharing news and delivering alerts. Simply banning all bots is not in the best interest of social media users.
For simplicity, researchers use the term “inauthentic accounts” to refer to the collection of fake accounts, spammers and malicious bots. This appears to be the definition Twitter uses as well. However, it is unclear what Musk has in mind.
Hard to count
Even if consensus is reached on a definition, there are still technical challenges in estimating prevalence.
External researchers do not have access to the same data as Twitter, such as IP addresses and phone numbers. This hinders the public’s ability to identify inauthentic accounts. But even Twitter acknowledges that the actual number of inauthentic accounts could be higher than it has estimated, because detection is challenging.
Inauthentic accounts evolve and develop new tactics to evade detection. For example, some fake accounts use AI-generated faces as profile pictures. These faces can be indistinguishable from real ones, even to humans. Identifying such accounts is hard and requires new technologies.
Another difficulty is posed by coordinated accounts that appear normal individually but act so similarly to one another that they are almost certainly controlled by a single entity. Yet they are like needles in the haystack of hundreds of millions of daily tweets.
The distinction between fake and genuine accounts is becoming increasingly blurred. Accounts can be hacked, bought or rented, and some users “donate” their credentials to organizations that post on their behalf. As a result, so-called “cyborg” accounts are controlled by both algorithms and humans. Similarly, spammers sometimes post legitimate content to disguise their activity.
We have observed a broad spectrum of behaviors mixing the characteristics of bots and people. Estimating the prevalence of inauthentic accounts requires applying a simplistic binary classification: authentic or inauthentic account. No matter where the line is drawn, mistakes are inevitable.
The big picture is missing
The recent debate’s focus on estimating the number of Twitter bots oversimplifies the issue and misses the point of quantifying the harm done by online abuse and manipulation through inauthentic accounts.
Through BotAmp, a new tool in the Botometer family that anyone with a Twitter account can use, we have found that the presence of automated activity is not evenly distributed. For instance, the discussion of cryptocurrencies tends to show more bot activity than the discussion of cats. Therefore, whether the overall prevalence is 5% or 20% makes little difference to individual users; their experience with these accounts depends on whom they follow and what topics they care about.
Recent evidence suggests that inauthentic accounts may not be the only culprits responsible for the spread of misinformation, hate speech, polarization and radicalization. These issues typically involve many human users. For instance, our analysis shows that misinformation about COVID-19 was disseminated openly on both Twitter and Facebook by verified, high-profile accounts.
Even if it were possible to precisely estimate the prevalence of inauthentic accounts, this would do little to solve these problems. A meaningful first step would be to acknowledge the complex nature of these issues. This will help social media platforms and policymakers develop meaningful responses.
Kai-Cheng Yang is a doctoral student in informatics at Indiana University. Filippo Menczer is a professor of informatics and computer science at Indiana University.
This commentary was originally published by The Conversation: “How many bots are on Twitter? The question is hard to answer and misses the point.”