The Bots That Are Not

By Mike Hearn | The Daily Sceptic | September 10, 2021

Since 2016 automated Twitter accounts have been blamed for Donald Trump and Brexit (many times), Brazilian politics, Venezuelan politics, skepticism of climatology, cannabis misinformation, anti-immigration sentiment, vaping, and, inevitably, distrust of COVID vaccines. News articles about bots are backed by a surprisingly large amount of academic research. Google Scholar alone indexes nearly 10,000 papers on the topic. Some of these papers received widespread coverage:

Unfortunately there’s a problem with this narrative: it is itself misinformation. Bizarrely and ironically, universities are propagating an untrue conspiracy theory while simultaneously claiming to be defending the world from the very same.

The visualization above comes from “The Rise and Fall of Social Bot Research” (also available in talk form). It was quietly uploaded to a preprint server in March by Gallwitz & Kreil, two German investigators, and has received little attention since. Yet their work completely destroys the academic field of bot research to such an extreme extent that it’s possible there are no true scientific papers on the topic at all.

The authors identify a simple problem that crops up in every study they looked at. Unable to directly detect bots because they don’t work for Twitter, academics come up with proxy signals that are asserted to imply automation but which actually don’t. For example, Oxford’s Computational Propaganda Project – responsible for the first paper in the diagram above – defined a bot as any account that tweets more than 50 times per day. That’s a lot of tweeting but easily achieved by heavy users, like the famous journalist Glenn Greenwald, the slightly less famous member of the German Parliament Johannes Kahrs – who has in the past managed to rack up an astounding 300 tweets per day – or indeed Donald Trump, who exceeded this threshold on six different days during 2020. Bot papers typically don’t provide examples of the bot accounts they claimed to identify, but in this case four were presented. Of those, three were trivially identifiable as (legitimate) bots because they actually said they were bots in their account metadata, and one was an apparently human account claimed to be a bot with no evidence. On this basis the authors generated 27 news stories and 323 citations, although the paper was never peer reviewed.
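To make the weakness of such a proxy concrete, here is a minimal Python sketch of a rule of the form "more than 50 tweets per day means bot". It is an illustration only, not code from any of the papers: the `Account` class, the handles and the activity figures are hypothetical, except that 300 tweets per day matches the level mentioned above for a heavy human user.

```python
from dataclasses import dataclass

# Cut-off described in the text: more than 50 tweets per day.
TWEETS_PER_DAY_THRESHOLD = 50


@dataclass
class Account:
    handle: str            # hypothetical handle, for illustration only
    tweets_per_day: float  # illustrative activity level
    is_automated: bool     # ground truth, which outside researchers cannot observe


def flagged_as_bot(account: Account) -> bool:
    """Apply the proxy rule: high tweet volume is treated as automation."""
    return account.tweets_per_day > TWEETS_PER_DAY_THRESHOLD


# Made-up accounts covering the four possible outcomes.
accounts = [
    Account("heavy_human_user", 300, False),
    Account("casual_human_user", 4, False),
    Account("quiet_scheduler_bot", 12, True),
    Account("noisy_feed_bot", 500, True),
]

false_positives = [
    a.handle for a in accounts if flagged_as_bot(a) and not a.is_automated
]
false_negatives = [
    a.handle for a in accounts if not flagged_as_bot(a) and a.is_automated
]

print("humans flagged as bots:", false_positives)
print("bots missed:", false_negatives)
```

The rule only measures volume, so it errs in both directions: a prolific human is flagged, and a low-volume automated account passes unnoticed.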

In 2017 I investigated the Berkeley/Swansea paper and found that it was doing something very similar, but using an even laxer definition. Any account that regularly tweeted more than five times after midnight from a smartphone was classed as a bot. Obviously, this is not a valid way to detect automation. Despite being built on nonsensical premises, invalid modelling, mis-characterisations of its own data and once again not being peer reviewed, the authors were able to successfully influence the British Parliament. Damian Collins, the Tory MP who chaired the DCMS Select Committee at the time, said: “This is the most significant evidence yet of interference by Russian-backed social media accounts around the Brexit referendum. The content published and promoted by these accounts is clearly designed to increase tensions throughout the country and undermine our democratic process. I fear that this may well be just the tip of the iceberg.”

But since 2019 the vast majority of papers about social bots rely on a machine learning model called ‘Botometer’. The Botometer is available online and claims to measure the probability of any Twitter account being a bot. Created by a pair of academics in the USA, it has been cited nearly 700 times and generates a continual stream of news stories. The model is frequently described as a “state of the art bot detection method” with “95% accuracy”.

That claim is false. The Botometer’s false positive rate is so high it is practically a random number generator. A simple demonstration of the problem was the distribution of scores given to verified members of the U.S. Congress:

In experiments run by Gallwitz & Kreil, nearly half of Congress were classified as more likely to be bots than human, along with 12% of Nobel Prize laureates, 17% of Reuters journalists, 21.9% of the staff members of UN Women and – inevitably – U.S. President Joe Biden.
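The arithmetic behind this kind of check can be sketched in a few lines of Python. This is a hedged illustration, not the method used by Gallwitz & Kreil: the scores are made up, and reading "more likely to be a bot than human" as a score above 0.5 is an assumption. The second function shows why a high false positive rate matters so much, using an assumed bot prevalence of 5% and an assumed detection rate of 95%.

```python
def false_positive_rate(scores_of_known_humans, threshold=0.5):
    """Share of verified-human accounts whose bot score exceeds the threshold."""
    flagged = sum(1 for score in scores_of_known_humans if score > threshold)
    return flagged / len(scores_of_known_humans)


def precision(prevalence, true_positive_rate, fpr):
    """Fraction of flagged accounts that really are bots (Bayes' rule)."""
    true_positives = prevalence * true_positive_rate
    false_positives = (1 - prevalence) * fpr
    return true_positives / (true_positives + false_positives)


# Made-up scores for eight accounts known to be human.
human_scores = [0.1, 0.7, 0.4, 0.9, 0.2, 0.6, 0.3, 0.8]
fpr = false_positive_rate(human_scores)

# Assumed figures: 5% of accounts are bots and 95% of them are caught.
share_of_flags_that_are_bots = precision(
    prevalence=0.05, true_positive_rate=0.95, fpr=fpr
)

print(f"false positive rate: {fpr:.2f}")
print(f"share of flagged accounts that are bots: {share_of_flags_that_are_bots:.3f}")
```

Under these assumed numbers, a detector that flags half of all humans would be wrong about roughly ten out of every eleven accounts it labels as bots, even if it caught almost every real bot.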

But detecting the false positive problem did not require compiling lists of verified humans. One study that claimed to identify around 190,000 bots included the following accounts in its set:

Taken from a dataset shared by Dunn et al.

The developers of the Botometer know it doesn’t work. After the embarrassing U.S. Congress data was published, an appropriate response would have been retraction of their paper. But that would have implied that all the papers that relied upon it should also be retracted. Instead they hard-coded the model to know that Congress are human and then went on the attack, describing their critics as “academic trolls”:

Root cause analysis

This story is a specific instance of a general problem that crops up frequently in bad science. Academics decide a question is important and needs to be investigated, but they don’t have sufficiently good data to draw accurate conclusions. Because there are no incentives to recognize that and abandon the line of inquiry, they proceed regardless and make claims that end up being drastically wrong. Anyone from outside the field who points out what’s happening is simply ignored, or attacked as “not an expert” and thus inherently illegitimate.

Although no actual expertise is required to spot the problems in this case, I can nonetheless criticize their work with confidence because I actually am an expert in fighting bots. As a senior software engineer at Google I initiated and designed one of their most successful bot detection platforms. Today it checks over a million actions per second for malicious automation across the Google network. A version of it was eventually made available to all websites for free as part of the ReCAPTCHA system, providing an alternative to the distorted word puzzles you may remember from the earlier days of the internet. Those often frustrating puzzles were slowly replaced in recent years by simply clicking a box that says “I’m not a bot”. The latest versions go even further and can detect bots whilst remaining entirely invisible.

Exactly how this platform works is a Google trade secret, but when spammers discuss ideas for beating it they are well aware that it doesn’t use the sort of techniques academics do. Despite the frequent claim that Botometer is “state of the art”, in reality it is primitive. Genuinely state-of-the-art bot detectors use a correct definition of bot based on how actions are being performed. Spammers are forced to execute polymorphic encrypted programs that detect signs of automation at the protocol and API level. It’s a battle between programmers, and how it works wouldn’t be easily explainable to social scientists.

Spam fighters at Twitter have an equally low opinion of this research. They noted in 2020 that tools like Botometer use “an extremely limited approach” and “do not account for common Twitter use cases”. “Binary judgments of who’s a “bot or not” have real potential to poison our public discourse – particularly when they are pushed out through the media …. the narrative on what’s actually going on is increasingly behind the curve.”

Many fields cannot benefit from academic research because academics cannot obtain sufficiently good data with which to draw conclusions. Unfortunately, they sometimes have difficulty accepting that. When I ended my 2017 investigation of the Berkeley/Swansea paper by observing that social scientists can’t usefully contribute to fighting bots, an academic posted a comment calling it “a Trumpian statement” and argued that tech firms should release everyone’s private account data to academics, due to their capacity for “more altruistic” insights. Yet their self-proclaimed insights are usually far from altruistic. The ugly truth is that social bot research is primarily a work of ideological propaganda. Many bot papers use the supposed prevalence of non-existent bots to argue for censorship and control of the internet. Too many people disagree with common academic beliefs. If only social media were edited by the most altruistic and insightful members of society, they reason, nobody would ever disagree with them again.

September 10, 2021 - Posted by aletho | Deception, Full Spectrum Dominance, Progressive Hypocrite, Science and Pseudo-Science, Timeless or most popular
