Open source developers battle AI crawlers with cunning and determination

AI web-crawling bots are the cockroaches of the internet, many software developers believe. Some devs have started fighting back in ingenious, often humorous ways.

While any website might be targeted by bad crawler behavior — sometimes taking down the site — open source developers are “disproportionately” impacted, writes Niccolò Venerandi, developer of a Linux desktop known as Plasma and owner of the blog LibreNews.

By their nature, sites hosting free and open source (FOSS) projects share more of their infrastructure publicly, and they also tend to have fewer resources than commercial products.

The issue is that many AI bots don’t honor the Robots Exclusion Protocol’s robots.txt file, the tool that tells bots what not to crawl, which was originally created for search engine bots.
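To illustrate why robots.txt only works for well-behaved bots, here is a minimal sketch of the check a polite crawler would run before fetching a page, using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules, and the `example.org` URLs are made up for illustration and are not taken from any site mentioned here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one named AI bot entirely,
# and keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before every request.
print(parser.can_fetch("ExampleAIBot", "https://example.org/repo/"))     # False
print(parser.can_fetch("SomeBrowser", "https://example.org/repo/"))      # True
print(parser.can_fetch("SomeBrowser", "https://example.org/private/x"))  # False
```

Nothing on the server enforces these answers. A crawler that skips the check, or that sends a different user-agent string, is not slowed down at all, which is the behavior the developers quoted here describe.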

In a “cry for help” blog post in January, FOSS developer Xe Iaso described how AmazonBot relentlessly pounded on a Git server website to the point of causing DDoS outages. Git servers host FOSS projects so that anyone who wants can download the code or contribute to it.

But this bot ignored Iaso’s robots.txt, hid behind other IP addresses, and pretended to be other users, Iaso said.

“It’s futile to block AI crawler bots because they lie, change their user agent, use residential IP addresses as proxies, and more,” Iaso lamented.

“They will scrape your site until it falls over, and then they will scrape it some more. They will click every link on every link on every link, viewing the same pages over and over and over and over. Some of them will even click on the same link multiple times in the same second,” the developer wrote in the post.

Enter the god of graves

So Iaso fought back with cleverness, building a tool called Anubis.

Anubis is a reverse proxy proof-of-work check that must be passed before requests are allowed to hit a Git server. It blocks bots but lets through browsers operated by humans.
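The general idea of a proof-of-work check can be sketched in a few lines. What follows is a generic hashcash-style example and not Anubis's actual code: the SHA-256 scheme, the function names, and the difficulty value are all assumptions chosen for illustration. The server hands out a random challenge, the client must find a number (a nonce) whose hash with the challenge starts with enough zeros, and the server verifies the answer with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random, unpredictable challenge string."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one cheap hash verifies the client's work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force nonces until the hash has enough leading zeros."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty):
        nonce += 1
    return nonce


# With 3 leading hex zeros the client needs about 4,096 hashes on average.
challenge = make_challenge()
nonce = solve(challenge, difficulty=3)
print(is_valid(challenge, nonce, 3))  # True
```

The asymmetry is the point of the design: a single human visitor pays the cost once and barely notices, while a crawler requesting thousands of pages pays it over and over.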

The funny part: Anubis is the name of a god in Egyptian mythology who leads the dead to judgment.

“Anubis weighed your soul (heart) and if it was heavier than a feather, your heart got eaten and you, like, mega died,” Iaso told TechCrunch. If a web request passes the challenge and is determined to be human, a cute anime picture announces success. The drawing is “my take on anthropomorphizing Anubis,” says Iaso. If it’s a bot, the request gets denied.

The wryly named project has spread like the wind among the FOSS community. Iaso shared it on GitHub on March 19, and in just a few days, it collected 2,000 stars, 20 contributors, and 39 forks.

Vengeance as defense

The instant popularity of Anubis shows that Iaso’s pain is not unique. In fact, Venerandi shared story after story:

– SourceHut’s Drew DeVault described spending “from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale,” and “experiencing dozens of brief outages per week.”
– Jonathan Corbet, a famed FOSS developer who runs LWN, warned that his site was being slowed by DDoS-level traffic “from AI scraper bots.”
– Kevin Fenzi, the sysadmin of the Fedora Linux project, had to block the entire country of Brazil from access due to aggressive AI scraper bots.

Venerandi tells TechCrunch that he knows of multiple other projects experiencing the same issues. One of them “had to temporarily ban all Chinese IP addresses at one point.”

Let that sink in for a moment: developers “even have to turn to banning entire countries” just to fend off AI bots that ignore robots.txt files, says Venerandi.

Beyond weighing the soul of a web requester, other devs believe vengeance is the best defense.

A few days ago on Hacker News, user xyzal suggested loading pages forbidden by robots.txt with “a bucket load of articles on the benefits of drinking bleach” or “articles about positive effect of catching measles on performance in bed.”

“Think we need to aim for the bots to get _negative_ utility value from visiting our traps, not just zero value,” xyzal explained.

In January, an anonymous creator named “Aaron” released a tool called Nepenthes that aims to trap AI crawlers in a maze of fake content. The dev admitted to Ars Technica that the aggressive goal might be considered malicious, and the tool is named after a carnivorous plant.

Cloudflare, a major player in offering tools to combat AI crawlers, recently released a similar tool called AI Labyrinth. This tool is designed to confuse and waste the resources of AI crawlers and other bots that ignore ‘no crawl’ directives by feeding them irrelevant content.
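A maze of this kind can be sketched as a function that turns any URL path into filler text plus links to further generated paths. This is a toy illustration of the concept only, not how Nepenthes or AI Labyrinth is implemented; the word list, the function name `maze_page`, and the link format are all invented for the example.

```python
import hashlib
import random

# Hypothetical filler vocabulary; a real trap would use more convincing text.
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur", "adipiscing", "elit"]


def maze_page(path: str, links: int = 3, words: int = 20) -> tuple[str, list[str]]:
    """Return (filler text, child links) derived deterministically from the path."""
    # Seed a private RNG from the path so the same URL always yields the same page.
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    text = " ".join(rng.choice(WORDS) for _ in range(words))
    # Every child link leads deeper into the maze, so the crawl never ends.
    children = [f"{path.rstrip('/')}/{rng.getrandbits(32):08x}" for _ in range(links)]
    return text, children


text, children = maze_page("/maze")
print(text)
print(children)
```

Because every page is computed on demand from its path, the server stores nothing, yet a crawler that follows every link finds an unbounded tree of pages with no useful content.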

SourceHut’s DeVault expressed that while Nepenthes has a satisfying sense of justice in feeding nonsense to crawlers, the solution that worked for his site was ultimately Anubis.

In a heartfelt plea, DeVault urged developers to stop legitimizing LLMs, AI image generators, GitHub Copilot, and similar tools. Since a halt in their use is unlikely, developers, particularly in FOSS, are fighting back against the crawlers with cleverness and humor.
