2025-06-21 Trying to understand the bots

I changed the limit of my automatic ASN ban from 1000 hits in 2h to 500 hits in 2h.

That's because the two biggest autonomous systems hitting my sites are from Vietnam and China and they're currently keeping below that 1000 hits per 2h limit:

The numbers themselves are not that big, but I am annoyed. I live in an English/German world and I don't see a reason for service providers from Vietnam, China, Brazil and Romania to be crawling my sites.
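
To make the threshold change concrete, here is a minimal sketch of the rule in Python. It is not the actual implementation, which consists of fish functions. It assumes each hit has already been mapped to its autonomous system and is available as a `(timestamp, asn)` pair, and it assumes that "500 hits in 2h" means "500 or more". The function name, the AS numbers (taken from the range reserved for documentation) and the sample data are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(hours=2)  # look-back window
LIMIT = 500                  # new limit; the old one was 1000


def asns_over_limit(hits, now, limit=LIMIT, window=WINDOW):
    """Return {asn: hit count} for every ASN with at least `limit`
    hits in the `window` ending at `now`.

    `hits` is an iterable of (timestamp, asn) pairs. Mapping an IP
    address to its ASN is assumed to have happened elsewhere.
    """
    cutoff = now - window
    counts = Counter(asn for ts, asn in hits if cutoff < ts <= now)
    return {asn: n for asn, n in counts.items() if n >= limit}


# Made-up data: one AS with 600 recent hits, one with 100 recent hits
# plus 1000 hits that are older than the window.
now = datetime(2025, 6, 21, 12, 0)
hits = [(now - timedelta(seconds=i), "AS64496") for i in range(600)]
hits += [(now - timedelta(seconds=i), "AS64497") for i in range(100)]
hits += [(now - timedelta(hours=3), "AS64497")] * 1000

print(asns_over_limit(hits, now))              # {'AS64496': 600}
print(asns_over_limit(hits, now, limit=1000))  # {}
```

With the old limit of 1000, the made-up AS with 600 hits slips through; with the new limit of 500 it gets flagged.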

(You can find all the fish functions I use in the admin directory.)

Let's take a look at what they are requesting!

The Vietnamese bots:

Looks like they're following all the links, so a misbehaved bot, if you ask me. They're "hitting all the buttons" on the web app. The relevant part of `robots.txt`:

The Chinese bots:

Looks like they're following all the links, so a misbehaved bot as well.

Again, the relevant part of `robots.txt`:

The German bots actually make reasonable requests:

Let's see what sort of user agents we see. I'm expecting feed readers.
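
For readers who want to run the same kind of tally themselves, here is a rough Python sketch of counting user agents in an access log. It assumes the Apache "combined" log format, where the user agent is the last quoted field on the line; my actual logs and tooling may differ. The function name and the sample log lines (documentation IP addresses, invented user agents) are made up for illustration.

```python
import re
from collections import Counter

# In the combined log format, a line ends with: "referer" "user-agent"
# This simple pattern does not handle escaped quotes inside a user agent.
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def user_agent_counts(lines):
    """Count how often each user agent string appears in `lines`."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, not real log data.
sample = [
    '192.0.2.1 - - [21/Jun/2025:10:00:00 +0200] "GET /feed HTTP/1.1" 200 1234 "-" "ExampleFeedReader/1.0"',
    '192.0.2.2 - - [21/Jun/2025:10:00:05 +0200] "GET /feed HTTP/1.1" 200 1234 "-" "ExampleFeedReader/1.0"',
    '192.0.2.3 - - [21/Jun/2025:10:00:09 +0200] "GET /page HTTP/1.1" 200 4321 "-" "ExampleBot/2.0"',
]

for agent, count in user_agent_counts(sample).most_common():
    print(count, agent)
```

Sorting by count makes the outliers easy to spot: a feed reader should show up with a steady, modest number of hits, while a crawler tends to dominate the list.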

The one that stands out is "DataForSeoBot". But it seems that this is not a problem. I already have this bot in my Apache config (as seen on 2025-03-21 A summary of my bot defence systems). Still, *booo!* Hetzner for hosting this bot.

The French bots also seem to be reasonable:

The Brazilian bots seem to download the entire site:

Look at the requests:

Especially those searches at the bottom! The relevant part of `robots.txt`:

Same for the Romanian one:

And what I really hate are those random user agent strings.

Do I really need to go to The Ultimate Apache Bad Bot & Referrer Blocker?

​#ButlerianJihad