Status: Not open for further replies. This suggestion has been closed; votes are no longer accepted.
I'm not entirely sure what you mean; are you requesting the ability to ban a certain spider (such as Yandex), or to ban all spiders from accessing a certain page?
 
Yandex and other aggressive spiders do not honor robots.txt.

We used to have mod code that blocked those spiders, but we moved much of that code into the Apache2 configuration.

In my view, it's not really "core SEO" functionality so much as a "manage user agents" application, and there is a lot that can be done with managing user agents:

-- banning user agents
-- redirecting user agents (see the sketch after this list)
-- triggering mobile apps based on user agent
-- hiding content from certain user agents
-- blocking access to certain pages based on user agent
-- restricting certain forum features from user agents
-- statistics based on user agents
-- displaying user agents on the site (all, not only bots)
-- managing user agents in the who's-online bot display

etc.
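As a sketch of the redirection item above, assuming Apache with mod_rewrite enabled: the agent strings and the /mobile/ path here are illustrative placeholders, not part of any existing setup.

Code:
# Hypothetical example: send recognised mobile agents to a separate
# mobile front end. Agent strings and /mobile/ are placeholders.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "iPhone|Android" [NC]
# Guard against a redirect loop once the visitor is already on /mobile/
RewriteCond %{REQUEST_URI} !^/mobile/
RewriteRule ^(.*)$ /mobile/$1 [R=302,L]

The R=302 flag issues a temporary redirect, so the original URLs remain the canonical ones.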
 
We have no current plans to add this functionality, because it is better handled by modifying .htaccess files or the Apache configuration.

Here's an example:
Code:
# Flag unwanted crawlers by User-Agent. SetEnvIfNoCase matches its
# regex case-insensitively, anywhere in the header value, so no
# leading or trailing .* is needed.
SetEnvIfNoCase User-Agent "^Yandex" bad_bot
SetEnvIfNoCase User-Agent "rogerbot" bad_bot
SetEnvIfNoCase User-Agent "exabot" bad_bot
SetEnvIfNoCase User-Agent "mj12bot" bad_bot
SetEnvIfNoCase User-Agent "dotbot" bad_bot
SetEnvIfNoCase User-Agent "gigabot" bad_bot
SetEnvIfNoCase User-Agent "ahrefsbot" bad_bot
SetEnvIfNoCase User-Agent "sitebot" bad_bot
SetEnvIfNoCase User-Agent "semrushbot" bad_bot

# Deny flagged agents (Apache 2.2 Order/Allow/Deny syntax);
# <Limit> scopes this to GET, POST, and HEAD requests only.
<Limit GET POST HEAD>
	Order Allow,Deny
	Allow from all
	Deny from env=bad_bot
</Limit>
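One caveat: Order/Allow/Deny is Apache 2.2 syntax and is deprecated on Apache 2.4 and later. A roughly equivalent sketch for 2.4, reusing the same bad_bot variable set above, would be:

Code:
# Apache 2.4+ equivalent using mod_authz_core / mod_authz_host;
# negation with "Require not" must sit inside <RequireAll>.
<RequireAll>
	Require all granted
	Require not env bad_bot
</RequireAll>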
 