Blocking junk spiders in nginx
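Nowadays many AI spiders and big-data spiders quietly scrape website content every day. That they send no traffic back in return is the least of it.
The real problem is that they have no limits: once they arrive they never leave, crawling the same pages over and over and wasting a lot of server resources. If you need to, block them right away: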
Add include zzzyk_deny.conf; to your nginx configuration file:
server
{
include zzzyk_deny.conf;
listen 80;
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
return 403;
}
# Block the listed UAs and requests with an empty UA (this match is case-sensitive)
if ($http_user_agent ~ "opensiteexplorer|BLEXBot|MauiBot|SemrushBot|DotBot|WinHttp|WebZIP|FetchURL|node-superagent|java/|yisouspider|FeedDemon|Jullo|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|Java|Feedly|Apache-HttpAsyncClient|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|BOT/0.1|YandexBot|FlightDeckReports|Linguee Bot|^$") {
return 403;
}
# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
return 403;
}
}
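
The listing above shows the include line and the rules in the same server block. One possible way to split them, sketched here under assumptions, is to keep the rules in zzzyk_deny.conf and leave only the include in the server block. The file location, the shortened rule set, and the server_name example.com are illustrative and not from the original setup.

```nginx
# ---- zzzyk_deny.conf ----
# Assumed to sit next to nginx.conf; a relative include path is
# normally resolved against the nginx configuration directory.

# Block common scraping tools (case-insensitive match)
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
    return 403;
}

# Block requests that send an empty User-Agent
if ($http_user_agent = "") {
    return 403;
}

# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}

# ---- site config ----
server {
    listen 80;
    server_name example.com;   # hypothetical host name
    include zzzyk_deny.conf;   # pulls the rules above into this server
}
```

After editing, running nginx -t to check the syntax and then nginx -s reload should apply the rules. Keeping them in a separate file lets several server blocks share one list.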