RSS

Gena Makhomed gmm at csdoc.com
Mon Jul 9 08:27:58 UTC 2018


On 09.07.2018 10:41, Sergey Budnevitch wrote:

>> https://hg.nginx.org/nginx/atom-log
>> https://hg.nginx.org/pkg-oss/atom-log
>> https://hg.nginx.org/nginx.org/atom-log
>>
>> stop working, instead of RSS feed html+javascript returned.
>> RSSOwl can't execute javascript and don't see these RSS feeds.
>>
>> Also I can't download http://hg.nginx.org/nginx/archive/tip.tar.gz
>> with curl - also html+javascript returned, which curl don't understand.
>>
>> can you please fix these issues?

> atom-tags, atom-log, archive were removed from bot mitigation mechanism

Thank you! But the issue is not fixed for these URLs:

http://hg.nginx.org/pkg-oss/atom-log
https://hg.nginx.org/pkg-oss/atom-log

http://hg.nginx.org/nginx.org/atom-log
https://hg.nginx.org/nginx.org/atom-log

They still return HTML+JavaScript instead of the feed.
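
A quick way to tell whether a URL returns a real feed or the
HTML+JavaScript challenge page is to look at the Content-Type header
and the first bytes of the body. The following is only a minimal
sketch: the helper names looks_like_atom_feed and check_url are
made up for illustration, and the heuristic (Atom media type, or an
XML prolog followed by a <feed element) is an assumption, not how the
server or RSSOwl actually decides.

```python
import urllib.request


def looks_like_atom_feed(content_type, body):
    """Heuristic check: does this response look like an Atom feed?

    content_type is the raw Content-Type header value (str),
    body is the first chunk of the response body (bytes).
    """
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/atom+xml":
        return True
    # Fall back to sniffing the body: XML prolog plus an Atom <feed> root.
    head = body[:512].lstrip().lower()
    return head.startswith(b"<?xml") and b"<feed" in head


def check_url(url):
    """Fetch url and report whether the response looks like an Atom feed."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        content_type = resp.headers.get("Content-Type", "")
        return looks_like_atom_feed(content_type, resp.read(512))
```

Calling check_url() on each of the URLs above would then return False
for any URL that still serves the challenge page.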

P.S.

Is the bot mitigation mechanism written as a custom C module,
or using only njs, without third-party modules?

Is it closed source, or will it be open source?

Bot mitigation via JavaScript works well against bots,
but doesn't it also prevent search engines from indexing the site?

It would be good to whitelist search engines by DNS,
at least the main ones: Google, Yandex, and Bing.

https://support.google.com/webmasters/answer/80553?hl=en

https://yandex.com/support/webmaster/robot-workings/check-yandex-robots.xml

https://www.bing.com/webmaster/help/how-to-verify-bingbot-3905dc26
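
The procedure these pages describe is a forward-confirmed reverse DNS
check: resolve the client IP to a hostname, check that the hostname
belongs to the search engine's domain, then resolve the hostname back
and confirm it maps to the same IP. A minimal sketch follows. The
function name is_verified_crawler and the suffix list are
illustrative assumptions; the suffixes should be re-checked against
the pages linked above, and the forward lookup used here handles IPv4
only.

```python
import socket

# Hostname suffixes the search engines document for their crawlers.
# Assumption for illustration: verify against the vendors' current pages.
CRAWLER_SUFFIXES = (
    ".googlebot.com", ".google.com",
    ".yandex.ru", ".yandex.net", ".yandex.com",
    ".search.msn.com",
)


def is_verified_crawler(ip,
                        reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Return True if ip passes a forward-confirmed reverse DNS check.

    The two lookup functions are parameters so that they can be
    replaced in tests; by default the system resolver is used.
    """
    try:
        # Step 1: reverse lookup, IP -> hostname.
        hostname = reverse_lookup(ip)[0].rstrip(".").lower()
    except OSError:
        return False
    # Step 2: the hostname must be inside a known crawler domain.
    if not hostname.endswith(CRAWLER_SUFFIXES):
        return False
    try:
        # Step 3: forward lookup, hostname -> list of IPv4 addresses.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # Step 4: the original IP must be among the forward results.
    return ip in addresses
```

DNS lookups are slow, so in practice the result would need to be
cached per IP rather than computed on every request.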

But even when search engine bots are whitelisted, search engines
may detect this as cloaking: https://en.wikipedia.org/wiki/Cloaking

A partial solution is to serve HTML+JavaScript only to clients that
other bot mitigation mechanisms have already identified as bots.

P.P.S.

I wrote and use a different bot mitigation mechanism with nginx:

https://github.com/makhomed/autofilter

-- 
Best regards,
  Gena


More information about the nginx-devel mailing list