Proxy pass set body on if
    sanflores 
    nginx-forum at forum.nginx.org
       
    Mon Feb 22 19:15:42 UTC 2021
    
    
  
First of all, thanks for your help.
Here is my configuration:
cat nginx.conf
-----------------------------------------------------------------------------------------------------
worker_processes  auto;
error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;
events {
    worker_connections  1024;
}
http {
    default_type  application/octet-stream;
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  /var/log/nginx/access.log  main;
    sendfile        on;
    keepalive_timeout  65;
    gzip on;
    gzip_types application/javascript;
    gzip_buffers 32 8k;
    map $http_user_agent $limit_bots {
      default 0;
      ~*(google|bing|yandex|msnbot) 1;
      ~*(AltaVista|Googlebot|Slurp|BlackWidow|Bot|ChinaClaw|Custo|DISCo|Download|Demon|eCatch|EirGrabber|EmailSiphon|EmailWolf|SuperHTTP|Surfbot|WebWhacker) 1;
      ~*(Express|WebPictures|ExtractorPro|EyeNetIE|FlashGet|GetRight|GetWeb!|Go!Zilla|Go-Ahead-Got-It|GrabNet|Grafula|HMView|Go!Zilla|Go-Ahead-Got-It) 1;
      ~*(rafula|HMView|HTTrack|Stripper|Sucker|Indy|InterGET|Ninja|JetCar|Spider|larbin|LeechFTP|Downloader|tool|Navroad|NearSite|NetAnts|tAkeOut|WWWOFFLE) 1;
      ~*(GrabNet|NetSpider|Vampire|NetZIP|Octopus|Offline|PageGrabber|Foto|pavuk|pcBrowser|RealDownload|ReGet|SiteSnagger|SmartDownload|SuperBot|WebSpider) 1;
      ~*(Teleport|VoidEYE|Collector|WebAuto|WebCopier|WebFetch|WebGo|WebLeacher|WebReaper|WebSauger|eXtractor|Quester|WebStripper|WebZIP|Wget|Widow|Zeus) 1;
      ~*(Twengabot|htmlparser|libwww|Python|perl|urllib|scan|Curl|email|PycURL|Pyth|PyQ|WebCollector|WebCopy|webcraw) 1;
    }
    server {
        listen       8080;
        server_name  localhost;
        root   /usr/share/nginx/html;
        server_tokens off;
        location ~ /index.html|.*\.json$ {  # Don't cache index.html and *json files
          expires -1;
          add_header Cache-Control 'no-store, no-cache, must-revalidate, proxy-revalidate, max-age=0';
          include /etc/nginx/security-headers.conf;
        }
        location ~ .*\.css$|.*\.js$ {
          add_header Cache-Control 'max-age=31449600'; # one year, as we don't care about these files because of cache busting
          include /etc/nginx/security-headers.conf;
        }
        location / {
          try_files $uri$args $uri$args/ /index.html;   # Will redirect all non-existing files to index.html. TODO: Is this what we want?
          add_header Cache-Control 'max-age=86400'; # one day
          include /etc/nginx/security-headers.conf;
        }
    }
}
-----------------------------------------------------------------------------------------------------
I need to send all crawlers on the list to a puppeteer server that will
render the webpage and return the static HTML. I'm able to achieve that with
this configuration:
        proxy_pass http://localhost:3000/puppeteer/download/html/;
        proxy_method GET;
        proxy_set_header content-type "application/json";
        proxy_pass_request_body off;
        # $uri already starts with "/", so no extra slash after the host
        proxy_set_body "{\"url\":\"https://example.com$uri\"}";
What I'm not able to do is use this proxy_pass inside an if statement,
because nginx does not accept a proxy_pass with a URI part there:
nginx: [emerg] "proxy_pass" cannot have URI part in location given by regular expression, or inside named location, or inside "if" statement, or inside "limit_except" block in /usr/local/etc/nginx/nginx.conf:
So the question is: what configuration is needed to send the crawlers
(based on $http_user_agent) to puppeteer while modifying the body?
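For reference, here is an untested sketch of one pattern that might avoid
the error: keep only a "return" inside the if, jump to a named location
through error_page, and set the upstream path with "rewrite ... break" so
that proxy_pass itself carries no URI part. The location name @puppeteer,
the variable $page_uri and the status code 418 are my own placeholders,
not something taken from a working setup.

```nginx
server {
    listen       8080;
    server_name  localhost;
    root         /usr/share/nginx/html;

    location / {
        # 418 is an arbitrary, otherwise unused status; it only serves
        # as an internal jump to the named location below.
        error_page 418 = @puppeteer;
        if ($limit_bots) {
            return 418;
        }
        try_files $uri$args $uri$args/ /index.html;
    }

    # Hypothetical named location (the name is an assumption).
    location @puppeteer {
        # Save the original path before the rewrite below changes $uri.
        set $page_uri $uri;
        # proxy_pass has no URI part here, so the upstream path is set
        # by the rewrite instead.
        rewrite ^ /puppeteer/download/html/ break;
        proxy_pass http://localhost:3000;
        proxy_method GET;
        proxy_set_header content-type "application/json";
        proxy_pass_request_body off;
        # $page_uri already starts with "/".
        proxy_set_body "{\"url\":\"https://example.com$page_uri\"}";
    }
}
```

With this layout only requests that reach "location /" are handed to
puppeteer; crawler requests matching the regex locations (index.html,
*.json, *.css, *.js) would still be served as static files.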
Thank you very much!
Posted at Nginx Forum: https://forum.nginx.org/read.php?2,290773,290829#msg-290829
    
    