Nginx Archives - 河马的深度解析

Using map in Nginx to check whether a header field exists

Posted on 2021/10/26 by 河小马 in Nginx, Openresty

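If a header field is absent from the request, the corresponding $http_* variable evaluates to the empty string "". This note covers request-header variables only.

So a map like the following can test whether a header field is present and fall back to another value when it is not.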

map $http_cf_connecting_ip $client_ip_from_cf {
default $http_cf_connecting_ip;
"" $remote_addr;
}
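
As a sketch of how the mapped variable might then be used, the fragment below writes it to the access log and forwards it to a backend. The log format name cf_aware, the log path, and the backend address 127.0.0.1:8080 are assumptions made for illustration, not part of the original setup.

```nginx
# http context. The map is repeated so the fragment is self-contained.
map $http_cf_connecting_ip $client_ip_from_cf {
    default $http_cf_connecting_ip;  # header present: use its value
    ""      $remote_addr;            # header absent: fall back to the peer address
}

# "cf_aware" is a hypothetical log format name.
log_format cf_aware '$client_ip_from_cf - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent';

server {
    listen 80;
    access_log /var/log/nginx/access.log cf_aware;

    location / {
        # 127.0.0.1:8080 is an assumed backend address.
        proxy_set_header X-Real-IP $client_ip_from_cf;
        proxy_pass http://127.0.0.1:8080;
    }
}
```

Keep in mind that any client can send its own CF-Connecting-IP header, so the value is only trustworthy when the server accepts traffic from Cloudflare alone.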

Using multiple map conditions in Nginx (conditional blocks)

Posted on 2021/10/25 by 河小马 in Nginx

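Inside an Nginx location block, if is evil.

So a series of if conditions can be converted into map conditions.

There are generally two ways to do this. The first is a single map keyed on two variables joined by a colon. The second is to use several maps, where a later map refers to the variable defined by an earlier one, forming a map chain. Details below:

First approach: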

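map "$http_x_target:$arg_target" $destination {
default upstream0;
# Regexes are tested in order of appearance and the first match wins,
# so the more specific patterns come before the general one.
~something2 upstream1;
~something3 upstream2;
~something upstream1;
}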
...
server {
location / {
proxy_pass https://$destination;
}
}
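
Because the key is one string of the form header:arg, anchored regular expressions can express "both", "header only", and "argument only" conditions. A minimal sketch, with the values beta and canary chosen purely for illustration and the upstream names reused from the example above:

```nginx
# http context
map "$http_x_target:$arg_target" $destination {
    default          upstream0;
    ~^beta:canary$   upstream2;   # header is "beta" AND argument is "canary"
    ~^beta:          upstream1;   # header is "beta", any argument
    ~:canary$        upstream1;   # argument is "canary", any header
    ~^:$             upstream0;   # neither header nor argument was sent
}
```

The first matching regular expression wins, so the most specific pattern is listed first. This sketch assumes that neither value contains a colon itself; if one might, a separator that cannot occur in the values would be a safer choice.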

Second approach:

map $arg_target $arg_destination {
default upstream0;
something upstream1;
something2 upstream1;
something3 upstream2;
}
map $http_x_target $destination {
default $arg_destination;
something upstream1;
something2 upstream1;
something3 upstream2;
}
...
server {
location / {
proxy_pass https://$destination;
}
}

References:

https://stackoverflow.com/questions/59671623/conditionally-map-values-in-nginx-config
https://gock.net/blog/2020/nginx-conditional-logging-responses/

Blocking certain UAs (User Agents) in Nginx

Posted on 2021/10/25 by 河小马 in Nginx

Usually an Nginx if statement combined with a regular expression is all it takes, for example:

# case sensitive matching
if ($http_user_agent ~ (Antivirx|Arian)) {
return 403;
}

# case insensitive matching
if ($http_user_agent ~* (netcrawl|npbot|malicious)) {
return 403;
}

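However, when the list of user agents to block grows long, if statements with regular expressions become unwieldy and can also hurt performance, so the Nginx project generally recommends using map in place of if: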

map $http_user_agent $badagent {
default 0;
~*malicious 1;
~*backdoor 1;
~*netcrawler 1;
~Antivirx 1;
~Arian 1;
~webbandit 1;
}

if ($badagent) {
return 403;
}
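
When the list gets long, the entries can also live in a file of their own, since map accepts an include parameter. A sketch, assuming a hypothetical file /etc/nginx/badagents.map that holds one pattern per line in the same syntax as above:

```nginx
# http context
map $http_user_agent $badagent {
    default 0;
    # Hypothetical path; the file would contain lines such as:
    #   ~*malicious 1;
    #   ~Antivirx   1;
    include /etc/nginx/badagents.map;
}

server {
    listen 80;

    # server (or location) context
    if ($badagent) {
        return 403;
    }
}
```

This keeps the main configuration short and lets the block list be updated without touching the server blocks.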

Getting the real visitor IP in Nginx for a site behind Cloudflare

Posted on 2021/09/21 by 河小马 in Cloudflare, Nginx

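Generally, three headers are available:

CF-Connecting-IP, which shows up in the Nginx log as cf-connecting-ip.
True-Client-IP, available to Cloudflare Enterprise customers only, which shows up in the Nginx log as true-client-ip.
X-Forwarded-For, the one seen most often in the Nginx log, as x-forwarded-for. X-Forwarded-For is really a list: it records, in order, the user's real IP followed by the proxies the request passed through.

For example: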

X-Forwarded-For: 203.0.113.1

This means the user's real IP is 203.0.113.1.

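X-Forwarded-For: 203.0.113.1,198.51.100.101,198.51.100.102

Here the user's real IP is still 203.0.113.1, the leftmost entry. The request then passed through the proxies 198.51.100.101 and 198.51.100.102, in that order, before reaching the Cloudflare edge node.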

In summary, Cloudflare recommends the CF-Connecting-IP and True-Client-IP headers, because each of them is guaranteed to carry exactly one IP.
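
One way to act on this in Nginx is the realip module, which rewrites the client address from a trusted header. The fragment below is a minimal sketch, not a complete setup: it assumes Nginx was built with ngx_http_realip_module, and the two address ranges are only examples of Cloudflare ranges. The full, current list should be taken from Cloudflare's published IP ranges (https://www.cloudflare.com/ips/) rather than from this sketch.

```nginx
# http or server context.
# Requires ngx_http_realip_module (--with-http_realip_module at build time).

# Trust the header only when the connection comes from Cloudflare.
# Example ranges only; use the complete list published by Cloudflare.
set_real_ip_from 173.245.48.0/20;
set_real_ip_from 103.21.244.0/22;

# Replace the client address with the single IP carried in CF-Connecting-IP.
real_ip_header CF-Connecting-IP;
```

With this in place, $remote_addr holds the visitor's address instead of the Cloudflare edge address, and the original peer address stays available as $realip_remote_addr.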

Analyzing Nginx logs with AWK

Posted on 2019/12/24 by 河小马 in Nginx

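There are all sorts of visualization tools for analyzing Nginx logs, but they are a hassle to install and configure. Most of the time a quick analysis is all we need, and awk handles that perfectly well. The commands below assume the default combined log format.

1. Top 10 IPs by number of requests

Method 1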

awk '{a[$1]++}END{for(i in a)print a[i],i|"sort -k1 -nr|head -n10"}' access.log

Method 2

awk '{print $1}' access.log |sort |uniq -c |sort -k1 -nr |head -n10

2. IPs with more than 100 requests

Method 1

awk '{a[$1]++}END{for(i in a){if(a[i]>100)print i,a[i]}}' access.log

Method 2

awk '{a[$1]++;if(a[$1]>100){b[$1]++}}END{for(i in b){print i,a[i]}}' access.log

3. Top 10 IPs by number of requests on 24 December 2019

Method 1

awk '$4>="[24/Dec/2019:00:00:00" && $4<="[24/Dec/2019:23:59:59" {a[$1]++}END{for(i in a)print a[i],i|"sort -k1 -nr|head -n10"}' access.log

Method 2

sed -n '/\[24\/Dec\/2019:/p' access.log |awk '{print $1}' |sort |uniq -c |sort -k1 -nr |head -n10

4. Top 10 most requested pages (the URI in $request)

awk '{a[$7]++}END{for(i in a)print a[i],i|"sort -k1 -nr|head -n10"}' access.log

5. Count spider crawls

grep 'Baiduspider' access.log |wc -l

Count spider requests that returned 404

grep 'Baiduspider' access.log |awk '$9==404' |wc -l

Using Cloudflare together with Nginx to block certain countries and user agents

Posted on 2019/12/24 by 河小马 in Cloudflare, Nginx

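The free Cloudflare plan can already do all of this well from its control panel, but all Cloudflare can do is block certain countries or user agents. To make better use of that traffic, for example by redirecting different geos to different sites or showing certain spiders a different site, Nginx has to help. The main tool here is the Nginx map feature.

Controlling by geo

If your site is behind Cloudflare, Cloudflare by default adds the CF-IPCountry header (HTTP_CF_IPCOUNTRY, which Nginx exposes as $http_cf_ipcountry). In effect Cloudflare provides a free IP database. With the Nginx map feature, blocking traffic from, say, the US, Canada, the UK, and Australia can be configured as follows:

In the Nginx global (http) context: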

map $http_cf_ipcountry $allow {
default yes;
US no;
CA no;
# Cloudflare reports ISO 3166-1 alpha-2 codes, so the United Kingdom is GB, not UK.
GB no;
AU no;
}

In the server block:

if ($allow = no) {
return 403;
}

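Note that, according to the official Nginx documentation, map can only appear in the http block, whereas return is more flexible and can go in server and location blocks.

Going a step further, return is not limited to 403. It can also send 301, 302, and so on. For example, to redirect traffic from the US, Canada, the UK, and Australia to Google: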

if ($allow = no) {
return 301 https://www.google.com;
}
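
The redirect target itself can also come from the map, so that each country goes to its own site, as mentioned at the top of this post. A minimal sketch; the example.com hostnames and the variable name $geo_redirect are made up for illustration:

```nginx
# http context
map $http_cf_ipcountry $geo_redirect {
    default "";                      # empty string: no redirect
    US      https://us.example.com;  # hypothetical per-country sites
    CA      https://ca.example.com;
}

server {
    listen 80;
    server_name example.com;         # hypothetical

    # An empty value is false in "if", so only mapped countries are redirected.
    if ($geo_redirect) {
        return 302 $geo_redirect$request_uri;
    }
}
```

A 302 is used here so that browsers do not cache the redirect permanently while the mapping is still being tuned.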

Going further still, map is not limited to geo: it can also map the user agent and more, which is where regular expressions come in.
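
As an illustration of the user-agent case, the sketch below sends two well-known crawlers to a separate site. The crawler patterns, the variable name $is_spider, and the example.com hostnames are assumptions chosen for the example:

```nginx
# http context
map $http_user_agent $is_spider {
    default        0;
    ~*baiduspider  1;   # case-insensitive regex match
    ~*googlebot    1;
}

server {
    listen 80;
    server_name example.com;   # hypothetical

    # "0" and the empty string are false in "if"; "1" is true.
    if ($is_spider) {
        return 302 https://spiders.example.com$request_uri;
    }
}
```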

Compiling and installing Nginx on CentOS 7

Posted on 2019/12/12 by 河小马 in Nginx

For background, see: https://www.iamhippo.com/2019-12/1167.html

1) update the system and install the build tools

yum clean all
yum update -y

disable SELinux (set SELINUX=disabled in the file below, then reboot)

vi /etc/selinux/config
reboot
yum groupinstall -y 'Development Tools'

2) add the nginx user and group, identical to the ones the official nginx repo creates:

useradd --system --home /var/cache/nginx --shell /sbin/nologin --comment "nginx user" --user-group nginx

check user and group created

3) download nginx dependencies source code

cd ~
wget https://ftp.pcre.org/pub/pcre/pcre-8.43.tar.gz && tar xzvf pcre-8.43.tar.gz
wget https://www.zlib.net/zlib-1.2.11.tar.gz && tar xzvf zlib-1.2.11.tar.gz
wget http://www.openssl.org/source/openssl-1.1.1d.tar.gz && tar xzvf openssl-1.1.1d.tar.gz

wget https://nginx.org/download/nginx-1.16.1.tar.gz && tar zxvf nginx-1.16.1.tar.gz
cd nginx-1.16.1
./configure --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-pcre=../pcre-8.43 --with-pcre-jit --with-zlib=../zlib-1.2.11 --with-openssl=../openssl-1.1.1d --with-openssl-opt=no-nextprotoneg --with-debug

make
make install

# go to home

cd ~

# symlink /usr/lib64/nginx/modules to /etc/nginx/modules

ln -s /usr/lib64/nginx/modules /etc/nginx/modules
mkdir /var/cache/nginx -p
mkdir /etc/nginx/vhost -p

vi /usr/lib/systemd/system/nginx.service

[Unit]
Description=nginx - high performance web server
Documentation=http://nginx.org/en/docs/
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/var/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t -c /etc/nginx/nginx.conf
ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

reload systemd daemon

systemctl daemon-reload

start the service and enable it at boot

systemctl start nginx
systemctl enable nginx

check that nginx is enabled to start on boot

systemctl is-enabled nginx.service

check if nginx is running

systemctl status nginx

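For an Nginx compiled by hand on CentOS 7 or Debian 9/10, the default configuration is a bit weak, so I made some changes based on 军哥's LNMP configuration. From now on, any Nginx I build with the configure options of the official Nginx repo can use the nginx.conf below. It needs a vhost directory in the same directory as nginx.conf; the configuration of every individual host goes in vhost, which keeps things easy to manage.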

user  nginx nginx;

worker_processes auto;
worker_cpu_affinity auto;

error_log  /var/log/nginx/error.log  crit;

pid        /var/run/nginx.pid;

#Specifies the value for maximum file descriptors that can be opened by this process.
worker_rlimit_nofile 51200;

events
    {
        use epoll;
        worker_connections 51200;
        multi_accept off;
        accept_mutex off;
    }

http
    {
        include       mime.types;
        default_type  application/octet-stream;

        server_names_hash_bucket_size 128;
        client_header_buffer_size 32k;
        large_client_header_buffers 4 32k;
        client_max_body_size 50m;
		
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;
        proxy_busy_buffers_size 256k;

        sendfile on;
        sendfile_max_chunk 512k;
        tcp_nopush on;

        keepalive_timeout 60;

        tcp_nodelay on;

        fastcgi_connect_timeout 300;
        fastcgi_send_timeout 300;
        fastcgi_read_timeout 300;
        fastcgi_buffer_size 64k;
        fastcgi_buffers 4 64k;
        fastcgi_busy_buffers_size 128k;
        fastcgi_temp_file_write_size 256k;

        gzip on;
        gzip_min_length  1k;
        gzip_buffers     4 16k;
        gzip_http_version 1.1;
        gzip_comp_level 2;
        gzip_types     text/plain application/javascript application/x-javascript text/javascript text/css application/xml application/xml+rss;
        gzip_vary on;
        gzip_proxied   expired no-cache no-store private auth;
        gzip_disable   "MSIE [1-6]\.";

        #limit_conn_zone $binary_remote_addr zone=perip:10m;
        ##If enable limit_conn_zone,add "limit_conn perip 10;" to server section.

        server_tokens off;
        access_log off;

server
    {
        listen 80 default_server;
        #listen [::]:80 default_server ipv6only=on;
        server_name _;
        index index.html index.htm;
        root  html;

        #error_page   404   /404.html;

        # Deny access to PHP files in specific directory
        #location ~ /(wp-content|uploads|wp-includes|images)/.*\.php$ { deny all; }

        #include enable-php.conf;

        location /nginx_status
        {
            stub_status on;
            access_log   off;
        }

        location ~ .*\.(gif|jpg|jpeg|png|bmp|swf)$
        {
            expires      30d;
        }

        location ~ .*\.(js|css)$
        {
            expires      12h;
        }

        location ~ /\.well-known {
            allow all;
        }

        location ~ /\.
        {
            deny all;
        }

#       access_log  /home/wwwlogs/access.log;
    }
include vhost/*.conf;
}
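
Each site then gets its own file in the vhost directory, which the include vhost/*.conf line picks up. A minimal sketch of such a file; the file name, domain, document root, and log path are all hypothetical:

```nginx
# /etc/nginx/vhost/example.com.conf  (hypothetical file and domain)
server {
    listen 80;
    server_name example.com www.example.com;

    root  /var/www/example.com;    # assumed document root
    index index.html index.htm;

    # The main nginx.conf turns access_log off globally, so enable it per host.
    access_log /var/log/nginx/example.com.access.log;

    location / {
        try_files $uri $uri/ =404;
    }
}
```

After adding or changing a file, nginx -t checks the configuration and systemctl reload nginx applies it.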
