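Analyzing nginx logs with AWK

There are plenty of visualization tools for analyzing nginx logs, but they are cumbersome: each one has to be installed and configured. Most of the time a simple analysis is all we need, and awk handles that well.

1. Top 10 IPs by number of requests

Method 1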

awk '{a[$1]++}END{for(i in a)print a[i],i|"sort -k1 -nr|head -n10"}' access.log

Method 2

awk '{print $1}' access.log |sort |uniq -c |sort -k1 -nr |head -n10

 

2. IPs with more than 100 requests

Method 1

awk '{a[$1]++}END{for(i in a){if(a[i]>100)print i,a[i]}}' access.log

Method 2

awk '{a[$1]++;if(a[$1]>100){b[$1]++}}END{for(i in b){print i,a[i]}}' access.log

3. Top 10 IPs for a single day (24 December 2019)

Method 1

awk '$4>="[24/Dec/2019:00:00:00" && $4<="[24/Dec/2019:23:59:59" {a[$1]++}END{for(i in a)print a[i],i|"sort -k1 -nr|head -n10"}' access.log

Method 2

sed -n '/\[24\/Dec\/2019:/p' access.log |awk '{print $1}' |sort |uniq -c |sort -k1 -nr |head -n10
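Both methods above hard-code the date. As a minimal sketch, the same count can be wrapped in a shell function that takes the day as a parameter. The function name top_ips_for_day is made up for this example, and it assumes the default combined log format, where $4 looks like "[24/Dec/2019:13:05:09".

```shell
# top_ips_for_day LOGFILE DAY [N]
# Hypothetical helper: DAY uses the access-log date form, e.g. 24/Dec/2019.
# Example call: top_ips_for_day access.log 24/Dec/2019
top_ips_for_day() {
    awk -v day="$2" '
        # keep only lines whose timestamp field starts with "[DAY:"
        index($4, "[" day ":") == 1 { hits[$1]++ }
        END { for (ip in hits) print hits[ip], ip }
    ' "$1" | sort -k1,1nr -k2,2 | head -n "${3:-10}"
}
```

Matching on the date prefix avoids having to spell out the first and last second of the day.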

4. Top 10 most requested pages (the URI part of $request)

awk '{a[$7]++}END{for(i in a)print a[i],i|"sort -k1 -nr|head -n10"}' access.log

5. Count crawler (spider) requests

grep 'Baiduspider' access.log |wc -l

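Count crawler requests that returned 404 (matching the status field $9 rather than the text 404 anywhere on the line)

grep 'Baiduspider' access.log | awk '$9==404' | wc -l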
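A slightly more general sketch breaks the crawler's requests down by status code in one pass. It assumes the default combined log format, where $9 is $status. The function name spider_status_counts is hypothetical, and the pattern defaults to Baiduspider.

```shell
# spider_status_counts LOGFILE [PATTERN]
# Hypothetical helper: prints "STATUS COUNT" for every line containing PATTERN.
# Example call: spider_status_counts access.log Baiduspider
spider_status_counts() {
    awk -v pat="${2:-Baiduspider}" '
        # index() does a plain substring match, so PATTERN is not treated as a regex
        index($0, pat) { count[$9]++ }
        END { for (s in count) print s, count[s] }
    ' "$1" | sort -k1,1n
}
```

Counting on the status field means a URL such as /404.html or a response size of 404 bytes is not mistaken for a 404 response.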

 

 

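Using Cloudflare with nginx to block certain countries and user agents

The free Cloudflare plan can already do this from its control panel. But all Cloudflare can do there is block certain countries or user agents. If we want to handle that traffic more cleverly, for example redirecting different geos to different sites, or showing certain crawlers a different site, we need nginx as well. The feature we rely on here is the nginx map directive.

Controlling access by country (geo)

If your site is behind Cloudflare, Cloudflare adds a CF-IPCountry header to each request by default (available in nginx as $http_cf_ipcountry, and to backends such as PHP as HTTP_CF_IPCOUNTRY). In effect Cloudflare gives you a free IP geolocation database. Combined with the nginx map directive, blocking traffic from, say, the US, Canada, the United Kingdom and Australia can be configured as follows:

In the nginx http block (global context):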

map $http_cf_ipcountry $allow {
default yes;
US no;
CA no;
GB no; # the United Kingdom is GB in the ISO 3166-1 alpha-2 codes that Cloudflare sends
AU no;
}

In the server block:

if ($allow = no) {
return 403;
}

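Note that, per the official nginx documentation, map may only appear in the http block, whereas return is more flexible: it is allowed in server, location and if blocks.

Going a step further, return is not limited to 403; it can also send 301, 302 and so on. For example, to redirect the traffic from those four countries to Google: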

if ($allow = no) {
return 301 https://www.google.com;
}

Going further still, map is not limited to geo: it can also map the user agent and more, which is where regular expressions come in.
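As a rough illustration of that idea, the fragment below maps the User-Agent header to a flag using case-insensitive regular expressions. The variable name $bad_bot and the crawler names are placeholders chosen for this sketch, not part of the original setup.

```nginx
# http block: set $bad_bot to 1 when the User-Agent matches one of the patterns
map $http_user_agent $bad_bot {
    default          0;
    ~*baiduspider    1;   # "~*" starts a case-insensitive regular expression
    ~*semrushbot     1;
}

# server block: "if" treats an empty string or "0" as false
server {
    listen 80;
    server_name example.com;   # hypothetical host name

    if ($bad_bot) {
        return 403;
    }
}
```

As with the geo example, the return could just as well be a 301 or 302 to another site.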

Compiling and installing nginx on CentOS 7

For background, see: https://www.iamhippo.com/2019-12/1167.html

1) update the system and install build tools

yum clean all
yum update -y

disable SELinux (set SELINUX=disabled in the file below, then reboot)

vi /etc/selinux/config
reboot
yum groupinstall -y 'Development Tools'

2) add the nginx user and group, identical to the ones the official nginx repo creates:

useradd --system --home /var/cache/nginx --shell /sbin/nologin --comment "nginx user" --user-group nginx

check that the user and group were created:

id nginx

3) download the source code of nginx and its dependencies, then configure and build

cd ~
wget https://ftp.pcre.org/pub/pcre/pcre-8.43.tar.gz && tar xzvf pcre-8.43.tar.gz
wget https://www.zlib.net/zlib-1.2.11.tar.gz && tar xzvf zlib-1.2.11.tar.gz
wget http://www.openssl.org/source/openssl-1.1.1d.tar.gz && tar xzvf openssl-1.1.1d.tar.gz
wget https://nginx.org/download/nginx-1.16.1.tar.gz && tar zxvf nginx-1.16.1.tar.gz
cd nginx-1.16.1
./configure --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-pcre=../pcre-8.43 --with-pcre-jit --with-zlib=../zlib-1.2.11 --with-openssl=../openssl-1.1.1d --with-openssl-opt=no-nextprotoneg --with-debug
make
make install

# go to home

cd ~

# symlink /usr/lib64/nginx/modules to /etc/nginx/modules

ln -s /usr/lib64/nginx/modules /etc/nginx/modules
mkdir /var/cache/nginx -p
mkdir /etc/nginx/vhost -p
vi /usr/lib/systemd/system/nginx.service
[Unit]
Description=nginx - high performance web server
Documentation=http://nginx.org/en/docs/
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/var/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t -c /etc/nginx/nginx.conf
ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

reload systemd daemon

systemctl daemon-reload

start the service and enable it at boot

systemctl start nginx
systemctl enable nginx

check that nginx is enabled at boot

systemctl is-enabled nginx.service

check if nginx is running

systemctl status nginx

Nginx config

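For an nginx compiled by hand on CentOS 7 or Debian 9/10, the default configuration is rather bare, so I adapted the configuration from the LNMP package by 军哥 (Junge) with a few changes. Any nginx built with the official nginx repo's configure arguments can use the nginx.conf below. Create a vhost directory next to nginx.conf and put each individual host's configuration there, which keeps things easy to manage.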

user  nginx nginx;

worker_processes auto;
worker_cpu_affinity auto;

error_log  /var/log/nginx/error.log  crit;

pid        /var/run/nginx.pid;

#Specifies the value for maximum file descriptors that can be opened by this process.
worker_rlimit_nofile 51200;

events
    {
        use epoll;
        worker_connections 51200;
        multi_accept off;
        accept_mutex off;
    }

http
    {
        include       mime.types;
        default_type  application/octet-stream;

        server_names_hash_bucket_size 128;
        client_header_buffer_size 32k;
        large_client_header_buffers 4 32k;
        client_max_body_size 50m;
		
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;
        proxy_busy_buffers_size 256k;

        sendfile on;
        sendfile_max_chunk 512k;
        tcp_nopush on;

        keepalive_timeout 60;

        tcp_nodelay on;

        fastcgi_connect_timeout 300;
        fastcgi_send_timeout 300;
        fastcgi_read_timeout 300;
        fastcgi_buffer_size 64k;
        fastcgi_buffers 4 64k;
        fastcgi_busy_buffers_size 128k;
        fastcgi_temp_file_write_size 256k;

        gzip on;
        gzip_min_length  1k;
        gzip_buffers     4 16k;
        gzip_http_version 1.1;
        gzip_comp_level 2;
        gzip_types     text/plain application/javascript application/x-javascript text/javascript text/css application/xml application/xml+rss;
        gzip_vary on;
        gzip_proxied   expired no-cache no-store private auth;
        gzip_disable   "MSIE [1-6]\.";

        #limit_conn_zone $binary_remote_addr zone=perip:10m;
        ##If enable limit_conn_zone,add "limit_conn perip 10;" to server section.

        server_tokens off;
        access_log off;

server
    {
        listen 80 default_server;
        #listen [::]:80 default_server ipv6only=on;
        server_name _;
        index index.html index.htm;
        root  html;

        #error_page   404   /404.html;

        # Deny access to PHP files in specific directory
        #location ~ /(wp-content|uploads|wp-includes|images)/.*\.php$ { deny all; }

        #include enable-php.conf;

        location /nginx_status
        {
            stub_status on;
            access_log   off;
        }

        location ~ .*\.(gif|jpg|jpeg|png|bmp|swf)$
        {
            expires      30d;
        }

        location ~ .*\.(js|css)?$
        {
            expires      12h;
        }

        location ~ /.well-known {
            allow all;
        }

        location ~ /\.
        {
            deny all;
        }

#       access_log  /home/wwwlogs/access.log;
    }
include vhost/*.conf;
}
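For illustration, an individual host file picked up by the include vhost/*.conf line might look like the sketch below. The file name, domain, document root and log paths are all hypothetical.

```nginx
# /etc/nginx/vhost/example.com.conf (hypothetical file name and domain)
server {
    listen 80;
    server_name example.com www.example.com;
    root  /var/www/example.com;        # assumed document root
    index index.html index.htm;

    # nginx.conf sets "access_log off" at the http level,
    # so logging has to be switched on per host if it is wanted
    access_log /var/log/nginx/example.com.access.log;
    error_log  /var/log/nginx/example.com.error.log;
}
```

Running nginx -t after adding or editing a file is a quick way to catch syntax errors before reloading.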

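Compiling and installing nginx on Debian 9/10

This tutorial applies to both Debian 9 and Debian 10.

In fact, the nginx built for the official nginx repo already enables practically every module that can be enabled, so in most cases installing from the official repo is recommended. If you want to add third-party modules, use the latest OpenSSL, or change the install paths, you have to compile it yourself. This tutorial follows the official repo's configure arguments as closely as possible when compiling and installing nginx.

On a freshly installed Debian 9 or Debian 10: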


Anatomy of the nginx installed from the nginx repo on Debian 9

Since many cloud and VPS providers do not yet offer a Debian 10 template, this post looks at Debian 9 first.

root@vultr:~# nginx -V
nginx version: nginx/1.17.6
built by gcc 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
built with OpenSSL 1.1.0k 28 May 2019 (running with OpenSSL 1.1.0l 10 Sep 2019)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -fdebug-prefix-map=/data/builder/debuild/nginx-1.17.6/debian/debuild-base/nginx-1.17.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie'

The OpenSSL that ships with Debian 9:

root@vultr:~# openssl version
OpenSSL 1.1.0l 10 Sep 2019

The OpenSSL that ships with Debian 9 is 1.1.0l, but the package was built against 1.1.0k.

After apt install gcc on Debian 9, the version you get is:

root@vultr:~# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

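This is exactly the version that nginx was built with, so the Debian 9 package appears to be built with the toolchain from the distribution's own repo rather than any special version.

Next, the user and group (from /etc/passwd):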

nginx:x:110:113:nginx user,,,:/nonexistent:/bin/false

The history of the user and group creation:

root@vultr:/var/log# grep nginx auth.log
Dec 11 09:08:55 vultr groupadd[1614]: group added to /etc/group: name=nginx, GID=113
Dec 11 09:08:55 vultr groupadd[1614]: group added to /etc/gshadow: name=nginx
Dec 11 09:08:55 vultr groupadd[1614]: new group: name=nginx, GID=113
Dec 11 09:08:55 vultr useradd[1620]: new user: name=nginx, UID=110, GID=113, home=/nonexistent, shell=/bin/false
Dec 11 09:08:55 vultr chage[1625]: changed password expiry for nginx
Dec 11 09:08:55 vultr chfn[1628]: changed user 'nginx' information

The nginx user and group were created as a system user and a system group: UID 110 and GID 113 fall inside the system account range shown in /etc/login.defs:

# Min/max values for automatic uid selection in useradd
#
UID_MIN 1000
UID_MAX 60000
# System accounts
#SYS_UID_MIN 100
#SYS_UID_MAX 999

#
# Min/max values for automatic gid selection in groupadd
#
GID_MIN 1000
GID_MAX 60000
# System accounts
#SYS_GID_MIN 100
#SYS_GID_MAX 999
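To check the same thing on another machine, a small sketch like the one below lists the accounts whose UID falls in the system range. The function name system_accounts is made up for this example, and the 100-999 defaults are taken from the commented-out SYS_UID_MIN and SYS_UID_MAX lines above; adjust them if your login.defs differs.

```shell
# system_accounts PASSWD_FILE [MIN] [MAX]
# Hypothetical helper: prints "NAME UID GID" for accounts with MIN <= UID <= MAX.
# Example call: system_accounts /etc/passwd
system_accounts() {
    awk -F: -v min="${2:-100}" -v max="${3:-999}" '
        # passwd fields: name:password:UID:GID:comment:home:shell
        $3 >= min && $3 <= max { print $1, $3, $4 }
    ' "$1"
}
```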

The nginx service unit:

root@vultr:~# nano /lib/systemd/system/nginx.service

[Unit]
Description=nginx - high performance web server
Documentation=http://nginx.org/en/docs/
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/var/run/nginx.pid
ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

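This systemd unit is identical to the one on CentOS.

Log analysis for nginx behind Cloudflare

When an nginx server sits behind Cloudflare, every IP in the default nginx access log is a Cloudflare address, so we need the log to record the real client IP instead. There are several ways to do this; the one below is probably the simplest.

Cloudflare passes the real client IP in the X-Forwarded-For header, so all we need is a new log format in the nginx configuration.

The default nginx access log format is: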

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"'; 

We can add a new log format:

log_format csf '$http_x_forwarded_for - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

The access log can then be set up along these lines:

access_log /your/access/log/path csf;

The real client IP now appears in the access log.
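One caveat when analyzing such a log with awk: if a request reaches Cloudflare with an X-Forwarded-For header already set, the value can hold several comma-separated addresses, so the first field may end in a comma. A minimal sketch that counts clients by the first address only (the function name top_client_ips is hypothetical):

```shell
# top_client_ips LOGFILE [N]
# Hypothetical helper for logs written with the csf format shown above.
# Example call: top_client_ips /your/access/log/path
top_client_ips() {
    awk '{
        ip = $1
        sub(/,$/, "", ip)   # "203.0.113.7," becomes "203.0.113.7"
        hits[ip]++
    }
    END { for (ip in hits) print hits[ip], ip }' "$1" | sort -k1,1nr -k2,2 | head -n "${2:-10}"
}
```

Only the first field is used here; on lines with several addresses the later fields shift, so scripts that read $7 or $9 would need extra care.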

Anatomy of the nginx installed from the nginx repo on CentOS 7

[root@vultr ~]# nginx -V
nginx version: nginx/1.16.1
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)
built with OpenSSL 1.0.2k-fips 26 Jan 2017
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'

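We can see that the official nginx package is built with the gcc from the stock CentOS/RHEL repo, version 4.8.5, and that OpenSSL is likewise the stock 1.0.2k (versions from 1.1.1 onward support more ciphers and protocols, such as TLS 1.3).

The resulting nginx paths are: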

nginx path prefix: "/etc/nginx"
nginx binary file: "/usr/sbin/nginx"
nginx modules path: "/usr/lib64/nginx/modules"
nginx configuration prefix: "/etc/nginx"
nginx configuration file: "/etc/nginx/nginx.conf"
nginx pid file: "/var/run/nginx.pid"
nginx error log file: "/var/log/nginx/error.log"
nginx http access log file: "/var/log/nginx/access.log"
nginx http client request body temporary files: "/var/cache/nginx/client_temp"
nginx http proxy temporary files: "/var/cache/nginx/proxy_temp"
nginx http fastcgi temporary files: "/var/cache/nginx/fastcgi_temp"
nginx http uwsgi temporary files: "/var/cache/nginx/uwsgi_temp"
nginx http scgi temporary files: "/var/cache/nginx/scgi_temp"

Next, the user and group:

vi /etc/passwd
nginx:x:997:995:nginx user:/var/cache/nginx:/sbin/nologin

The history of the user and group creation:

[root@vultr ~]# grep nginx /var/log/secure
Dec 7 11:36:28 vultr groupadd[1162]: group added to /etc/group: name=nginx, GID=995
Dec 7 11:36:28 vultr groupadd[1162]: group added to /etc/gshadow: name=nginx
Dec 7 11:36:28 vultr groupadd[1162]: new group: name=nginx, GID=995
Dec 7 11:36:28 vultr useradd[1167]: new user: name=nginx, UID=997, GID=995, home=/var/cache/nginx, shell=/sbin/nologin

The nginx service unit:

vi /usr/lib/systemd/system/nginx.service

[Unit]
Description=nginx - high performance web server
Documentation=http://nginx.org/en/docs/
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/var/run/nginx.pid
ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

 

 

Quick MySQL backup and restore

shell> mysqldump db1 > dump.sql
shell> mysqladmin create db2
shell> mysql db2 < dump.sql

Do not use --databases on the mysqldump command line because that causes USE db1 to be included in the dump file, which overrides the effect of naming db2 on the mysql command line.

Sometimes the root password is not empty, in which case we need:

shell> mysqldump -uroot -p db1 > dump.sql

This dumps the tables of db1.