Our web apps are often written in Python, and in production we usually run them with Gunicorn. Its log is a useful resource for incident investigation, but bot visits make it very noisy. How can we exclude them?
When running Gunicorn, we usually keep its settings in a config file. We often name it gunicorn_conf.py, with content like this:
 = 
 = 6
 = 
# Make short log line. Some info is discarded, because it is shown by journalctl already.
 = 
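For readers who want something concrete to start from, here is a minimal sketch of what such a config file might look like. The setting names (bind, workers, accesslog, access_log_format) are real Gunicorn settings, but every value below is an assumption for illustration, including the socket path and the exact log format.

```python
# gunicorn_conf.py (illustrative sketch; all values are assumptions)

# Hypothetical socket path; use whatever your reverse proxy expects.
bind = "unix:/run/our-web.sock"
workers = 6
# "-" sends the access log to stdout, which systemd forwards to the journal.
accesslog = "-"
# Short log line: no timestamp, because journalctl already shows one.
# h = remote address, r = request line, s = status,
# b = response length, a = user agent.
access_log_format = '%(h)s "%(r)s" %(s)s %(b)s "%(a)s"'
```

Keeping the user agent (a) in the format is not required for filtering, but it helps when checking that the filter drops the right requests.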
To tell Gunicorn not to log bot visits, we will manipulate Gunicorn's logger object. First, define a function to identify bots (search bots and crawlers) and a logging filter class:
     = 
    return 
         = 
        # Ref: https://docs.gunicorn.org/en/stable/settings.html#access-log-format
         = 
         = 
        return  and not 
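A simplified sketch of this step follows. It relies on the fact that Gunicorn passes the access log "atoms" (the keys from the access log format reference) as the log record's arguments, so the user agent is available under the key a. The marker list and the names BOT_MARKERS, is_bot and BotFilter are assumptions for illustration, and this version filters on the user agent only.

```python
import logging
from collections.abc import Mapping

# Hypothetical substrings that commonly appear in bot user agents.
BOT_MARKERS = ("bot", "crawl", "spider", "slurp")


def is_bot(user_agent):
    """Return True if the user agent string looks like a bot."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


class BotFilter(logging.Filter):
    def filter(self, record):
        # For access log records, record.args is the mapping of atoms.
        # Ref: https://docs.gunicorn.org/en/stable/settings.html#access-log-format
        args = record.args
        if not isinstance(args, Mapping):
            # Not an access log record we understand; keep it.
            return True
        user_agent = args.get("a", "")
        # Returning False drops the record.
        return not is_bot(user_agent)
```

Substring matching is crude and can misfire on unusual user agents, so the marker list is worth tuning against your own logs.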
Then, we put the logger setup code into Gunicorn's on_starting hook:
    
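As a sketch, the hook can look like the snippet below. It assumes that the object passed to on_starting exposes Gunicorn's logger as server.log, and that its access_log attribute is the standard library logger named gunicorn.access. In a real config file BotFilter would be the filter class defined earlier in the same file; a trivial placeholder is included here only so the snippet runs by itself.

```python
import logging


class BotFilter(logging.Filter):
    """Placeholder; replace with the real bot filter class."""

    def filter(self, record):
        return True


def on_starting(server):
    # Called once in the master process, before workers are forked,
    # so the workers inherit the filtered logger.
    server.log.access_log.addFilter(BotFilter())
```

Attaching the filter to the access logger only leaves the error log untouched, so real problems are still recorded.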
Done. If you let systemd control Gunicorn, you can use systemctl to tell Gunicorn to re-read the new config (assuming your systemd unit file is our-web.service):
$ sudo systemctl reload our-web.service
Gunicorn's way of using a Python script for configuration looks weird at first. But in some situations, like this one, it is an advantage.