---
title: How to automatically deploy a Python web application
date: 2024-12-02 10:56:17.773501 UTC
---

Recently I saw a question from a fellow Python developer: how to deploy a Django application without manually SSHing to the server and running commands. This is the single-server deployment style, where you put all the components, from the application code and database to static and media files, on the same server. No Docker involved. With this deployment strategy, we need some way to deliver a new version of our app every time new code is pushed to the "release" branch of the Git repository. Here is a guide.

Why do we need automation? Because it is boring to do these things by hand again and again:

- SSH to the server, `cd` to the installation folder.
- Run `git pull`.
- Run commands to stop your services.
- Check and install Python packages if needed.
- Run database migrations.
- Check and copy static files (CSS, JS, images) to a public folder for Nginx.
- Run commands to start your services.
- And more, depending on how complex your project is.

There are some tools to do this automatically. My preferred one is [Ansible](https://pypi.org/project/ansible/). Before diving into a detailed Ansible script, let's set a convention for how our application is laid out on the server.

## 1. Folder layout

The installation folder is */opt/ProjectName*, with this tree structure:

```
ProjectName/
├── project-name/
│   ├── pyproject.toml
│   └── manage.py
├── public/
└── venv/
```

Inside this folder, we have the *project-name* subfolder for the source code and the *public* folder for JS, CSS or any other files which are to be served by Nginx. Because this is a Python project, we also have *venv* for the Python virtual environment. Why this scheme?

- People who adopt single-server deployment often run multiple applications on the same server, so having one folder that gathers every file of a project is easier to maintain.
- *venv* is outside the source code folder to prevent us from copying or zipping it by accident when we need to copy or move the project somewhere else. And when we need to scan or search our source code, we don't waste time going through the *venv*. Because *venv* is a sibling folder, it is quick to activate the environment with this command:

  ```shell-session
  $ . ../venv/bin/activate
  ```

- The *public* folder is there for the same reason as *venv*. Note that you need to set appropriate permissions on */opt/ProjectName/public* so that Nginx can serve files from it.

## 2. Process management

To let my application start automatically when the server boots, I use [systemd](https://wiki.archlinux.org/title/Systemd). While other developers use tools like [`supervisord`](http://supervisord.org/) or [`pm2`](https://pm2.keymetrics.io/), I use systemd because:

- It is included in every Linux system. Nothing to install.
- The Linux server already uses it to manage all other system processes. It is good to use one central tool, with no extra commands to remember.
- It can start and stop services in order. We want, for example, our application to start after the database systems have started, and when we reboot the server, we want our application to stop before the databases are stopped. It is meaningless to run our application when the databases (PostgreSQL, Redis) are not available, right?
- Because systemd is used for managing the whole system, it is very powerful. It supports many more use cases to control when our application may run, or when it needs to be restarted (like when it hangs due to some mysterious bug).
- It controls the security context of our application, like which resources, and how much of them, our application can use. This limits the damage if our application is ever compromised.
- Under systemd, our application can integrate with `journald` for logging and enjoy `journald` features when debugging with the logs (see the example right below). My other article about `journald` is [here](https://quan.hoabinh.vn/post/2021/8/khoi-dau-du-an-python-nhu-the-nao-de-thuan-tien-phat-trien-len).
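For example, once the app runs as a systemd unit, its output goes to `journald` and we can follow the logs with `journalctl` (the unit name *my-web.service* below is just a placeholder for your own service):

```shell-session
$ journalctl -u my-web.service --since today -f
```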
But we won't take advantage of systemd if we run our application in a Docker container, simply because a Docker container cannot run systemd. Docker has some features close to systemd's, but they are not as rich and precise.

To use systemd, we need to create a *.service* file like this:

```systemd
[Unit]
Description=Our web backend
After=redis-server.service postgresql.service

[Service]
User=dodo
Group=dodo
Type=simple
WorkingDirectory=/opt/ProjectName/project-name
# Create directory /run/project-name and set appropriate permission
RuntimeDirectory=project-name
ExecStart=/opt/ProjectName/venv/bin/gunicorn project.wsgi -b unix:/run/project-name/web.sock
TimeoutStopSec=20
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

In this service file, you can see:

- The application runs as a normal user. Don't run it as a powerful user, otherwise if your application is compromised, the attacker can use it to do more damage to the system.
- Our application does not listen on a TCP port (like 8000), but on a Unix domain socket, via the file */run/project-name/web.sock*. Why not a numeric port? Because if we have many projects, we cannot remember which port belongs to which project. A named, text-based path is easier to manage.
- When we use a Unix domain socket, it is important not to forget `RuntimeDirectory`. It tells systemd to create a directory where our application can create the socket file, and systemd will delete that directory after our application stops.

This *.service* file should be copied to */usr/local/lib/systemd/system*. Some articles tell you to put the file in */etc/systemd*. Don't do that, because sometimes we don't want our application to be auto-started (like when it has bugs that need to be fixed before it can serve users again); we then enable or disable the auto-start with these commands:

```shell-session
$ sudo systemctl disable my-app
$ sudo systemctl enable my-app
```

When our app listens on a Unix domain socket, the Nginx configuration looks like this:

```nginx
location / {
    include proxy_params;
    proxy_pass http://unix:/run/project-name/web.sock;
}
```

We can take the Unix domain socket idea further by connecting to PostgreSQL via a Unix domain socket only. By doing so, we can stop PostgreSQL from listening to the outside world, reducing the chance of being attacked. My other article with this practice is [here](https://quan.hoabinh.vn/post/2015/10/truy-cap-nhanh-giao-dien-dong-lenh-cua-postgresql).
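As a quick check that the socket-only setup works, we can connect with `psql` through the socket directory (on Debian / Ubuntu the default is */var/run/postgresql*; the database name and user below are placeholders):

```shell-session
$ psql "host=/var/run/postgresql dbname=projectname user=dodo"
```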
For tasks which need to run periodically, we should implement them with systemd timers instead of cron jobs. Timers have these benefits over cron:

- They are controlled by systemd like other services, meaning they are secure and integrated with `journald` for logging.
- We can temporarily disable a task, with

  ```
  sudo systemctl stop my-task.timer
  ```

- We can trigger the task manually at any time, outside the schedule we defined, with

  ```
  sudo systemctl start my-task.service
  ```

- We can see when our task ran last, and when it will run next, with

  ```
  sudo systemctl list-timers
  ```

Below is an example from one of my projects:

![systemd-timer](https://quan-images.b-cdn.net/blogs/imgur/2026/lSBixXq.png)

## 3. Ansible script

Ansible is easy to install. Just do:

```shell-session
$ sudo apt install ansible
```

Ansible is so powerful that its documentation is big and hard to get started with. To keep things simple, we need at least two files:

- An [inventory](https://docs.ansible.com/ansible/latest/inventory_guide/intro_inventory.html) file, let's name it *inventory.yml*, to list the servers we will deploy the app to.
- A [playbook](https://docs.ansible.com/ansible/latest/playbook_guide/index.html) file, let's name it *playbook.yml*, to describe the steps Ansible needs to perform to deploy our app.

In a more complex setup, the playbook can be split into many files and subfolders, and so can the inventory.

Note that you must set up your server beforehand so that you can SSH in with public keys, not a password.

### Inventory

The *inventory.yml* looks like this:

```yml
prod:
  hosts:
    prod.agriconnect.vn:
      ansible_user: dodo
staging:
  hosts:
    staging.agriconnect.vn:
      ansible_user: dodo
```

In this inventory, we have two groups: `prod` for production servers and `staging` for staging servers. If you don't have staging servers, just delete the `staging` group. Each group must have a `hosts` field to list its servers. To identify a server, you can use a domain name or an IP address. We also specify `ansible_user`, which is the Linux user on the server (not on our local PC) that we usually SSH in as (it can be the same user that our web application runs under).

### Playbook

*Note: The following playbook is based on a very old Ansible API, because I've been using Ansible since 2016 and haven't kept up with the changes. I will try to update this script later.*

The *playbook.yml* file looks like this:

```yml
---
- hosts: '{{ target|default("staging") }}'
  remote_user: dodo
  # This is needed to make ansible_env work
  gather_facts: true
  gather_subset:
    - '!all'
  vars:
    target: staging
  tasks:
    - name: Say hello
      ansible.builtin.command: echo Hello
  environment:
    VIRTUAL_ENV: '/opt/ProjectName/venv'
```

With the `hosts:` parameter, we choose which group of servers from *inventory.yml* to run this playbook on. If we had only one group, we could use a fixed value here. But because we have two, we use [Jinja](https://jinja.palletsprojects.com/en/stable/templates/) code to produce a dynamic value. It depends on the `target` variable, which we declare in the `vars` section and whose value we pass from the command line when running Ansible.
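Before adding real tasks, it is worth checking that Ansible can actually reach the servers listed in the inventory, with the built-in `ping` module:

```shell-session
$ ansible all -i inventory.yml -m ping
```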
Later, when we want to deploy to the prod servers, we run:

```shell-session
$ ansible-playbook -i inventory.yml playbook.yml -e "target=prod ansible_become_pass=$REMOTE_USER_PASS"
```

and when we want to deploy to the staging servers, we run:

```shell-session
$ ansible-playbook -i inventory.yml playbook.yml -e "target=staging ansible_become_pass=$REMOTE_USER_PASS"
```

The commands which Ansible executes on the server need some information, like file and directory paths, so let's define them as variables to keep the commands short:

```yml
vars:
  target: staging
  base_folder: /opt/ProjectName
  webapp_folder: '{{ base_folder }}/project-name'
  bin_folder: '{{ base_folder }}/venv/bin/'
```

The `tasks` section then is:

```yml
tasks:
  - name: Clean source
    ansible.builtin.command: git reset --hard
    args:
      chdir: '{{ webapp_folder }}'

  - name: Update source
    ansible.builtin.git:
      repo: 'git@gitlab.com:our-company/project-name.git'
      dest: '{{ webapp_folder }}'
      version: "{{ lookup('env', 'CI_COMMIT_REF_NAME')|default('develop', true) }}"
    register: git_out

  - name: Get changed files
    ansible.builtin.command: git diff --name-only {{ git_out.before }}..{{ git_out.after }}
    args:
      chdir: '{{ webapp_folder }}'
    register: changed_files
    when: git_out.changed

  - name: Stop ProjectName services
    ansible.builtin.systemd: name='{{ item }}' state=stopped
    loop:
      - my-web.service
      - my-ws-server.service
      - my-asynctask.service
    become: true
    become_method: sudo
    when:
      - git_out.changed
      - changed_files.stdout is search('.py|.po|.lock|.toml')

  - name: Install python libs
    ansible.builtin.command: poetry install --no-root --only main -E systemd
    args:
      chdir: '{{ webapp_folder }}'
    when: git_out.changed and changed_files.stdout is search('poetry|pyproject')

  - name: Migrate database
    ansible.builtin.command: '{{ bin_folder }}python3 manage.py migrate --no-input'
    args:
      chdir: '{{ webapp_folder }}'
    when:
      - git_out.changed
      - changed_files.stdout is search('poetry|pyproject|models|migrations|settings')

  - name: Compile translation
    ansible.builtin.command: '{{ bin_folder }}python3 manage.py compilemessages'
    args:
      chdir: '{{ webapp_folder }}'
    when:
      - git_out.changed
      - changed_files.stdout is search('locale')

  - name: Collect static
    ansible.builtin.command: '{{ bin_folder }}python3 manage.py collectstatic --no-input'
    args:
      chdir: '{{ webapp_folder }}'

  - name: Start ProjectName services
    ansible.builtin.systemd: name='{{ item }}' state=started
    loop:
      - my-ws-server.service
      - my-web.service
      - my-asynctask.service
    become: true
    become_method: sudo
    when: git_out.changed
```

In the first step ("Clean source"), we reset any dirty changes in our source code folder, to prevent Git failures which may arise in the next step.

In the second step, we pull the code from the Git hosting service, at the version we define for this deployment. In my projects, I often use the `main` branch for stable code, to deploy to production, and the `develop` branch for staging. You can put a fixed branch name at `version`, like `version: main`, if you have a simpler setup. In my case, Ansible is triggered by a Git push, and I want to pull the code of the exact Git revision which triggered the deployment. GitLab CI gives this info via `CI_COMMIT_REF_NAME`, so I use `lookup('env', 'CI_COMMIT_REF_NAME')` to retrieve it. The `default('develop', true)` part is a fallback to the `develop` branch when we run Ansible manually from the command line (not via a Git push). We use the `register` parameter to save the Git result; it is needed by the next steps.
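Since the deployment is triggered from GitLab CI, here is a rough sketch of what the CI job can look like. The job name, image and branch rule are only assumptions about your setup, `REMOTE_USER_PASS` would come from a masked CI/CD variable, and the SSH key setup for the runner is omitted:

```yml
# .gitlab-ci.yml (sketch)
deploy-staging:
  stage: deploy
  image: python:3.12-slim
  before_script:
    - pip install ansible
    # Assumption: the runner already has an SSH key that the server accepts.
  script:
    - ansible-playbook -i inventory.yml playbook.yml -e "target=staging ansible_become_pass=$REMOTE_USER_PASS"
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"
```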
In the third step, we check which files have changed since the last deployment. Later, we rely on this info to decide which commands to skip.

In step 4, we stop our application services, which correspond to the systemd *.service* files. This step demonstrates the use of `loop` to act on many objects. Without it, we would have to define a step for each service, which makes the playbook long. One more thing to note: because the `systemctl start / stop` actions need to be run with `sudo`, we use the `become` and `become_method` parameters to tell Ansible to do `sudo`. Here, we also use `when` to define the condition under which this step is executed. If the new code only changes some JS or CSS files, we don't need to stop the services.

The remaining steps should be easy to understand from the explanation of the first four.

The last section of the playbook sets some environment variables which the steps above need:

```yaml
environment:
  # Modify PATH so that poetry can be found.
  PATH: '{{ base_folder }}/venv/bin:{{ ansible_env.PATH }}:{{ ansible_env.HOME }}/.local/bin'
  # Tell poetry to use our virtual env folder
  VIRTUAL_ENV: '{{ base_folder }}/venv'
```

Previously, on some server configurations, `PATH` did not include *~/.local/bin* in the session Ansible logged in to, so Ansible failed to run `poetry` in the fifth step. I'm not sure whether that is still an issue.

We won't be able to write a correct playbook on the first try, so please set up a virtual machine with [VirtualBox](https://www.virtualbox.org/) to test and fix the playbook. When testing with a virtual machine, we don't have a Git push event, so we have to run Ansible directly from the CLI. In the commands given above, `REMOTE_USER_PASS` is an environment variable containing the password of the user on the server. You can set it with

```sh
export REMOTE_USER_PASS=mypassword
```

before running Ansible.

Here is a screenshot of Ansible in action, from one of my projects:

![Ansible in action](https://quan-images.b-cdn.net/blogs/imgur/2026/nvzu30j.png)

And here is how it runs as part of a GitLab pipeline:

![GitLab Pipeline](https://quan-images.b-cdn.net/blogs/imgur/2026/rEuSX9Ll.png)

So, that was a short guide on how to take advantage of Ansible and systemd to deploy a Django application. I hope it makes your life easier.