9/25 Migrations

Commit Hash: 9c94f6f

In this lab, we’ll be creating a migrator service for Jarvis. Since our backend microservices all share a single database, this structure makes the most sense for us. Additionally, when we start using Kubernetes, having migrations split off as a separate service will be much more convenient. We’ll be using alembic for migrations, but the concepts are similar across the many libraries that exist. We’ll mainly be adapting the tutorial from the alembic documentation.

The general setup we’ll be aiming for is a migrator service that targets our database running in Docker compose (the Postgres database). We’ll learn how to configure our migrator service, how to create migrations, and how to run migrations.

Alembic Setup

Let’s get started by creating a folder backend/migrator/ and running uv init --name jarvis-migrator within that folder. You can delete the README.md and main.py; we won’t be needing them. Our migrator service will be responsible for running migrations for all tables. First, let’s add the dependencies we need: alembic, along with sqlalchemy and psycopg2 (the database driver). Let’s also add the requirements we need for configs, since our migrator service will need at least one config to determine which database URL to point to. Edit your backend/migrator/pyproject.toml to:

[project]
name = "jarvis-migrator"
version = "0.1.0"
description = "Database migration service for Jarvis"
requires-python = ">=3.10"
dependencies = [
    "sqlalchemy>=2.0.0",
    "alembic>=1.16.4",
    "psycopg2-binary>=2.9.0",
    "pydantic-settings>=2.0.0",
    "python-dotenv>=1.0.0",
    "shared",
]

[tool.uv.sources]
shared = { path = "../utilities" }

Now, let’s get started with alembic! From backend/migrator/, run uv run alembic init migrations. This will create the main configuration file alembic.ini in backend/migrator/ and initialize an alembic setup within a folder called migrations. There are a couple of files of note:

| File | Description |
|------|-------------|
| alembic.ini | Main Alembic configuration file. Sets up database connection, script location, and other settings. Lives in backend/migrator/, next to the migrations folder. |
| migrations/versions/ | Directory where individual migration scripts (versioned migrations) are stored. Each file represents a migration step. |
| migrations/env.py | Main Alembic environment file. Handles configuration and sets up the migration context. |
| migrations/script.py.mako | Template file used to generate new migration scripts with alembic revision. |

If you’re curious, take some time to skim through these to see what you can potentially configure. There’s even more you can do, which you can find in the documentation. For our use case, the default setup is good enough and we’ll only need to tweak a few things.

Alembic Configuration

First, let’s add a configuration file that is similar to what we have for the auth and notes service. Create a file backend/migrator/config.py:

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class MigratorSettings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env.local",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore"
    )

    # Database settings
    database_url: str = Field(
        description="Database URL for SQLAlchemy"
    )

settings = MigratorSettings()

Note that this does not inherit from our base settings class since we only really care about a single setting: the database URL. With the config defined, let’s now add the backend/migrator/.env.local that points to our Docker compose database:

DATABASE_URL=postgresql://jarvis:password@localhost:5432/jarvis
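If you ever need to double-check the pieces of this URL (user, password, host, port, database name), you can parse it with Python’s standard library. A quick stdlib-only sanity check, where the hardcoded URL simply mirrors the .env.local value above:

```python
from urllib.parse import urlparse

# Mirrors the DATABASE_URL in backend/migrator/.env.local
url = "postgresql://jarvis:password@localhost:5432/jarvis"
parts = urlparse(url)

print(parts.username)          # jarvis
print(parts.password)          # password
print(parts.hostname)          # localhost
print(parts.port)              # 5432
print(parts.path.lstrip("/"))  # jarvis (the database name)
```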

Now, we’ll need to configure alembic to use this database config and register the tables we care about. In env.py, add the following after the config = context.config line:

# Import shared database utilities
from shared.database import Base

# Import all models from all services
# HACK: Add the parent directory to sys.path so we can import from services
# This probably isn't something we should do in industry tbh
import sys, os
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
from services.auth.app.models.user import User
from services.notes.app.models.notes import Note

# Import settings for database url
from config import settings
if settings.database_url:
    config.set_main_option('sqlalchemy.url', settings.database_url)

Finally, set target_metadata to point to the Base we use. This variable is already defined in env.py (as target_metadata = None); override it:

# Import shared database utilities
from shared.database import Base

target_metadata = Base.metadata

Once you’ve finished the above, we’re done with setup! We can now start by creating our first migration.

Creating Migrations

As mentioned in class, versioned migrations are state transitions from one version of a database schema to another. Currently, our database already has some data in it. Ideally, we’d like the first migration we create to assume that the database is starting from a completely empty state. Let’s nuke our current database, which we can do by removing the volume created by Docker Compose. Run docker volume ls to see what volumes you have; there should be one named something like jarvis-monorepo_db_data. We’ll want to remove this volume, which you can do by running

docker volume rm jarvis-monorepo_db_data

However, there’s still something additional we need to do. Currently, we have a hacky database setup where each service creates its tables if they don’t exist. This worked fine in dev, but becomes unwieldy as we make changes to our schema (e.g. we need to manually delete the database each time, losing dev data). Within the main.py of both auth and notes, you’ll see the lines:

@app.on_event("startup")
async def startup_event():
    """Create database tables on startup"""
    create_tables()

We need to delete these lines of code and the import from app.models.db import create_tables. We can also delete the create_tables method in app/models/db.py in both notes and auth since we no longer need it.

Great, now that we’ve removed the docker volume and the lines of code that autocreate tables, we can proceed to create our first migration. Spin up your docker compose environment:

docker compose up --build

Once that’s spun up, you can verify that your database has indeed been nuked by running a postgres shell on the container. From the root folder of the Jarvis monorepo:

docker-compose -f compose.yaml exec db psql -U jarvis -d jarvis

And once you’ve opened a postgres shell, you can run \d, which should output something like:

jarvis=# \d
Did not find any relations.

Now, we can finally generate our first migration. From within the backend/migrator/ folder, run

uv run alembic revision --autogenerate -m "Initial migration"

You should see a new file generated in backend/migrator/migrations/versions/811794f6c62e_initial_migration.py (the hash in front is likely different). Take a look at this file; you’ll see there are upgrade() and downgrade() functions defined. In upgrade(), alembic has generated the migration commands it determined were needed to transition the current state of the database (empty) to the schema defined in our codebase, e.g. the User model in backend/services/auth/app/models/user.py. Nice, right? downgrade() is the inverse, and can be used for rollbacks if necessary. Additionally, note that there’s a down_revision variable defined. This is how alembic determines the ordering of migrations.
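The ordering mechanism is simple enough to model in a few lines: each revision points at its parent via down_revision, and alembic recovers the full history by walking those pointers back to the base (whose down_revision is None). Here’s a toy stdlib sketch of that idea; the second hash is a hypothetical later migration, not one we’ve actually created:

```python
# Each revision maps to its down_revision (its parent); None marks the base.
revisions = {
    "811794f6c62e": None,            # initial migration
    "3f2a1b9c8d7e": "811794f6c62e",  # hypothetical follow-up migration
}

def history(head):
    """Return the revision chain from base to head by following down_revision."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        rev = revisions[rev]
    return list(reversed(chain))

print(history("3f2a1b9c8d7e"))  # ['811794f6c62e', '3f2a1b9c8d7e']
```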

However, we still haven’t actually applied the migration. Let’s apply the migration to our database running in docker compose. From within the backend/migrator/ folder:

uv run alembic upgrade head

Now, you can open a postgres shell again to the docker compose database and check the schema. You’ll see something like:

psql (15.13)
Type "help" for help.

jarvis=# \d
              List of relations
 Schema |      Name       |   Type   | Owner
--------+-----------------+----------+--------
 public | alembic_version | table    | jarvis
 public | notes           | table    | jarvis
 public | notes_id_seq    | sequence | jarvis
 public | users           | table    | jarvis
 public | users_id_seq    | sequence | jarvis
(5 rows)

There are a few things to note here. First, the syntax we used to apply the migration is heavily inspired by git. In fact, you can change the state of your database to any “revision” that you have. Also, note that alembic created this alembic_version table for us. It uses the database itself to keep track of which version of the schema has been applied, making it easy to determine which migrations still need to run.
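If you’re curious, you can peek at what alembic recorded directly from the postgres shell. The alembic_version table holds a single row containing the revision hash of the most recently applied migration:

```sql
-- Run inside the psql shell; version_num matches the hash prefix of the
-- latest applied file in migrations/versions/
SELECT version_num FROM alembic_version;
```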

Final Notes

We added a migrator service to Jarvis that uses alembic to automatically create migrations. It has a configuration so that we can point it at different databases in the future. Down the line, we’ll need to create a fairly messy Dockerfile for this service since it needs to import models from all other services. However, the migrator will be very useful in the near future when we add the chat service!


How are migrations done in practice?

2025-09-25