Taming the Multi-Repo Lion

Aug 6, 2024

Elia Karagiannis

Engineer

Mono or multi repo? Is there even a correct answer? Both have pros and cons. But the reality is that both multi and mono repos require tooling to make teams effective with them.

This blog post will not go down the rabbit hole of which is better; there are plenty of blog posts and YouTube videos on this topic (watch this video as an example). Instead, we’ll explain how Metronome, which uses multiple repositories (multi repos) for day-to-day development, uses tooling to keep the development environment efficient.

Metronome’s services

For the purpose of this post we will only look at a slice of the services that make up Metronome. We will also look at simplified deployments of these services so they can run efficiently locally.

Databases

Metronome uses multiple Postgres databases to service the product. For most local development, only the metadata database is required. This Postgres database stores our clients and all their pricing and packaging metadata. It also stores an ever-growing list of customers and invoices.

We also have databases to store our usage data, but starting this locally is out of scope for this post.

Data Model services

Metronome uses a federated GraphQL service layer to control all access to our data. All of our external surfaces (frontend, API) and internal async processing (Kafka / SQS queue handlers) talk to the underlying GraphQL service.

We have four core data model services:

Hasura: A GraphQL ORM on top of our metadata database
Invoice service: Controls everything related to invoice generation, from retrieving data at generation time to retrieving and storing invoice and pricing metadata
Client service: Controls access to clients, customers, and integrations
GraphQL gateway: A GraphQL federation service which composes a GraphQL supergraph of the above services

UI services

Our clients mainly interact with Metronome through APIs, but we also have a rich user interface. Metronome has a React frontend, which powers user flows. Metronome also has a documentation frontend, which you can see at https://docs.metronome.com.

Async processing

Most of Metronome’s processing happens asynchronously. For example, the majority of our clients use end-of-month billing, and Metronome uses distributed queues and serverless functions to process these invoices asynchronously.

How does a developer work with this complexity?

As Metronome grew, the initial few services expanded to support the new load. While there were only a few of them, a simple 'clone this repo', 'npm install', 'npm start' was enough. Dotenv files differentiated environment variables so we could override local behavior, and docker compose pulled up local databases.

But as more Metrognomes (employees) joined and more services were created, more and more 'npm start's were required. More terminal windows were required to start these new services. Sure, tmux and terminals with multiple tabs helped, but it was still a chore to start everything up.

Engineers started creating their own zsh scripts to make the local development process easier. Scripts to pull down repos, start services, update them. Scripts to call out to docker compose to ensure the database was up and ready.

Tilt to the rescue

Tilt is a development tool that simplifies multi-service local development. Instead of relying on custom scripts to orchestrate and run multiple services, Tilt lets you declare how each service starts in a configuration file called a 'Tiltfile'.

Our main Tilt configuration lives in a 'local-development' repository. The 'Tiltfile' in that repository can perform tasks (like cloning required repos) and import other repos’ 'Tiltfile's.

A 'Tiltfile' can run a local script (like populating a database) or a local service (by running local commands like 'npm install && npm run dev').

The main Tilt configuration can also start only a subset of services. For example, if you want to do frontend development locally but use a staging backend, run 'tilt up frontend-staging-be'. If you want to run everything locally, including databases, GraphQL, and API layers, run 'tilt up frontend-full-stack'.

Powerful features of Tilt

Docker compose built in

Tilt works great with existing docker-compose files. We already had a docker-compose.yml file for our local databases. To enable these with Tilt, we just needed to call the 'docker_compose' Tiltfile function. Just like that, our databases are started and show up as services in the Tilt UI. More importantly, other Tilt services can take a dependency on the local databases being up and ready.

Here’s a simple example of our Tiltfile for our databases:

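# Tip: run 'docker-compose pull' to get the latest images.
# Register the services defined in the compose file with Tilt.
docker_compose('../docker-compose.yml')

# Group the metadata database under the 'databases' label in the Tilt UI.
dc_resource(
  name='postgres-metadata',
  labels=['databases']
)

# Run the health check script after the database container has started.
# Other resources can depend on this check instead of on the container itself.
local_resource(
  name='postgres-metadata-healthcheck',
  labels=['databases'],
  resource_deps=['postgres-metadata'],
  auto_init=True,
  allow_parallel=True,
  cmd='./.dev/pg_db_healthcheck.sh postgres-metadata',
  dir='../'
)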
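
The contents of 'pg_db_healthcheck.sh' are not shown in this post. As a rough illustration of what such a check might do, the Python sketch below polls a TCP port until it accepts connections or a timeout expires. The function name, the default values, and the choice of a plain TCP check are assumptions made for illustration; a real Postgres check would more likely run a query or use a tool like pg_isready.

```python
import socket
import time


def wait_for_port(host, port, timeout_secs=30.0, interval_secs=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_secs
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_secs):
                return True
        except OSError:
            # Refused or timed out: give up after the deadline, else retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_secs)
```

A wrapper script would exit with status 0 when this returns True and with a non-zero status otherwise, so that the health check resource fails visibly if the database never comes up.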

Local visibility

Tilt gives log visibility into all started services. Each service has an entry in the sidebar, and clicking on a service shows you the service logs.

Also, you can restart a service by clicking the restart icon next to the service.

Auto refresh with Tilt

Tilt has built-in file system watching and can take actions based on it. For example, if you change a package.json file, Tilt will run 'npm install' automatically and restart the service.

Example Tiltfile:

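# Re-run 'npm install' whenever package.json changes.
local_resource(
  'frontend-npm-install',
  dir='../',
  cmd='npm install',
  deps=['../package.json'],
  allow_parallel=True,
  labels=['frontend']
)

# Start the frontend once dependencies are installed, and restart it when the
# watched files change. frontend_port and probe_path are defined elsewhere
# (not shown).
local_resource(
  'frontend',
  serve_dir='../',
  deps=['../.env.development', '../webpack.config.js'],
  resource_deps=['frontend-npm-install'],
  serve_cmd='npm start',
  allow_parallel=True,
  links=[
    'http://localhost:%s/' % frontend_port
  ],
  # Run the probe script every 5 seconds to check whether the frontend is up.
  readiness_probe=probe(
    period_secs=5,
    exec=exec_action([probe_path, frontend_port])
  ),
  labels=['frontend']
)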

Tilt also supports a 'readiness_probe', which tells Tilt when a resource is actually ready and so controls when dependent services start. For example, a service can take a dependency on the databases being up and ready.
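
The script behind 'probe_path' in the example above is not shown. An exec probe runs a command and treats exit status 0 as ready, so a minimal Python sketch of such a script might look like the following. The names 'is_ready' and 'main' and the choice of an HTTP request are assumptions for illustration, not the script we actually use.

```python
import urllib.error
import urllib.request


def is_ready(port, timeout_secs=2.0):
    """Return True if an HTTP server answers on localhost:port."""
    url = "http://localhost:%s/" % port
    try:
        with urllib.request.urlopen(url, timeout=timeout_secs):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so it is up.
        return True
    except OSError:
        # Connection refused, timeout, or similar: not ready yet.
        return False


def main(argv):
    """Return the probe's exit status: 0 when ready, 1 when not."""
    return 0 if is_ready(argv[1]) else 1
```

A script file would finish with 'sys.exit(main(sys.argv))' and receive the port as its first argument, matching the '[probe_path, frontend_port]' command in the Tiltfile.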

Conditionally starting services

Sometimes we want to run the code locally but tunnel it to our staging backend. Just as we have 'tilt up frontend-full-stack', we have configured Tilt so that 'tilt up frontend-full-stack --backend=staging' skips starting the local databases and creates a tunnel to our staging databases instead. Tilt also allows overriding environment variables, so the services use the tunnel instead of the local databases running in docker.
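
The decision described above can be modeled in a few lines of Python. This is only a sketch of the logic, not our Tiltfile and not Tilt’s API: the 'staging-db-tunnel' resource, the 'DATABASE_URL' variable, and both URLs are hypothetical.

```python
# Hypothetical connection strings: one for the database running in docker,
# one for the local end of a tunnel to staging.
LOCAL_DB_URL = "postgres://localhost:5432/metadata"
TUNNEL_DB_URL = "postgres://localhost:6543/metadata"


def plan_startup(backend="local"):
    """Return (resources to start, environment overrides) for a backend."""
    if backend == "local":
        # Full stack: start the local database and its health check.
        resources = ["postgres-metadata", "postgres-metadata-healthcheck", "frontend"]
        env = {"DATABASE_URL": LOCAL_DB_URL}
    elif backend == "staging":
        # Skip the local database and open a tunnel to staging instead.
        resources = ["staging-db-tunnel", "frontend"]
        env = {"DATABASE_URL": TUNNEL_DB_URL}
    else:
        raise ValueError("unknown backend: %r" % backend)
    return resources, env
```

The services themselves do not change between the two modes; only the set of started resources and the environment they receive differ.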

Local data initialization

With Tilt, we have created local resources that initialize our local databases. These scripts can make direct database calls or, preferably, call our local GraphQL resolvers to initialize data.

Because Tilt has a service dependency graph, the initialization code only runs once required databases and services are up and running.
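
To illustrate the idea of a dependency graph determining start order, here is a small Python sketch using the standard library’s 'graphlib' module. It is a model of the concept rather than Tilt’s scheduler. The first two resource names come from the database Tiltfile shown earlier; 'graphql-gateway' and 'seed-local-data' are hypothetical names.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
DEPS = {
    "postgres-metadata": set(),
    "postgres-metadata-healthcheck": {"postgres-metadata"},
    "graphql-gateway": {"postgres-metadata-healthcheck"},
    "seed-local-data": {"postgres-metadata-healthcheck", "graphql-gateway"},
}


def startup_order(graph):
    """Return the resources in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


print(startup_order(DEPS))
```

In this graph the data seeding step always comes last, because it depends on both the database health check and the gateway.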