Taming the multi repo lion

Aug 6, 2024
Elia Karagiannis
Engineer

Mono repo or multi repo? Is there even a correct answer? Both have pros and cons. But the reality is that both mono and multi repos require tooling to make teams effective.

This blog post will not go down the rabbit hole of which is better — there are plenty of blog posts and YouTube videos on this topic (watch this video as an example). Instead, we’ll explain how Metronome, which uses multiple repositories (multi repos) in our day-to-day development, uses tooling to ensure an efficient development environment.

Metronome’s services

For the purposes of this post, we will only look at a slice of the services that make up Metronome. We will also look at simplified deployments of these services so they can run efficiently locally.

Databases

Metronome uses multiple Postgres databases to service the product. For most local development, only the metadata database is required. This Postgres database stores our clients and all their pricing and packaging metadata. It also stores an ever-growing list of customers and invoices.

We also have databases to store our usage data, but starting this locally is out of scope for this post.

Data Model services

Metronome uses a federated GraphQL service layer to control all access to our data. All of our external surfaces (frontend, API) and internal async processing (Kafka / SQS queue handlers) talk to the underlying GraphQL service.

We have four core data model services:

  • Hasura: A GraphQL ORM on top of our metadata database
  • Invoice service: Controls everything related to invoice generation, from retrieving data at generation time to retrieving and storing invoice and pricing metadata
  • Client service: Controls access to clients, customers, and integrations
  • GraphQL gateway: A GraphQL federation service which composes a GraphQL supergraph of the above services

UI services

Our clients mainly interact with Metronome through APIs, but we also have a rich user interface. Metronome has a React frontend, which powers user flows. Metronome also has a documentation frontend, which you can see at https://docs.metronome.com.

Async processing

The majority of Metronome’s processing happens asynchronously. For example, most of our clients use end-of-month billing, and Metronome uses distributed queues and serverless functions to process these invoices asynchronously.

How does a developer work with this complexity?

As Metronome grew, the initial few services expanded to support the new load. Early on, a simple 'clone this repo', 'npm install', 'npm start' was enough. Dotenv files differentiated environment variables to override local behavior, and docker compose pulled up local databases.
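As a hedged illustration, a minimal docker-compose.yml for the metadata database might look like the sketch below — the image tag, credentials, and service name are assumptions for this post, not Metronome’s actual configuration:

```yaml
# Hypothetical sketch of a local docker-compose.yml for the metadata database.
services:
  postgres-metadata:
    image: postgres:15
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: metadata
    ports:
      - "5432:5432"
    volumes:
      # Persist data across restarts so local state survives 'docker compose down'.
      - metadata-data:/var/lib/postgresql/data
volumes:
  metadata-data:
```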

But as more Metrognomes (employees) joined and more services were created, more and more 'npm start's were required, along with more terminal windows to run them. Sure, tmux and terminals with multiple tabs helped, but starting everything up was still a chore.

Engineers started creating their own zsh scripts to make the local development process easier: scripts to pull down repos, start services, and update them; scripts to call out to docker compose to ensure the database was up and ready.

Tilt to the rescue

Tilt is a development tool that simplifies multi-service local development. Instead of each engineer maintaining custom scripts to orchestrate and run multiple services, Tilt lets you define, in one place, how each service is started and kept up to date.

Our main Tilt configuration lives in a 'local-development' repository. This 'Tiltfile' can perform tasks (like cloning required repos) and imports other repos’ 'Tiltfile's.
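As a rough sketch of what such an entry-point 'Tiltfile' could look like — the repo names, org, and paths here are illustrative assumptions, not Metronome’s actual layout:

```python
# Tiltfile (Starlark) — hypothetical 'local-development' entry point.

repos = ['frontend', 'client-service', 'invoice-service']

for repo in repos:
    # Clone each repo on first run; a real setup would also handle updates.
    if not os.path.exists('../' + repo):
        local('git clone git@github.com:example-org/%s.git ../%s' % (repo, repo))

for repo in repos:
    # Pull in each repo's own Tiltfile so its resources appear in one Tilt UI.
    include('../%s/Tiltfile' % repo)
```

The 'local' and 'include' built-ins are standard Tilt functions; 'include' is what lets one Tiltfile compose the resources of many repositories.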

A 'Tiltfile' can run a local script (like populating a database) or a local service (by running local commands like 'npm install && npm run dev' ).

The main Tilt configuration can also optionally start services. For example, if you want to do frontend development locally but use a staging backend, run 'tilt up frontend-staging-be'. If you want to run everything locally, including databases, GraphQL, and API layers, run 'tilt up frontend-full-stack'.

Powerful features of Tilt

Docker compose built in

Tilt works great with existing docker-compose files. We already had a docker-compose.yml file for our local databases. To enable these with Tilt, we just needed to call the 'docker_compose' Tilt function. Just like that, our databases are started and show up as services in the Tilt UI. More importantly, other Tilt services can take a dependency on the local databases being up and ready.

Here’s a simple example of our Tiltfile for our databases:

# load our existing docker-compose file so Tilt manages its services
docker_compose("../docker-compose.yml")

dc_resource(
  name='postgres-metadata',
  labels=['databases']
)

local_resource(
  name='postgres-metadata-healthcheck',
  labels=['databases'],
  resource_deps=['postgres-metadata'],
  auto_init=True,
  allow_parallel=True,
  cmd='./.dev/pg_db_healthcheck.sh postgres-metadata',
  dir='../'
)
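The healthcheck script itself isn’t shown in the post; a minimal sketch of what './.dev/pg_db_healthcheck.sh' might do — assuming the database runs under docker compose and the image ships 'pg_isready' — could be:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of .dev/pg_db_healthcheck.sh (the real script is not shown).
set -euo pipefail
SERVICE="${1:?usage: pg_db_healthcheck.sh <compose-service>}"

# Exit non-zero until Postgres inside the compose service accepts connections;
# Tilt keeps the resource unhealthy (and dependents blocked) until this succeeds.
docker compose exec -T "$SERVICE" pg_isready -U postgres
```

Because the 'local_resource' above declares 'resource_deps=['postgres-metadata']', this check only runs once the container itself has started.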

Local visibility

Tilt gives log visibility to all started services:

Each service has an entry in the sidebar, and clicking on a service shows you the service logs.

Also, you can restart a service by clicking the restart icon next to the service.

Auto refresh with Tilt

Tilt has built-in file system watching and can take actions based on changes. For example, if you change a package.json file, Tilt will run 'npm install' automatically and restart the service.

Example Tiltfile:

local_resource(
  'frontend-npm-install',
  dir="../",
  cmd='npm install',
  deps=['../package.json'],
  allow_parallel=True,
  labels=['frontend']
)

local_resource(
  'frontend',
  serve_dir="../",
  deps=['../.env.development', '../webpack.config.js'],
  resource_deps=['frontend-npm-install'],
  serve_cmd='npm start',
  allow_parallel=True,
  links=[
    'http://localhost:%s/' % frontend_port
  ],
  readiness_probe=probe(
    period_secs=5,
    exec=exec_action([probe_path, frontend_port])
  ),
  labels=['frontend']
)

Tilt also supports a 'readiness_probe', which can be used to control dependent services. For example, a service can take a dependency on the databases being up and ready.

Conditionally starting services

Sometimes we want to run the code locally but tunnel it to our staging backend. Just like we have 'tilt up frontend-full-stack', we have configured Tilt so that 'tilt up frontend-full-stack --backend=staging' skips starting local databases and instead creates a tunnel to our staging databases. Tilt also allows overriding environment variables so the services use the tunnel instead of the local databases running in docker.
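One way to wire up such a flag uses Tilt’s config API ('config.define_string' and 'config.parse' are standard Tilt built-ins); the tunnel script name and ports below are illustrative assumptions:

```python
# Tiltfile (Starlark) sketch — choose a backend via 'tilt up ... -- --backend=staging'.
config.define_string('backend')
cfg = config.parse()
backend = cfg.get('backend', 'local')

if backend == 'staging':
    # Skip local databases; keep a tunnel to staging open instead
    # (the tunnel script is hypothetical).
    local_resource('staging-db-tunnel', serve_cmd='./.dev/staging_tunnel.sh', labels=['databases'])
    db_env = {'DATABASE_URL': 'postgres://localhost:6543/metadata'}
else:
    docker_compose('../docker-compose.yml')
    db_env = {'DATABASE_URL': 'postgres://localhost:5432/metadata'}
```

Downstream services can then be started with 'db_env' as their environment so the same resources work against either backend.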

Local data initialization

With Tilt, we have created local resources that initialize our local databases. These scripts can make direct database calls or, preferably, call our local GraphQL resolvers to initialize data.

Because Tilt has a service dependency graph, the initialization code only runs once required databases and services are up and running.
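A seed step like this can be expressed as one more 'local_resource' hanging off the dependency graph — the resource names and script path here are hypothetical:

```python
# Tiltfile (Starlark) sketch — hypothetical data-seeding resource.
local_resource(
  name='seed-local-data',
  labels=['databases'],
  # Only run once the database is healthy and the GraphQL gateway is serving;
  # 'graphql-gateway' is an assumed resource name for illustration.
  resource_deps=['postgres-metadata-healthcheck', 'graphql-gateway'],
  cmd='./.dev/seed_local_data.sh',
  dir='../',
  allow_parallel=True
)
```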

