Las Vegas 2018

The Nth Region Project: An Open Retrospective

For the past year, a small team of engineers and I have had one job: allow New Relic to run an independent European region for data sovereignty reasons. That means taking around 500 services written by around 50 teams that have historically been assumed to run in just one deployment and changing them to work anywhere. And, at the end of the process, we needed to be able to spin up new regions quickly and sustainably operate them with our existing staff.


The talk will be in two parts, because a project like this isn't purely technical or organizational. We needed to choose technical changes that turned building out a new region from a many-month-long process for all teams into a project for one small team. We decided that the key was to move all services to run in containers, and have them all do service discovery via dependency injection. The reality of working at a medium-sized organization meant we had to have a lot of coordination and buy-in. I'll talk about how our roadmapping process both hindered and enabled this project to work at all, and how we used test buildouts and teardowns to integrate early and often.


This wouldn't be an open retrospective without talking about what didn't work well, which was primarily organizational rather than technical. We've learned some lessons on how to run large-scale projects that will hopefully help us on our next one, so I hope that we can provide some hard-earned lessons.


Andrew has worked on a wide range of projects, including the NRDB distributed event database, charting, the autocompleting NRQL query editor, bare metal hardware provisioning, and supporting multiple regions. He lives in Pittsburgh, Pennsylvania, USA, where he also sings classically in the Mendelssohn Choir of Pittsburgh.


Andrew Bloomgarden

Principal Software Engineer, New Relic

Transcript

00:00:05

In this talk, I'm presenting an open retrospective, talking about both the technical and organizational aspects of the project and what we learned along the way. So think of this as a case study in the way one medium-sized company deals with the challenges of operating at scale. Most of you have probably heard of New Relic, and some of you are probably customers of ours, but let me introduce you to what we do from the perspective of an engineer at the company. New Relic is an observability company that builds software that our customers use to analyze how their software actually works in the wild. That means that our customers send us a ton of data that we process and store, and then we present it back to them in the form of alerts, curated UIs, and ad hoc queries, across a variety of different products.

00:00:41

Browser, Mobile, APM, and a few more. Last year, as part of our partnership with IBM, we agreed to build out a new region in Europe, in IBM's data centers in Frankfurt, in order to let European customers keep their data in the EU. Note that this isn't the same as some kinds of multi-region projects, where it's purely for redundancy; we needed to actually keep the data in a certain place. This was a pretty scary proposition to us in engineering, despite our fairly well-run organization, so let me explain why. When I started at New Relic eight years ago, we had just one application. It was the UI and the data collection tier all in one, a true Rails monolith. The company was two years old at the time and hadn't really had the time to build out a whole ton of technical debt.

00:01:22

Shortly after I started, one of our engineers split out the collection tier into the aptly named Collector. It was written in Java, and it was a couple of orders of magnitude faster and more efficient than the initial Rails implementation. If we had stopped here and said, great, we're going to build out a European region, technically and organizationally we would have been totally fine. This wouldn't have been a huge project. It might not have been the right decision for the business, and given that we didn't do it, it probably wasn't. But changing a couple of code bases at a two-year-old company is not a huge deal. We didn't do that, though. Instead, in eight years, we've done a lot of things. We've scaled a lot, introduced new products and features, and to handle that growth we've had to continuously re-architect our software. Today we're handling around 30 gigabytes per second of data inbound, 15 million Kafka messages a second, and writing around 600 million events per minute.

00:02:06

I think that last number is now up to around a billion events per minute, actually; the slides are a bit out of date. We have around 50 engineering teams and hundreds of engineers working on that. And along the way, one transition we made was recognizing that there's no way this can work with a central operations team: all of our teams are on call for their services. This is what our architecture looks like today. Like most service-oriented architectures, it reflects both technical requirements, like actual products doing actually different things, and organizational structure, like this team worked on that thing and this other team worked on that other thing. Our architecture is at the point where a single, easily understandable diagram can't faithfully represent it in all its detail, and there's no way that our original one- or two-app architecture would have scaled this far. So we made the right choice, but we never really considered that we'd ever have to run in more than one region.

00:02:53

That was always just a potential future that never seemed to arrive. So when the business said it's time to build an EU region, we knew that this was going to be a very painful exercise. And when we confirmed that there was a really high chance that this wouldn't be the last region the business wanted to build, we said, okay, this is going to be a slightly different project. We're going to focus on building tools for building regions, so that even though we know there's a lot of manual work going into this round, we can make it an automated process the next time. The aspiration is that one small team can be in charge of clicking a few buttons, making some changes, and supporting a whole new region. And we called this project Backpack, as we mostly-American engineers were finally getting to go on a trip to Europe.

00:03:37

So I'd said that we ran in one region and didn't really have experience building out multiple regions, but that's not technically true. Every year we do a disaster recovery exercise where we prove to ourselves and to our customers that we can successfully rebuild the entire New Relic stack in a new environment. So we did actually have experience building out new regions, and we knew that it was just incredibly painful. We had proven that in the event of a real emergency we could drop everything and recover, but the exercise took a lot of effort across the entire engineering organization. So we knew when we started this project that that's what we'd have to look into. Why was it so painful? In the eight years I've been at the company, we've had to solve a bunch of problems; this is something that's very familiar to organizations that have tried to adopt DevOps.

00:04:20

We had to figure out how to deploy many services, support a polyglot environment, have some kind of service discovery system, some sane secret management, and, more recently, some better container orchestration. So we were just going to tack a new thing onto the list, leverage all of our expertise in all of those earlier things, and we would be fine. That was at least the theory. The reality is a little different. Will Larson, who's worked at Digg, Uber, and now Stripe, wrote an article earlier this year about migrations. He said that migrations are the only mechanism to effectively manage technical debt as your company and code grow, and if you don't get effective at software and system migrations, you'll just end up languishing in technical debt. And it turned out that we just weren't very good at large-scale infrastructure migrations.

00:05:02

They just kind of seemed successful because we were growing so fast. We would do a bunch of new things, they would use the new practices, and those would be great. The old thing didn't really come along for the ride, and it wasn't really a problem until it maybe blew up in production. Then we'd realize, oh, we have a problem here. Now, a variety of vintages might be nice in a wine cellar, but it's not what you want for large-scale software systems. Our many vintages included applications dating back to 2010 deployed via Capistrano and Puppet. In 2013, we realized that Docker was key to scaling to way more services than we had before. That was the very early days of Docker; it was kind of on fire a lot of the time, but it was crucial for us. We wrote an internal tool, Centurion, to deploy it, and that kind of worked. We had an in-house service discovery system that we abandoned after a couple of years, but it was still used in a bunch of production software. Vault we started using in 2016.

00:05:57

And in 2017 we started building out a new container orchestration platform based on Mesos, with our own internal tooling on top of that. The reality was that for disaster recovery, we just had to account for all of these different things. We had to copy-paste configuration, tweak it, and find all the places where there were little edge cases in the code. It got so bad that for our most recent disaster recovery exercise, when we started the project, we just closed out an entire pull request for our Puppet environment because we couldn't trust ourselves to move forward with it. All of this is to say that if you have a DevOps environment like we did, you do it because you think you can move faster, but it's very, very easy to let yourself mask over the problems that you still do have.

00:06:40

Then, when a large project request like this comes along, you find yourself in a position where you're not actually able to execute on it. That said, we had the actual requirement: we have to build this region, and we have to make it so the next one won't be as bad. So we had to look for the high-leverage interfaces, the things we could implement now that maybe wouldn't check all the boxes, but would put us in a better place for next time and make this round better as well. We decided a few things. First, we needed to solve service discovery. As I mentioned, we had an internal system that we didn't really use well, and we needed some kind of system. But another way we were thinking about this problem was in terms of static analysis.

00:07:28

If you're building and deploying a new region, you're deploying all of the hundreds of services that you already have, in our case, and you don't really know everything about them. You know that they all exist in production today and are therefore all necessary for production to work, but you don't necessarily know the relationships between them, how they depend on each other, or the subtleties about them. So if you're deploying service Alice to your new production environment, it would sure be nice to know that it actually depends on service Bob before deploying Alice, and maybe a few layers on top of that, rather than realizing later that this bottom-layer thing was never deployed or was broken in some way and didn't actually work. With static analysis, with the ability to say, I know for a fact that Alice depends on Bob, you can just deploy Bob first, then deploy Alice, and you're in great shape.
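
To make that concrete, here's a minimal sketch (not New Relic's actual tooling) of why declared dependencies matter: once the "Alice depends on Bob" relationships exist as data, a buildout can compute a safe deploy order statically. The manifest shape and service names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency data, as it might be extracted at deploy time.
# Each service maps to the set of services it depends on.
declared_deps = {
    "alice": {"bob"},      # Alice calls Bob
    "bob": {"storage"},    # Bob needs the storage tier
    "storage": set(),      # storage depends on nothing upstream
}

# Because the dependencies are data rather than buried in code,
# a new-region buildout can compute a safe bottom-up deploy order.
deploy_order = list(TopologicalSorter(declared_deps).static_order())
print(deploy_order)  # ['storage', 'bob', 'alice']
```

The same data also tells you, before anything is deployed, that a region missing "storage" can never run Alice.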

00:08:17

Our original service discovery system just didn't have that property. It was buried in code, so you could guess that something depended on something else, but you couldn't be confident; it could be buried many libraries deep, which wasn't really helpful when you were trying to do a from-scratch buildout. So we wanted to figure out a way to encode that information in our software, so that if one service depended on another, we knew for a fact that that was the case. We had a configuration system for deployments that we usually passed hardcoded information into, but we introduced an abstraction layer. So we could say, okay, we're normally passing in these hardcoded values; instead, ask for something that says where Bob is and we'll replace that for you. Then we can sniff out that information at deploy time to understand what was actually going on.

00:09:03

That is, that Alice depends on Bob. This also let us solve a closely related problem: how do you provision credentials? We had a solution for where you put credentials, Vault, which is a really useful tool, but how do you get credentials there in the first place? Again, we have hundreds of services and hundreds of databases, and these dependencies are somewhat known, somewhat not. So we needed a way to say, okay, I'm going to take this kind of information, encode it in a URL, and then get that same kind of information extraction out of this abstraction layer, so it can tell me that this service actually has an authenticated database dependency. This is service discovery as dependency injection: services declare their dependencies in a standard format, you can put credentials in there, and static analysis is actually possible. Next: containers everywhere. This was key for us.
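
As a rough illustration of the idea (the URL scheme, names, and fields here are made up, not the actual New Relic format), a service might declare each dependency as a URL, and the deploy tooling could then extract both the service wiring and the credentials it needs to provision:

```python
from urllib.parse import urlparse

# Hypothetical dependency declarations for service "alice".
ALICE_DEPENDENCIES = [
    "service://bob",                           # Alice calls the Bob service
    "mysql://alice-app@accounts-db/accounts",  # an authenticated database dependency
]

def analyze(dependency_urls):
    """Statically extract what the deploy tooling needs to know."""
    services, databases = [], []
    for raw in dependency_urls:
        url = urlparse(raw)
        if url.scheme == "service":
            services.append(url.hostname)
        elif url.scheme == "mysql":
            databases.append({
                "user": url.username,  # credential to provision, e.g. in Vault
                "host": url.hostname,
                "database": url.path.lstrip("/"),
            })
    return services, databases

print(analyze(ALICE_DEPENDENCIES))
# (['bob'], [{'user': 'alice-app', 'host': 'accounts-db', 'database': 'accounts'}])
```

At deploy time, the same declarations would be rewritten with region-specific hosts and injected credentials; that's the dependency-injection half of the design.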

00:09:51

They're a great interface, most importantly between teams and machines. Originally, when we had a small number of services, we would have to go and make all sorts of configuration management changes on machines to deploy new services that had slightly different dependencies. This was killing us. So we had known since 2013 that containers, in the form of Docker, were crucial for us. But we could push that further and make it better for the new region: push everything into our Container Fabric, our Mesos platform, and say, okay, we're setting ourselves up even better for the future. We're going to put stateful services there as well, because even if we can't necessarily orchestrate them today, we do have some experience running stateful services, like our Cassandra clusters, this way. We're going to push more services into that. In fact, all of our services are going to be in containers, and eventually we're going to orchestrate all of them, just not right now.

00:10:37

Next, one important thing was to standardize on an operating system better suited to our modern use case: CoreOS, not CentOS. CentOS was great for us when we managed machines via Puppet, but we had sort of aged past that, and it wasn't really helping us today. CoreOS allowed us to encourage the behavior we wanted to see. Configuration is very limited and there's no package manager, which means that anything you want to do, you want to do in a container, which is exactly the behavior we want to see. It also has a first-boot configuration system that's really well matched to the configuration it does support, and that means we can stop using Puppet, which we just hadn't managed very well. We were even able to use the config transpiler that they published to make assertions in our machine provisioning code that things were actually happening.

00:11:17

So we can assert that our New Relic infrastructure agent is installed on every machine in our clusters, just as part of our provisioning process. And finally, we use Terraform, because some infrastructure is just fiddly customization and we needed a way to make that repeatable. We needed to know that, okay, yes, this S3 bucket is different from that one, but the next time we go and build this out, we're going to build them out in the same fiddly, different way, instead of accidentally making them the same and running into problems at runtime. And importantly, if you use Terraform (maybe you don't, maybe you do), you can develop your own providers, and it's relatively easy; we found that the investment in doing that paid off repeatedly. So I've just spouted off about all the requirements that we tacked onto our initially simpler project of an EU region.
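
Here's the flavor of that kind of provisioning assertion, sketched in Python; the file path, JSON shape, and unit name are illustrative rather than the real provisioning code:

```python
import json

def assert_infra_agent_configured(ignition_path):
    """Fail the buildout if a host config doesn't install the agent."""
    with open(ignition_path) as f:
        config = json.load(f)

    # Collect the systemd units the first-boot config would enable.
    unit_names = {
        unit.get("name")
        for unit in config.get("systemd", {}).get("units", [])
    }
    assert "newrelic-infra.service" in unit_names, (
        "every cluster machine must run the infrastructure agent"
    )

if __name__ == "__main__":
    # Run against every rendered machine config as part of provisioning,
    # so "the agent is everywhere" is proven on each buildout, not assumed.
    assert_infra_agent_configured("rendered/ignition.json")
```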

00:12:03

You may be thinking that this sounds like a second-system kind of project: that we turned a smaller project into a huge one, which is a great way to have a project fail. And you're right, this was risky, but we did try to reduce the risk by cutting out changes that weren't actually necessary. For example, we have our load balancing infrastructure on F5 hardware load balancers in our US region. We couldn't give those to IBM, but we could run virtual F5s in containers on CoreOS. That let us say: cool, F5s work just the same as in the US; they're not part of this project.

00:12:36

But here's the reason we did all this: let's say we had built out the European region without making the infrastructure investments we needed, just business as usual. We might have ended up with twice the ongoing operational load, and that might have been a killer. You spend the same amount of time to build a new region, you do that a couple more times, you've burned months and months of time, and you now have five regions to support that are each eating the same amount of time. That's just lighting money on fire. We might drag the company down under endless operational toil, or be forced to turn down business opportunities by not building those regions in the first place. So we had to find the Goldilocks zone: the right work to do so that future regions would be better and we were set on the right path for success.

00:13:22

So that's all the technical stuff we wanted to do, and we knew that pretty early on in the project. We hoped that we would do that discovery, then fan out all the work that we had to do, because every team was going to have to do something to their own software; they'd all do that, we'd integrate, build out the region a few times, test it, and then release it to our customers. That was the hope. Here's the reality; now it's time for the retro on what actually happened. I wouldn't be standing up here if the project had gone perfectly, because honestly, "this went perfectly" is just a boring talk. More importantly, none of the things we did technically are all that unusual these days; this is the kind of buzzword bingo you'll see at many conferences. What's more interesting to me is that we made these changes at some real scale. So I'm going to go over some lessons we learned along the way, and cap each one off with a set of things that we'd start, stop, and continue in future large-scale projects, whether at the infrastructure layer or the product level. I hope these lessons can be relevant to you, even if you're working at a smaller or bigger company, as you consider large-scale projects of your own.

00:14:30

First: quick ramp-ups. How do you prioritize work? The Backpack project needed work done by every team at the company, which meant that we needed some way for all those teams to agree to do the work. So let's talk about how roadmapping works at New Relic. First, all of our engineering teams are autonomous. They have their own roadmaps, set by the team and their product manager, and those roadmaps are supposed to meet the team's own goals as well as the broader requirements of the business. Those broader requirements come from a group called the Product Council, which meets quarterly, or more often if necessary (hopefully not), and publishes a list of up to five high-priority projects that are going on across the company, in order of priority. Teams are supposed to contribute to those projects, in order, if they can.

00:15:12

Some of those might be something that only one team can effectively work on, like "team A needs to make this thing better." What that means for everybody else is just to make sure that team A can do everything they possibly can: make sure they're not blocked, and if they're blocked and you can unblock them, do it. That's the most important thing you can do. Then there are other projects like Backpack, which was a high-priority cross-cutting project, and then we have other things, like features that we want to make a press splash with, that kind of thing. This process actually does work, but there's a catch: what happens if the project needs resources to move forward and the Product Council just hasn't prioritized it yet? That's what happened with Backpack. There was a month or so of delay past when the project should have been prioritized but wasn't.

00:15:55

Then suddenly it was all systems go. We were prioritized, and almost all teams at the company were knocking on our door trying to figure out what they actually had to do. In retrospect, this sudden swing from no support to near-total support really wasn't good for the project. We suddenly had to transition from operating relatively independently, solving some problems in the IBM infrastructure, doing some strategizing, to, oh wow, there are like 40 teams talking to me right now. We'd had some organized discovery work, helping teams figure out what they had to do, and that was really well organized. But even still, it left the central engineering team scrambling with our sudden success at getting attention. We weren't ready: not with documentation, service discovery, or other core tooling that was critical to the success of the project.

00:16:38

Worst of all, we just didn't have an easily digestible philosophy to help people make decisions. If you're going to embark on something like this, and you want everybody to be able to make their own decisions locally and have them be the right kind of decisions, you have to tell them what they should be thinking about and how they should be trading things off, so that they can prioritize the right things locally. We just didn't have that philosophy available; we'd talked some internally, but it wasn't good enough. So in the future, we're going to start preparing for what happens when we get that high-priority slot, and produce a project philosophy document to help people make those decisions. And we're going to continue prioritizing important work across the company. A very related problem: moving goalposts.

00:17:17

I mentioned that we had this big discovery process and that we were going to specify the work upfront: containerization, moving to our Mesos platform, service discovery. But there was also later work that we kind of knew was coming and just didn't talk about. For example, our initial instructions for teams asked them to make their services ready to receive URLs in the service discovery format, but not to actually use the tooling, because the tooling wasn't ready yet. And we didn't mention that the tooling wasn't ready, because we didn't want them to delay starting the work until it was. This logic made some kind of sense at the time. We thought that changing the last step, using the central tooling, was going to be pretty trivial, just changing a couple of configuration files, but it was still work that we didn't really specify.

00:18:02

We thought this was a good idea for a couple of reasons. First, each team has a hero role that gets passed around, typically via the on-call rotation; they're supposed to handle small requests from other teams. So we thought the smaller stuff to come could be handled by whoever was the team's hero that week. Second, we didn't want to exhaustively list work that couldn't be done yet, especially since we didn't know precisely what the work looked like. We didn't want teams to say, okay, I can't start until you know exactly what I'm supposed to do; we'd rather teams make mistakes and follow up. In retrospect, this was just a mistake, especially not communicating that there would be follow-up work, even if we didn't know its full extent. And the fact that the hero role rotated meant there was a lot of context switching, since the person who did the work initially might not be the person responding to it this week.

00:18:46

That was one kind of moving goalpost, and there was another. We did three test buildouts as part of this, where we tore the environment down and brought it back up again, and everything was changing basically every time. Our goal was to iteratively improve the infrastructure side of the buildouts and figure out the changes we needed to make while it wasn't a production environment. If you're working on an infrastructure project, it's great to be able to say, oh, that's not actually production; I can make a cross-cutting decision and just have it come into effect, without having to manage a really slow rollout.

00:19:19

But the reality of the buildouts was that teams would do the work (we encouraged them to do it in the US first, because we were making the same improvements there), then they'd wait a few days or weeks, depending, because they couldn't test it in the EU yet. Then the Backpack team would try to deploy their work, and we'd realize something was broken. We'd go to them, and suddenly everybody's blocking the project and it's a mess. In our minds, the goalposts weren't really moving here, because the goal was always "your software works in the EU." But that's a really ambiguous statement, and given that things on the ground were changing, it's not surprising that things weren't always working correctly. In the future, the most important thing we could do to fix this is to use a steel thread approach: validating the design using a sub-project that tests it thoroughly.

00:20:09

For example, if we could find a slice of our system that runs top to bottom, from some product down to data storage, with authentication and everything we need, but that covers, say, 20% of the services at the company, we could have tested things with that 20% instead of making the other 80% come along for the ride and then realizing that we'd screwed something up. Now, this might actually have been the wrong decision if we'd gone this way, because it could have lengthened the wall-clock time of the project. If we'd said, okay, we're going to spend three months with the 20% and then six months with the 80%, maybe those 80% of teams wouldn't have been done in time; maybe we did get a benefit by starting everybody at the same time. But there's probably a better balance to strike than what we actually did. So in the future, we're going to start having that steel thread test case, and be more honest and transparent. We're going to stop hiding work, even if only by acknowledging that some unknown future work exists. And that's because we're going to continue to avoid complete waterfall planning; agility is really important. We have to be able to react to changing conditions and changing realizations. We don't know everything upfront.

00:21:16

Next: communication is hard. I know this is surprising to everyone in the room. New Relic has a strong culture of internal blogging as a means of broadcasting ideas and posting updates on projects; if you want to influence the company, this is the way you do it. Every step of the way, the Backpack team wrote documents about the ideas behind the project, the changes we needed made in software, plans for buildouts, and what we did each week. But the trouble with an internal blogging culture is that everyone's doing this, so there's a ton to read and a ton to discuss, and you as an individual basically have no way of knowing exactly what you should be reading. You can read the things you think you need to read, but you don't know what you should have been reading all that time.

00:21:55

And this is true across the board; it's really hard to keep up on everything. We had town hall events to broadcast updates, but people would be on vacation and miss them. We tried having a checklist application, but it wasn't looked at, and when it was, people would have to notice when a change was made to it. We tried some automated linting so that we could sniff out problems in code before teams ran into them in production, but that wasn't used very well because it was kind of out of band. And emails, people just don't read them. When I say blog posts don't get read and emails don't get read, I don't mean nobody is reading them; it's just that you can't count on it.

00:22:33

You can't count on an individual having read any individual thing. So the most important thing we could have done was have some kind of centralized documentation. This is something we realized later in the project and implemented, and it probably would have helped teams and individuals catch up on what they missed without having to hunt through the blog post history or watch recorded events, which is something no one is going to do. So we'll have some kind of centralized documentation, and we're going to have a human-readable changelog of requirements; Git commits are not good enough. Nobody's going to look at those, and if whitespace changes show up in them, people are probably going to tune the whole thing out. We're going to have some kind of better linting. But we're going to continue to blog internally, because it's really useful for us, and we're going to continue to communicate using as many channels as possible, because we know that one is not enough.

00:23:20

Next: local maximums. I mentioned that we'd gone through all these phases of incrementally improving our infrastructure, but we didn't really have standardization at any point along the way. One consequence of that was that teams were heavily incentivized to make the system better locally. What I mean by that: let's perform a thought experiment and travel back, say, three years in time. You're a team at New Relic, you own 20 services, you deploy them all frequently, but the tooling's not that good. You kind of want to be able to deploy all 20 at once, or say, okay, when I deploy this one, these other ones have to be deployed too, or you want to move from staging to production automatically. So rather than just twiddle your thumbs, you build that, and this is great. You get three years of productivity benefits, you're able to move faster; it was the right decision for you.

00:24:06

Maybe you throw in a couple of other features too, like shared service discovery. Three years later, you've had that benefit, but you don't have the standardized platform that was built in the meantime. And now someone comes along and says, cool, if you want to get on our European region project (which, by the way, you have to, because we need to ship this thing), you need to adopt the standardized tooling. So then you have a bit of a problem. You have some transition pain, but there's a promise of, oh, that standardized tooling is great: we have a build and deploy tool, there's a team working on it, they have your interests in mind, this is going to be better for you. But the reality is a little different. That standardized tooling is going to be worse for some teams.

00:24:48

In some ways it's not going to quite match what they want to see, and the benefit is for the company as a whole, not for them. So there's this future tooling benefit that everybody wants, but we need everybody to move to it now, yesterday, not when it checks all the boxes. This really isn't a good way to make friends, basically forcing all of your engineering teams to simultaneously adopt shared tooling, but it's kind of critical. I don't have a great answer here, other than just having more empathy for teams stuck in this situation. Communicating well in advance could help too: if the tools team had known that there were a couple of gaps affecting 20 teams that they could have closed to make this easier, that might have helped. And frankly, we just need to stop making assumptions in general about how teams or individuals will react.

00:25:33

I said this team built out this tooling and they've had it for three years, and you might assume they're going to be really reluctant to give it up, because they built it and it's working so well for them. In reality, they may just hate it. The team may have completely cycled over that time, and they may just be stuck with some legacy stuff they don't like; maybe they just want to get rid of it. So stopping those assumptions is critical, and we're going to continue to make standard tooling better. The best decision we made was leaning on what we already had. We had our in-flight projects: our Container Fabric, which is our Mesos platform; our Grand Central build and deploy system, which I mentioned or had on a slide earlier; and our containerized database platform.

00:26:18

All of these were crucial in making this project possible. By saying, okay, everyone move to this, we were able to get a lot of bang for our buck. There was a huge uptick in adoption as part of this, which wasn't necessarily good for the teams supporting those central things. On the other hand, part of our design goal here was to make it so that platform teams are the ones bearing the brunt of the work, rather than spreading it across a bunch of teams that aren't actually equipped to do that work. So in the future, we're going to start making clear which priorities are highest for infrastructure teams, so they know what's coming, and we're going to look for high-leverage work that a small number of teams can do, because it's really, really useful to have platform teams that are able to contribute to large projects like this.

00:26:57

And finally, the last lesson learned: the importance of a pilot phase. Our original plan was discovery, fan out, test ourselves, and release. We realized that there were just way too many unknown unknowns in this environment to live up to our own expectations for reliability. So we changed it: we delayed GA significantly and opted instead to run a pilot phase for a very limited number of customers. That allowed our teams to be on the hook for reliability without the same consequences when things inevitably went wrong, whether in the underlying cloud or in our software. We knew things were going to go wrong, and we wanted to learn how to deal with that before we were live in production for everybody. This was absolutely the right decision. So in the future, honestly, we just have to stop magical thinking.

00:27:36

I have no idea how we thought the original plan was okay in the first place. To sum up, where are we now? First off, the project did work. It was painful at times, but we do have an EU region; if you're interested in it, contact us and we can hook you up. Our disaster recovery exercise has seen a ton of benefits: we used an order of magnitude fewer engineering hours this year to do it. We've had less busywork in general service operations, a lot of the boilerplate configuration is gone, and we've laid the groundwork for future improvements, with 95% of our services in our Container Fabric. There's a meta benefit as well: we're an observability company serving the modern market, and the more we are on the bleeding edge of things, the more we're able to see the gaps in our own product and cover them before our customers encounter them. That's a really nice thing to have. And we've learned a lot; we now know how to run large-scale projects better. So in the future, we're going to continue just a little bit of magical thinking, trying bold things like projects like this. Thank you to all of you for listening, and to all the hundreds of people who worked on this project. Thank you.