Daniel's Blog

A Storm Trooper's Guide to Navigating Large Legacy Codebases

What to do when you're dropped into presumed-hostile territory

The checklist

  1. Understand the land
  2. Move with a goal
  3. Leave it better than you found it

Understand the land

When approaching a legacy codebase, the first thing to do is gather as much information as possible. Skim through the documentation and have casual chats with people inside and outside the team to find out the following:

  1. What business verticals are there?
  2. How do they rank in priority for the company?
  3. Which one do you or your team support?
  4. How is it supported?
  5. What impedes you or your team from doing so?

Most guides to navigating large codebases, like this one, this one or this one, approach the problem from a technical perspective. In my years of being orbital-dropped into legacy codebases in the backlines, though, the technicals have usually been the easiest part to grok. What trips people up is the weird business jargon, the overlapping, nonsensical business rules and the crazy architecture created to solve problems that may or may not exist. Finding the answers to the list above has helped me cut through the cruft quickly by letting me build a mental model of what's core and what's not, relative to my area of work.

It also has the secondary benefit of generating a political landscape graph of who is in charge of what. You can't hope to remember everything that goes on in the application, especially when you've just started, so you need to find out who to ask when you need answers.

After building an understanding of the land, you need a glossary to navigate it. A glossary is a map of the land. It may or may not exist. If it does not, you need to build one.

Move with a goal

When wading through a codebase, it's easy to get lost and enter a state where you simultaneously understand and don't understand what's going on. Stuff like build steps, configurations, setting up a development environment and debugging anything that goes wrong is easier to tackle when you have a purpose in mind instead of mindlessly ploughing through the code.

Try asking for a small task and someone to pair program with, and be prepared to spend 2x-4x the time others on the team would need while you go through the codebase and figure out how to:

  1. Configure and set up an environment for debugging and writing tests.
  2. Identify dead code and unused code paths.
  3. Figure out where and how data flows through the application.
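For the dead code item, a rough first pass can be automated. Below is a minimal sketch, assuming the codebase is Python and that a single module's source is available as a string. The function name find_unreferenced_functions and the SAMPLE snippet are made up for illustration. It only flags candidates: anything called dynamically, through a framework, or from another file will look unused here, so treat the output as a list of questions to ask, not a list of things to delete.

```python
import ast


def find_unreferenced_functions(source: str) -> list[str]:
    """Return names of functions defined in `source` but never referenced in it.

    Hypothetical helper for illustration. It compares the names of function
    definitions against every plain name and attribute name used in the module.
    """
    tree = ast.parse(source)
    defined = set()
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            # Plain references such as `used()` or `callback = used`.
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            # Attribute references such as `self.used()` or `module.used`.
            referenced.add(node.attr)
    return sorted(defined - referenced)


# Made-up sample module: one function is called, the other never is.
SAMPLE = '''
def used():
    return 1

def unused():
    return 2

print(used())
'''

if __name__ == "__main__":
    print(find_unreferenced_functions(SAMPLE))
```

A function that only calls itself still counts as referenced in this sketch, and so does any function that shares a name with something else in the module, so expect both false positives and false negatives.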

Large codebases often come with technical debt. While it is good practice to clean up debt as soon as possible, business requirements often mean this is not possible. Instead, people learn to navigate around the debt, and as a newcomer, this domain knowledge is not immediately apparent to you. Some challenges around this are duplicated code and defensive programming against team members.

Leave it better than you found it

It’s always easy to rant about how bad things are, but it’s more productive to use your experience and knowledge to provide a better solution whenever possible.

Providing documentation, adding comments, drawing graphs and fixing minor issues go a long way in improving the development experience for yourself and your team.

