
Git for Analysts: Get Into the Codebase


Git is how code is stored and versioned, and the code is the most honest description of a system there is. A handful of read-oriented Git skills lets an analyst get into the codebase, see how a requirement was really built, and find exactly when behavior changed, all without breaking anything.

The Git an analyst needs is the Git that reads: clone a repository to get a copy, navigate the files, read a diff to see what a change did, browse the history to find when behavior changed, and review pull requests as a peer. That small set gives you direct access to the code itself, the only fully honest description of how a system behaves. When the documentation is wrong and the diagram is stale, the code is what actually runs, and Git is the door into it. Crucially, reading code through Git is safe: cloning, viewing history, and reading diffs all happen in your local copy and cannot affect the shared codebase, so you can explore freely. This is a foundational developer-analyst skill that unlocks the others.

I am not committing production features, but I am in the codebase regularly: reading how something was implemented, checking a diff to understand a fix, and finding when a behavior changed and why. That access has done more for my credibility than any certificate, because it means I can verify how the system works rather than relying on someone to tell me. The full progression through these technical skills is in The Technical Skills Guide for BAs.

What Git does an analyst actually need?

An analyst needs the read-oriented Git commands, the ones that let you get a copy of the code and explore it, not the ones that change shared history. Six commands cover almost everything: clone, status, log, diff, show, and blame, with branch and checkout for navigation.

Clone makes a local copy of a repository, your own sandbox to explore. Status shows the current state of your local copy: which branch you are on and whether any files differ from the last commit. Log shows the history of commits: who changed what and when, with messages explaining why. Diff shows the changes between two versions, which is the heart of understanding what a change did. Show displays a specific commit, its message and its diff together. Blame annotates each line of a file with the commit that last changed it, so you can find when and why a particular line came to be. Branch and checkout let you move between versions of the code: the current release, a feature branch, or a past state.
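Here is a minimal sketch of that tour, assuming Git is installed and you have a POSIX shell. So that it runs anywhere, it first builds a throwaway repository to stand in for the shared one; the file name, its contents, the commit messages, and the "Demo Dev" identity are all invented for illustration. Against a real system you would skip the setup and start at git clone with your repository's URL.

```shell
#!/bin/sh
set -eu
# Placeholder identity so the demo commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_PAGER=cat   # print output directly instead of opening a pager

# Setup only: a stand-in "shared" repository with two commits.
work=$(mktemp -d)
git init -q "$work/shared"
cd "$work/shared"
echo 'limit = 1000' > rules.txt
git add rules.txt
git commit -q -m "Add payment limit rule"
echo 'limit = 5000' > rules.txt
git commit -q -am "Raise payment limit after review"

# The read-only tour an analyst would run.
git clone -q "$work/shared" "$work/sandbox"   # your own local copy
cd "$work/sandbox"
git status --short                  # no output means nothing is modified locally
git log --oneline                   # one line per commit, newest first
git show HEAD                       # latest commit: message and diff together
git diff HEAD~1 HEAD -- rules.txt   # what changed between the last two commits
git blame rules.txt                 # the commit that last touched each line
git branch -a                       # local and remote-tracking branches
```

Every command after the clone only reads, so you can rerun them as often as you like.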

Notice that none of these change the shared repository. The commands that do (pushing, merging into shared branches, rewriting history) are the ones you do not need for reading, which is why analyst Git use is inherently safe. You are working in your own local clone, looking, and nothing you do there touches what anyone else sees unless you push it. That safety is worth internalizing, because the fear of breaking something is what keeps many analysts out of the codebase, and it is unfounded for reading. The same self-sufficiency principle runs through SQL and reading API contracts: learn the focused, safe subset that lets you verify reality directly.
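If you would rather see that safety than take it on trust, the sketch below is one way to check it, again assuming Git and a POSIX shell. The "shared" repository here is a local stand-in, and the file name, values, and identity are hypothetical. It deliberately goes further than reading by making a local commit in the clone, then confirms the shared repository has not moved.

```shell
#!/bin/sh
set -eu
# Placeholder identity so the demo commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Setup only: a stand-in "shared" repository with one commit.
work=$(mktemp -d)
git init -q "$work/shared"
cd "$work/shared"
echo 'fee = 2.50' > pricing.txt
git add pricing.txt
git commit -q -m "Add pricing rule"
before=$(git rev-parse HEAD)        # where the shared repository stands now

# Clone it and scribble on the local copy.
git clone -q "$work/shared" "$work/sandbox"
cd "$work/sandbox"
echo 'fee = 99.99' > pricing.txt
git commit -q -am "Local experiment only"   # a local commit, never pushed

# The shared repository still points at the same commit, with the same content.
after=$(git -C "$work/shared" rev-parse HEAD)
[ "$before" = "$after" ] && echo "shared repository unchanged"
cat "$work/shared/pricing.txt"
```

Only a push would carry the experiment back, and reading never needs one.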

How do you read a diff?

Read a diff by recognizing that it shows the difference between two versions of a file: removed lines are marked with a minus, added lines with a plus, and unchanged surrounding lines give context. Once you can read a diff, you can understand exactly what any change did to the code.

A diff is organized by file, and within each file by the regions that changed. The minus lines show what was there before; the plus lines show what replaced them; the context lines around them anchor you to where in the file the change sits.

  function validatePayment(p) {
-   if (p.amount > 0) {
+   if (p.amount > 0 && p.currency in SUPPORTED) {
      return accept(p);
    }
-   return reject(p, "AC01");
+   return reject(p, p.amount <= 0 ? "AM01" : "AC04");
  }

Reading that diff tells you precisely what changed: the validation now also checks the currency, and the rejection code is no longer always AC01 but depends on the failure. That is exactly the kind of behavioral detail an analyst needs, and it is right there in the diff, no developer required to explain it. This is enormously useful in practice: when a bug is fixed, the diff shows you what the fix actually changed, which often reveals the real cause and any side effects; when a feature lands, the diff shows what behavior is new. Reading diffs connects directly to reason code mapping and functional analysis, because the codes and rules you specify are visible in the code that implements them.
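To produce a diff like that yourself, a sketch along these lines should work, assuming Git and a POSIX shell. It recreates the two versions of validatePayment in a throwaway repository; the file name payment.js, the commit messages, and the identity are assumptions made for the example. Real git diff output carries a little more than the excerpt above: a header naming the file, and lines starting with @@ that mark where in the file each changed region sits.

```shell
#!/bin/sh
set -eu
# Placeholder identity so the demo commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_PAGER=cat   # print output directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q

# Version 1: only the amount is checked.
cat > payment.js <<'EOF'
function validatePayment(p) {
  if (p.amount > 0) {
    return accept(p);
  }
  return reject(p, "AC01");
}
EOF
git add payment.js
git commit -q -m "Validate payment amount"

# Version 2: currency check added, rejection code depends on the failure.
cat > payment.js <<'EOF'
function validatePayment(p) {
  if (p.amount > 0 && p.currency in SUPPORTED) {
    return accept(p);
  }
  return reject(p, p.amount <= 0 ? "AM01" : "AC04");
}
EOF
git commit -q -am "Check currency and split rejection codes"

git diff HEAD~1 HEAD          # full diff: file header, @@ marker, minus and plus lines
git diff --stat HEAD~1 HEAD   # summary: which files changed and by how many lines
```

The --stat form is a quick way to see how wide a change is before reading it line by line.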

How does history help you find when behavior changed?

Git history lets you answer “when did this change, and why,” which is one of the most valuable questions an analyst can ask. The log and blame commands turn the codebase into a searchable record of every change and its rationale, so a mysterious behavior can be traced to the exact commit that introduced it.

The scenario is common: something behaves differently than expected, or differently than it used to, and nobody remembers why. With Git, you do not have to remember. The log shows the sequence of changes to a file or area, each with a message explaining the intent. Blame shows, for a specific line, the commit that last touched it, so you can find when a particular rule, code, or value came to be and read the reasoning in that commit’s message. Together they let you reconstruct the history of a behavior: this validation was added in this commit, for this reason, on this date, by this person.
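As a sketch of that reconstruction, assuming Git and a POSIX shell, the commands below trace a single rule back to the commit that introduced it. The repository is a throwaway, and limits.cfg, its values, the commit messages, and the identity are invented for the example.

```shell
#!/bin/sh
set -eu
# Placeholder identity so the demo commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_PAGER=cat   # print output directly instead of opening a pager

# Setup only: a file whose third line arrives in a later commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'max_amount = 1000\ncurrency = EUR\n' > limits.cfg
git add limits.cfg
git commit -q -m "Add initial payment limits"
printf 'max_amount = 1000\ncurrency = EUR\nretry_limit = 3\n' > limits.cfg
git commit -q -am "Cap retries at 3 after duplicate-payment incident"

git log --oneline -- limits.cfg      # every commit that touched this file
git log -p -1 -- limits.cfg          # the latest change to it, with its diff
git blame -L 3,3 limits.cfg          # the commit that last changed line 3
git log -S'retry_limit' --oneline    # commits that added or removed this text
```

Blame answers "who last touched this line", while log -S answers "when did this text first appear or disappear", which is often the better question when a line has been reformatted since.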

This is detective work that would otherwise be slow or impossible. I have used Git history to settle debates about whether a behavior was intentional, to find the commit that introduced a regression, and to understand why a seemingly odd rule exists: the commit message explained the production incident that prompted it. That context turns “the system does this strange thing” into “the system does this because of that incident, and here is the requirement we should write.” It is the same impulse as testing the system to learn how it really behaves, applied to its history rather than its runtime. The code and its history together are the honest record, and Git is how you read it.

How do you start, and what about pull requests?

Start by cloning a repository you work near and just browsing it, then practice reading a recent change’s diff and history. Pull requests are the natural next step, because they are where changes are proposed and discussed, and an analyst can add real value by reviewing them.

Get read access to a repository for a system you analyze, clone it, and explore the files to get a feel for how the code is organized. Then find a recent change, read its diff to see what it did, and look at the log around it. Once that is comfortable, look at open pull requests, the proposed changes awaiting review on a platform like GitHub. A pull request shows the diff of a proposed change plus the discussion around it, and reading them keeps you current on what is changing and why. An analyst who reviews pull requests can catch when a change affects a requirement, spot a behavior that needs a spec update, or simply understand upcoming changes before they ship.
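You can read a proposed change locally in much the same way a pull request presents it. The sketch below assumes Git and a POSIX shell, and uses a throwaway repository with a hypothetical branch name, file, and identity. The three-dot form compares the branch against the point where it split from main, which, as far as I know, is also how GitHub's pull request view compares by default.

```shell
#!/bin/sh
set -eu
# Placeholder identity so the demo commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_PAGER=cat   # print output directly instead of opening a pager

# Setup only: a base branch and a feature branch with one proposed change.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'timeout = 30' > settings.cfg
git add settings.cfg
git commit -q -m "Add settings"
git branch -M main                          # name the base branch explicitly

git checkout -q -b feature/longer-timeout
echo 'timeout = 60' > settings.cfg
git commit -q -am "Double the gateway timeout"

git log --oneline main..feature/longer-timeout   # commits the branch adds on top of main
git diff main...feature/longer-timeout           # its changes since it split from main
```

Reading the log first tells you the intent of each commit; the diff then shows the combined effect on the code.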

This is where the distinction between Git and GitHub becomes practical: Git is the underlying version control, and GitHub (or GitLab, which calls them merge requests) is the platform that adds pull requests, issues, and a friendly web interface on top. Much of your reading can happen through that web interface, browsing files, diffs, and history visually, which is an easy on-ramp before you are comfortable at the command line. Participating in pull request review as a reading peer is one of the clearest signals that you have crossed from an analyst who waits for information to a technical business analyst who goes and gets it. The full set of skills that get you there is in The Technical Skills Guide for BAs.

The takeaway

Git is how code is stored and versioned, and the code is the most honest description of a system there is. A handful of read-oriented commands (clone, log, diff, show, blame) lets an analyst get into the codebase, see how a requirement was really implemented, read what a change did, and trace exactly when and why behavior changed, all safely, because reading cannot affect the shared repository. Pull request review is the natural extension, where you stay current and add value as a reading peer.

Get into the codebase, read the diffs and the history, and the code stops being a black box only developers can interpret. Start with The Technical Skills Guide for BAs, or browse everything at The Tech BA Toolkit.

Ahmed is a Senior Technical Business Analyst with 10+ years in banking and payments. He builds practical guides and tools for analysts at The Tech BA Toolkit.

Tags: Git, Version Control, Technical Skills, Software Development, Career Growth
