Read the Git Log, Not the Code: A Better Way Into Unfamiliar Repos

What Happened

A blog post by Adam Piechowski titled "The Git Commands I Run Before Reading Any Code" hit the top of Hacker News with over 1,500 points — a score that puts it in rare company for a post about, of all things, git commands most developers already have installed.

The premise is deceptively simple: before opening a single source file in an unfamiliar repository, run a sequence of git commands that reveal the codebase's shape, ownership, and recent trajectory. The argument isn't that these commands are obscure — it's that almost nobody uses them systematically as a *first step* before reading code.

The post walks through a specific workflow: start with `git log --oneline --graph` to see the branching structure and release cadence. Use `git shortlog -sn` to identify who actually maintains this thing (spoiler: it's usually 2-3 people regardless of the contributor count on GitHub). Run `git log --since="3 months ago" --stat` to find the hot files — the ones that are actively changing. Then `git blame` on those hot files to understand who's driving current changes and why.

Why It Matters

The explosive HN response tells you something: this post articulated a workflow that experienced developers follow intuitively but have rarely written down. It's the difference between a junior developer who opens `src/` and starts reading top-to-bottom, and a senior developer who spends 15 minutes with git history before touching anything.

The core insight is that a codebase's git history is a higher-signal source of truth than its current state. Code tells you *what* exists. History tells you *what matters* — what's actively maintained, what's being refactored, what was written once and never touched again. In a 500-file repository, maybe 30 files account for 90% of recent commits. Those are the files worth reading first.
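One way to check that concentration on a real repository is to measure it. The sketch below is an illustration, not something from the original post: `hot_share` is a hypothetical helper name, the three-month window is an assumption, and it counts file changes (one per file per commit) rather than commits.

```sh
# hot_share [N]: what share of recent file changes lands in the N most-changed files?
# Assumes it is run inside a git work tree. N defaults to 30.
hot_share() {
  git log --since='3 months ago' --format= --name-only |
    sed '/^$/d' |                      # drop blank separator lines
    sort | uniq -c | sort -rn |        # "count path", most-changed first
    awk -v n="${1:-30}" '
      { total += $1; if (NR <= n) top += $1 }
      END {
        if (total > 0)
          printf "top %d of %d files: %.0f%% of file changes\n",
                 (NR < n ? NR : n), NR, 100 * top / total
      }
    '
}
```

Running `hot_share 30` in a repository prints a single summary line; a high percentage suggests the "read the hot files first" strategy will pay off there.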

This maps to a well-known principle in software archaeology: files that change together often belong together, regardless of what the directory structure implies. `git log --all --follow -- path/to/file` lists every commit that touched a file, including across renames; checking which other files appear in those same commits reveals the hidden couplings. Two files in completely different directories that always appear in the same commit? There's an implicit dependency the architecture diagram doesn't show you.
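A minimal sketch of that co-change lookup follows. The function name `cochanged` is hypothetical, and the approach (list the commits for one file, then count the other paths in each) is one simple way to do it rather than the method from the post.

```sh
# cochanged FILE [N]: list the N paths that most often share a commit with FILE.
# Assumes it is run inside a git work tree. N defaults to 10.
cochanged() (
  file=$1
  top=${2:-10}
  # Hashes of commits that touched FILE, following renames.
  git log --follow --format='%H' -- "$file" |
    while read -r commit; do
      # Paths changed in that commit; the empty format suppresses the header.
      git show --name-only --format= "$commit"
    done |
    sed '/^$/d' |
    grep -v -x -F -- "$file" |         # drop FILE itself
    sort | uniq -c | sort -rn | head -n "$top"
)
```

For example, `cochanged src/billing/invoice.py 5` (an illustrative path) would print the five files most often committed alongside it, each prefixed by the number of shared commits. On very long histories this spawns one `git show` per commit, so it favors clarity over speed.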

The Hacker News discussion surfaced several extensions to the basic workflow. Multiple commenters recommended `git log --diff-filter=D --summary` to find deleted files — understanding what was *removed* reveals architectural decisions and abandoned approaches. Others pointed to `git log --format='%H' --diff-filter=A -- '*.config'` to trace when configuration files were introduced, which often marks major architectural pivots.

The `shortlog` insight deserves special attention. Running `git shortlog -sn --no-merges` on any repository gives a quick read on the bus factor. If one person accounts for 70%+ of commits, you've just identified both the most important person to talk to and the project's single point of failure. Several HN commenters noted that this single command has saved them weeks of asking around when joining a new team.
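The 70% check can be turned into a number with a few lines of shell. This is a rough sketch: `top_author_share` is a hypothetical name, and commit count is only a proxy for ownership. Note the explicit `HEAD`: when standard input is not a terminal (as in scripts), `git shortlog` without a revision reads log text from stdin instead of walking history.

```sh
# top_author_share: print the top committer and their share of non-merge commits.
# Assumes it is run inside a git work tree with at least one commit.
top_author_share() {
  git shortlog -sn --no-merges HEAD |
    awk -F'\t' '
      # Each line is "<count><TAB><author>", already sorted by count.
      { count[NR] = $1 + 0; name[NR] = $2; total += $1 }
      END {
        if (total > 0)
          printf "%s %.0f%%\n", name[1], 100 * count[1] / total
      }
    '
}
```

If the printed share is well above half, that author is probably the first person to ask about the codebase.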

The AI Coding Angle

There's a timely subtext to this post's virality. In an era where AI coding assistants generate large volumes of code, understanding the *narrative arc* of a repository matters more than ever. When significant chunks of code may have been AI-generated, the commit history — specifically the human decisions about *what* to generate and *when* — becomes the primary record of intent.

A `git log` with `--format='%an %s'` filtered to the last month tells you what problems the team is actively solving. That's more useful context for an AI assistant prompt than any amount of code reading. If you're about to ask Copilot or Claude to modify a module, knowing that module has had 47 commits in the last two weeks (it's being actively reworked) versus 0 commits in the last year (it's stable but possibly abandoned) fundamentally changes your approach.
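Before prompting an assistant about a module, the active-versus-dormant question can be answered with two counts. The helper below is a sketch under assumptions: `activity` is a hypothetical name, and the two-week and one-year windows simply mirror the example above.

```sh
# activity PATH: compare recent and long-term commit counts for a path.
# Assumes it is run inside a git work tree.
activity() (
  path=$1
  # rev-list --count prints only the number of matching commits.
  recent=$(git rev-list --count --since='2 weeks ago' HEAD -- "$path")
  yearly=$(git rev-list --count --since='1 year ago' HEAD -- "$path")
  echo "$path: $recent commits in the last 2 weeks, $yearly in the last year"
)
```

A call such as `activity src/auth` (an illustrative path) produces one line that can be pasted straight into a prompt as context.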

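Several commenters extended this to code review workflows: before reviewing a PR, run `git log --oneline main..feature-branch` to understand the commit sequence, then `git diff --stat main..feature-branch` to see the blast radius. With `git diff`, the two-dot form compares the two branch tips directly; `main...feature-branch` diffs against the merge base instead, which is usually closer to what the PR contains once main has moved on. The diff stat alone (how many files changed and by how many lines) is often more informative than reading the diff itself for an initial assessment.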
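The two review commands combine naturally into one helper. This is a sketch, with `review_summary` as a hypothetical name; it uses the three-dot form for the diff so that commits added to the base branch after the fork point do not show up in the stat.

```sh
# review_summary BASE BRANCH: commit sequence and diff stat for a branch under review.
# Assumes both names resolve to commits in the current repository.
review_summary() (
  base=$1
  branch=$2
  echo "== Commits on $branch not on $base =="
  git log --oneline "$base..$branch"
  echo "== Files changed since the branches diverged =="
  # Three dots: diff from the merge base of BASE and BRANCH to BRANCH.
  git diff --stat "$base...$branch"
)
```

Usage would look like `review_summary main feature-branch`, run before opening the diff itself.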

What This Means for Your Stack

The practical takeaway is to formalize this into a script or shell alias. Here's the workflow distilled:

1. `git log --oneline --graph -20`: See the shape. Is this trunk-based? Feature branches? Release tags?
2. `git shortlog -sn --no-merges --since="1 year ago"`: Who actually works here? Who should you ask questions?
3. `git log --since="3 months ago" --pretty=format: --name-only | sed '/^$/d' | sort | uniq -c | sort -rn | head -20`: The 20 hottest files. Read these first. (The `sed` step drops the blank lines git prints between commits, which would otherwise top the count.)
4. `git log --diff-filter=D --summary --since="6 months ago"`: What was deleted? These are architectural decisions.
5. `git blame `: For the hottest files, understand who wrote each section and when.

This sequence takes about 5 minutes and replaces hours of unfocused code reading. It's the difference between walking into a library and reading shelves left-to-right versus checking the circulation desk to see what's actually being borrowed.
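As a starting point for that script, the first four steps can be wrapped in a pair of shell functions. This is a sketch, not the original author's script: `repo_tour` and `hot_files` are hypothetical names, and the time windows are copied from the list above. The `git blame` step stays manual, since it depends on which hot files look interesting.

```sh
# hot_files [N]: the N most frequently changed paths in the last 3 months.
hot_files() {
  git log --since='3 months ago' --format= --name-only |
    sed '/^$/d' | sort | uniq -c | sort -rn | head -n "${1:-20}"
}

# repo_tour: run the survey in the current repository, one labeled section per step.
repo_tour() {
  echo "== 1. Shape of recent history =="
  git log --oneline --graph -20

  echo "== 2. Who works here (last year, merges excluded) =="
  # Explicit HEAD: without a revision, shortlog reads stdin when it is not a terminal.
  git shortlog -sn --no-merges --since='1 year ago' HEAD

  echo "== 3. Hottest files (last 3 months) =="
  hot_files 20

  echo "== 4. Deleted in the last 6 months =="
  git log --diff-filter=D --summary --since='6 months ago'
}
```

Dropping these definitions into a shell startup file makes `repo_tour | less` available in any freshly cloned repository.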

For team leads, this also suggests an onboarding practice: instead of pointing new hires at the README (which is probably outdated), point them at this git workflow. The repository's history is always up to date — it *is* the ground truth of what the team has been doing.

Looking Ahead

The 1,500-point HN score for a post about basic git commands is itself a signal. It suggests the industry has a gap between tools we have installed and workflows we actually use. Git's command set is enormous, but most developers use maybe 6 commands daily. Posts like this succeed because they bridge the gap between "I know `git log` exists" and "here's why it should be the first thing you run, every time." As codebases grow larger and more AI-assisted, the metadata *about* code — who wrote it, when, why, and what changed alongside it — is becoming as valuable as the code itself. The developers who treat git history as a first-class information source rather than a version control tax will navigate unfamiliar codebases faster than those who just start reading.
