You give your coding agent a task. It thinks for a minute, edits fourteen files, runs the tests, and declares victory. You look at the app — something that worked yesterday is now broken. Which of those fourteen changes did it? And how do you get back to the version that worked? If your answer is “I’m not sure,” you are missing the single most important tool for working with an AI agent: git, and a home for it on GitHub.
In the last post I argued you should stop copy-pasting code into a chat window and let an agent like Claude Code build directly in your project. This is the necessary companion to that: once an agent is editing your files for real, version control stops being optional.
The agent’s built-in undo is not enough — and its makers say so
Claude Code has a nice safety feature called checkpoints. Hit Esc twice or type /rewind and you can roll back to before the last prompt. It feels like a time machine. It is not. Read Anthropic’s own documentation and you find the warning in plain text: checkpoints are “not a replacement for git.” Specifically, they:
- Only track the agent’s file edits, not files changed by shell commands. If the agent runs rm, mv, or a build script that rewrites things, that is invisible to the undo.
- Miss edits you made by hand, or changes from another session running in parallel.
- Expire. Checkpoints are cleaned up along with their session, after 30 days by default. They are local scratch undo, not history.
Anthropic’s framing is exactly right: think of checkpoints as “local undo” and git as “permanent history.” The agent’s vendor is assuming you have the safety net. The whole point of this post is: actually have it.
Git is the time machine you can trust
You do not need to be a git expert. And to be clear about something most explainers blur: everything in this section is git — it runs entirely on your laptop, with no account and no internet. You need four ideas.
- A commit is a save point. Tell the agent to commit before and after any meaningful chunk of work and you get clean, labeled checkpoints that never expire and capture everything, including whatever a shell command did.
- A diff is a receipt. git diff shows you exactly what changed, line by line. This is how you answer “which of those fourteen files did it touch, and what did it actually do?” before you trust it.
- Revert is the undo that always works. Bad change already committed? git revert backs it out while keeping the history honest. No 30-day expiry, no “oops, that was a bash command.”
- A branch is a parallel universe. A branch is a separate copy of your project where the agent can try something bold without touching the version that works. Love the result? Merge it. Hate it? Delete the branch and your main line never knew. This is what lets you say “go try the risky refactor” and actually mean it. (Branches are pure git too, not a GitHub feature, though GitHub builds on them.)
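For the curious, here is roughly what commit, diff, and revert look like as commands. This is a minimal sketch in a throwaway folder, not a recipe for your real project: the file names and the "Demo User" identity are placeholders I made up for illustration.

```shell
set -e
repo="$(mktemp -d)"                       # scratch folder so nothing real is touched
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this scratch repo
git config user.email "demo@example.com"

# Commit: a labeled save point.
echo "version 1" > app.txt
echo "scratch" > notes.txt
git add -A
git commit -q -m "Working version"

# Simulate an agent edit plus a shell command that deletes a file.
echo "version 2 (broken)" > app.txt
rm notes.txt

# Diff: the receipt.
git status --short                        # lists the edited file and the deleted one
git diff                                  # line-by-line view of what changed

# Save the bad state as its own commit.
git add -A
git commit -q -m "Agent change that broke things"

# Revert: a new commit that backs out the bad one; history stays intact.
git revert --no-edit HEAD
cat app.txt                               # prints: version 1
git log --oneline                         # three commits: working, broken, revert
```

Note that the deleted notes.txt comes back too. Git recorded the rm even though it happened outside any editor.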
The beautiful part: the agent drives git for you. You do not memorize commands. You say what you want:
Before you start, commit the current state so we have a clean restore point. Create a branch called try-new-auth and do your work there. When the tests pass, commit again with a clear message describing what you changed and why.
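Behind a prompt like that, the agent would run something close to the following. Treat it as a sketch under assumptions: the scratch folder, the auth.txt file, and the commit messages are invented for the demo, and a real agent may choose slightly different commands.

```shell
set -e
repo="$(mktemp -d)"                       # scratch folder for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# Clean restore point on the main line.
echo "password login" > auth.txt
git add -A
git commit -q -m "Restore point before auth rewrite"
git branch -M main                        # call the main line "main" whatever the local default is

# The risky work happens on its own branch.
git switch -q -c try-new-auth
echo "token login" > auth.txt
git commit -q -am "Switch login to tokens; tests pass"

# Back on main, nothing has changed yet.
git switch -q main
cat auth.txt                              # prints: password login

# Love it: merge. (Hate it: skip the merge and run `git branch -D try-new-auth`.)
git merge -q try-new-auth
cat auth.txt                              # prints: token login
git branch -d try-new-auth                # tidy up the merged branch
```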
And when it goes sideways:
That broke the login page. Show me the git diff since your last commit so we can see exactly what changed, then revert it and try a different approach.
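If the broken change has not been committed yet, "show me the diff and undo it" could look like the sketch below. The file names are made up, and this is only one reasonable way to discard uncommitted work. If the change was already committed, git revert is the tool instead.

```shell
set -e
repo="$(mktemp -d)"                       # scratch folder for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "login works" > login.txt
git add -A
git commit -q -m "Last known-good commit"

# Simulate uncommitted agent work: one edited file, one brand-new file.
echo "login broken" > login.txt
echo "helper" > new_helper.txt

git status --short                        # what was modified, what is new
git diff HEAD                             # every change to tracked files since the last commit

# Throw it all away and return to the last commit.
git restore --source=HEAD --staged --worktree .   # undo edits to tracked files
git clean -fd -q                                  # delete new, untracked files
cat login.txt                             # prints: login works
```

Be careful with git clean on a real project: it permanently deletes untracked files. Running git clean -n first lists what would be removed without touching anything.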
Why GitHub, then, if git already does all that?
Good question, and worth being precise about, because people muddle it constantly. Commits, diffs, reverts, branches: all git, all living on your hard drive. GitHub is a remote home for that git repository, hosted on someone else’s servers. Everything else follows from that one fact: a copy that lives somewhere other than your laptop.
1. A backup that survives a dead hard drive
This is the plain, unglamorous reason, and the most important one. Push your repo to GitHub and every commit and branch you have pushed, the whole history, lives safely on a server. Laptop stolen? Drive fails? Coffee meets keyboard? You walk to another computer, run one git clone, and you are back to exactly where you last pushed. The same mechanism means you can pick up the project from any machine, anywhere.
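You can rehearse the disaster without a GitHub account. In this sketch a bare repository in another folder stands in for GitHub (that substitution is my assumption for the demo). With the real thing, the remote would be your repo’s URL instead of a local path.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for GitHub: a bare repository that only stores history.
git init -q --bare "$work/remote.git"

# The "laptop" copy of the project.
git init -q "$work/laptop"
cd "$work/laptop"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "my project" > README.txt
git add -A
git commit -q -m "First commit"
git remote add origin "$work/remote.git"
git push -q -u origin HEAD                # HEAD = the current branch, whatever it is named

# Disaster: the laptop is gone.
cd "$work"
rm -rf "$work/laptop"

# New machine: one clone brings back everything that was pushed.
git clone -q "$work/remote.git" "$work/new-machine"
cat "$work/new-machine/README.txt"        # prints: my project
git -C "$work/new-machine" log --oneline  # the pushed history is back
```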
2. Private or public — your call
A repo can be private (only you and people you invite can see it) or public (the whole world can read it — how open source works). It is a single toggle, and free either way for personal projects. Keep your half-finished experiments private; flip a polished one public when you want to share it or show it off.
3. Invite people in and collaborate
Add collaborators to a repo and other people — teammates, a friend, a contractor — can pull the code, push their own changes, and review yours. This is where pull requests earn their keep: a PR lays a set of changes out as a reviewable diff with a summary so a human can read it, comment on specific lines, and approve before it merges. That is exactly the gate you want for AI-written code, whether the reviewer is a teammate or future-you.
4. A launch pad for deploying to remote servers
If you are hosting your app somewhere — a web host, a cloud server, a platform like Vercel or Netlify — GitHub is usually the thing they deploy from. You connect the host to your repo, and from then on pushing to your main branch can automatically build and ship the new version. Your agent finishes a feature, you review and merge the pull request, and the live site updates itself. The repo becomes the single source of truth for what is running in production.
And a second set of superpowers for your agent
Because GitHub is the shared, online home of your code, it is also where AI tooling plugs in. Almost none of this is possible with a repo that exists only on your laptop.
The pull request: the review gate where you catch the plausible-but-wrong
This is the big one. AI agents are fast and confident, and confidence is not correctness. The classic failure is code that looks right, passes a quick glance, and quietly mishandles an edge case. The pull request is where you catch it — the agent’s entire change, laid out for inspection, sitting on a branch where it cannot hurt anything until you say so. Better still, Claude Code writes the PR for you — title, summary, and a test checklist — using the gh command line tool. Your job shrinks to the one thing only you can do: review and approve.
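You can preview what a pull request will contain before one exists. A PR on GitHub shows the changes made on the branch since it split from main, which plain git expresses with a three-dot diff. The sketch below uses an invented checkout.txt and a branch named agent-change purely for illustration.

```shell
set -e
repo="$(mktemp -d)"                       # scratch folder for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "total = price * qty" > checkout.txt
git add -A
git commit -q -m "Baseline"
git branch -M main

# The agent's work, on its own branch.
git switch -q -c agent-change
echo "total = price * qty - discount" > checkout.txt
git commit -q -am "Apply discount at checkout"

# Meanwhile main moves on with unrelated work.
git switch -q main
echo "docs" > README.txt
git add -A
git commit -q -m "Add README"

# What the pull request would contain:
git log --oneline main..agent-change      # commits on the branch that main lacks
git diff main...agent-change              # three dots: only the branch's own changes
```

The README commit on main does not show up in that diff, so you review only what the agent did. Once a PR is open, gh pr diff prints its diff in the terminal.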
@claude: put the agent inside GitHub
Install the official Claude GitHub App (the easiest way is to run /install-github-app right inside Claude Code — you need to be a repo admin) and you can summon the agent without opening your terminal at all. Mention @claude in any issue or pull request comment and describe what you want:
@claude this issue describes a bug in the date parser. Please find the cause, fix it, add a test, and open a pull request.
It spins up on GitHub’s own runners, makes a branch, does the work, and opens a PR for you to review. Your bug tracker becomes a task queue the agent can pull from. (It is @claude, not /claude — a surprisingly common reason people think it is broken.)
Automatic code review on every PR
Anthropic also offers a managed Code Review service (currently a research preview on Team and Enterprise plans) that watches your pull requests and posts inline comments on the exact lines it is worried about — flagged by severity, and deliberately non-blocking so it never stops a merge on its own. You can also trigger it on demand by commenting @claude review on a PR. Even without it, the local /code-review command in Claude Code will review your current changes before you ever push. Either way: a second set of eyes on AI-written code, which is exactly where you want one.
GitHub Actions: make the robot prove it works
The best practice for any agent is to give it a way to verify its own work. GitHub Actions runs your tests automatically every time a PR is opened or updated — independently of whatever the agent claimed on your laptop. “The tests pass” becomes a fact checked by a neutral machine, not a sentence the agent typed. If they fail, you (or the agent) see it on the PR before anything merges.
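A workflow is just a small file in your repo. As a rough sketch, the commands below create one that runs on every pull request. The file name tests.yml and the make test command are placeholders I chose, so swap in whatever actually runs your tests. Here it is written into a scratch folder rather than a real project.

```shell
set -e
project="$(mktemp -d)"                    # stand-in for your project folder
cd "$project"
mkdir -p .github/workflows

# Write the workflow file; GitHub picks it up once it is pushed.
cat > .github/workflows/tests.yml <<'EOF'
name: tests
on:
  pull_request:            # runs when a PR is opened or receives new commits
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run the test suite
        run: make test     # placeholder: replace with your project's test command
EOF

cat .github/workflows/tests.yml
```

Commit and push that file and the result shows up as a check on each PR, green or red.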
This is where the whole industry is converging
Here is the tell. Claude Code, GitHub’s own Copilot coding agent, and the others have all settled on the same shape: the agent works on a branch, its output lands as a pull request, CI verifies it, and a human reviews and approves. Copilot will not even let CI run on its changes until a person signs off. When every serious AI coding tool independently lands on git branches and PRs as the control plane, that is not a coincidence — it is the consensus answer to “how do humans stay in control of a fast, tireless, occasionally-wrong machine editing their code.”
Getting started, concretely
- Make a free GitHub account and create a repository for your project (private is fine).
- Install the gh CLI from cli.github.com and run gh auth login. Anthropic specifically recommends this: Claude knows how to use gh for issues, branches, and pull requests, and without it the agent hits API rate limits fast.
- Let the agent wire it up. Open Claude Code in your project and say: “Initialize git here, make the first commit, and push it to my GitHub repo at <url>.”
- Adopt the rhythm: commit before risky work, branch for anything experimental, open a PR, read the diff, then merge. Let the agent run every step — you stay the reviewer.
- (Optional) Run /install-github-app to unlock @claude mentions in issues and PRs.
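If you want to see what “initialize git here, make the first commit, and push it” amounts to, here is a hedged sketch wrapped in a small helper. The name publish_project is hypothetical, not a real command. It assumes you have already run gh auth login, set your git name and email, and created an empty repository on GitHub.

```shell
# publish_project DIR REMOTE_URL
# Hypothetical helper: turns DIR into a git repo, makes the first commit,
# and pushes it to REMOTE_URL. Runs in a subshell so your current folder
# is left unchanged.
publish_project() (
  set -e                                  # stop at the first failing step
  cd "$1"
  git init -q
  git add -A
  git commit -q -m "Initial commit"
  git branch -M main
  git remote add origin "$2"
  git push -q -u origin main
)

# Example call, left commented out (fill in your own folder and the URL GitHub shows you):
# publish_project path/to/your/project <url>
```

After that first push, the rhythm is the same every time: commit, branch, push, open a PR, read the diff, merge.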
An AI agent without version control is a power tool with no off switch. Give it git, give it a home on GitHub, and the same speed that felt reckless becomes something you can actually trust — because every change is visible, every step is reversible, and nothing reaches your main branch until you say so.
— Ben