Squashing
What is "Squash and Merge" and why we use it
GitHub offers three ways to merge pull requests:
Create a merge commit: Keeps every commit from the feature branch and joins it to the main branch with a merge commit. In the log, the feature commits appear interleaved with other commits in chronological order.
Rebase and merge: Takes all the commits and replays them on top of the latest commit on main.
Squash and merge: Rebases, then combines the commits into one big commit before merging.
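To make the difference concrete, here is a rough local approximation of the three options using plain git commands. It is only a sketch: GitHub performs these merges on its servers and may differ in details, and the repository, the `commit_file` helper, and all branch names, file names, and commit messages are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names and messages are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it.
commit_file() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file README.md "hello" "Initial commit"
git checkout -q -b feature
commit_file a.txt "a" "Add a"
commit_file b.txt "b" "Add b"
git checkout -q main
commit_file c.txt "c" "Unrelated work on main"

# 1. Create a merge commit: both histories are kept and joined.
git checkout -q -b try-merge main
git merge -q --no-ff -m "Merge feature" feature

# 2. Rebase and merge: replay a copy of the feature commits on top of main.
git checkout -q -b feature-rebased feature
git rebase -q main
git checkout -q -b try-rebase main
git merge -q --ff-only feature-rebased

# 3. Squash and merge: stage the combined change, then commit it once.
git checkout -q -b try-squash main
git merge -q --squash feature
git commit -q -m "Add a and b"

merge_count=$(git rev-list --count try-merge)
rebase_count=$(git rev-list --count try-rebase)
squash_count=$(git rev-list --count try-squash)
echo "merge=$merge_count rebase=$rebase_count squash=$squash_count"
```

Starting from the same two commits on main, this prints `merge=5 rebase=4 squash=3`: the merge commit option adds the two feature commits plus a merge commit, rebasing adds the two feature commits, and squashing adds exactly one.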
What sets squashing apart is that it creates a new commit, one that doesn't correspond to any commit made by the original author, while discarding the author's original commits from the main branch's history, along with all the important metadata that comes with them. That raises a very reasonable question: why? This document will attempt to explain why the benefits of squashing justify this seemingly odd behavior.
Let's revisit that handy gitlog command from
Notice how the commits are all in a straight line. It's very simple to reason about this history: first, we made a Hello World. Then we moved the greeting message, realized it generated a cache, ignored the cache, and then added a comment followed by a README. If any of these steps introduce a bug or new behavior, we can figure out exactly which one it was pretty easily. This is called a linear history.
Let's see what a non-linear history looks like:
No merge conflict this time, but notice how the second merge was not a simple fast-forward. Why not? A fast-forward only works when the branch you're merging already "knows" everything about your current branch (i.e., it has all of the target branch's commits). In this case, we're trying to merge branch2, which never "found out" about the merging of branch1. Why does this matter?
Our history was linear until we started merging branches like this. Now two branches come off of Added a README, and both merge back into one at Merge branch 2. The history is no longer a straight line, so we call it a non-linear history. Due to the nature of trunk-based development, feature branches off main are going to appear all the time, and it won't take many of them for this tangle to grow.
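The situation described above can be reproduced in a scratch repository. The sketch below is an assumption-laden reconstruction, not the original session: the file names, the `commit_file` helper, and the "Added file 1" message are invented, while "Added a README", "Added file 2", and "Merge branch 2" follow the text. It asks git for a fast-forward only, which fails for branch2, and then falls back to a regular merge.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and the helper are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it.
commit_file() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file README.md "readme" "Added a README"
git branch branch1
git branch branch2

git checkout -q branch1
commit_file file1.txt "one" "Added file 1"
git checkout -q branch2
commit_file file2.txt "two" "Added file 2"

git checkout -q main
# branch1 contains everything on main, so a fast-forward is possible.
git merge -q --ff-only branch1

# branch2 never saw "Added file 1", so a fast-forward is refused.
if git merge -q --ff-only branch2 2>/dev/null; then
  ff_possible=yes
else
  ff_possible=no
fi
echo "fast-forward possible: $ff_possible"

# A regular merge works, at the cost of a merge commit with two parents.
git merge -q -m "Merge branch 2" branch2
git log --graph --oneline
```

The script prints `fast-forward possible: no`, and the graph drawn by `git log --graph` shows the two lines of history splitting after the README commit and rejoining at the merge commit.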
The above merge was a typical merge, which corresponds to the first option provided by GitHub: both branches were interleaved with each other before converging together back into one branch. What if we handled this complexity inside the branch itself? Let's try this again:
Was that better? Yes and no. We still have the non-linear history, but now the complex merge happened in the feature branch, and the real merge into main was a simple fast-forward.
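One way to handle the complexity inside the feature branch is to merge main into it first. The following sketch assumes the same made-up scratch repository as before (the `commit_file` helper, file names, and the merge message are illustrative): once branch2 has absorbed main, the merge into main becomes a fast-forward, although the merge commit still exists in the history.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and the helper are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it.
commit_file() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file README.md "readme" "Added a README"
git branch branch1
git branch branch2
git checkout -q branch1
commit_file file1.txt "one" "Added file 1"
git checkout -q branch2
commit_file file2.txt "two" "Added file 2"

git checkout -q main
git merge -q --ff-only branch1

# Do the "real" merge inside the feature branch.
git checkout -q branch2
git merge -q -m "Merge main into branch2" main

# branch2 now contains everything on main, so this is a fast-forward.
git checkout -q main
git merge -q --ff-only branch2

git log --graph --oneline
```

After the last merge, main and branch2 point at the same commit, and `git log --graph` still shows a fork and a join.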
At its core, the non-linear history arises because branch2 started making its changes before branch1 merged into main. That problem is unavoidable, so we need a way to deal with it. Luckily, we can just pretend it never happened!
With rebasing, we can pretend that Added file 2 was written after branch1 was merged into main, resulting in a linear history. Of course, this is not without its drawbacks:
Git has to rewrite every single commit on branch2 to fake the history. That means this is not a viable option for commits that have already been pushed to a remote branch, because the rewritten commits no longer match what is on the remote.
This doesn't magically avoid merge conflicts. In fact, it makes them worse, because conflicts have to be resolved while rewriting commits, meaning you might end up resolving the same conflict again and again.
While not specific to rebasing, many feature branches will ultimately contain annoying commits like "Working now" or "Fixed typo". These don't belong on the main branch.
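A small sketch of the rebase, under the same invented setup (the `commit_file` helper and file names are placeholders), shows both the benefit and the first drawback: the history ends up linear, but the commit on branch2 gets a new hash because git rewrote it.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and the helper are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it.
commit_file() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file README.md "readme" "Added a README"
git branch branch1
git branch branch2
git checkout -q branch1
commit_file file1.txt "one" "Added file 1"
git checkout -q branch2
commit_file file2.txt "two" "Added file 2"

git checkout -q main
git merge -q --ff-only branch1

# Replay branch2's commits on top of the updated main.
old_tip=$(git rev-parse branch2)
git checkout -q branch2
git rebase -q main
new_tip=$(git rev-parse branch2)

# The rebased branch now contains all of main, so this fast-forwards.
git checkout -q main
git merge -q --ff-only branch2

echo "before rebase: $old_tip"
echo "after rebase:  $new_tip"
git log --graph --oneline
```

The two hashes differ even though the commit message and file contents are the same, which is why rebasing commits that others have already pulled causes trouble. The graph is a single straight line with no merge commits.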
Squashing addresses these problems by waiting until the very end of a branch's life (i.e., after reviews and tests have passed) to rebase and then combine all commits into one big commit before fast-forward merging.
Since rebasing happens immediately before merging, the rebased commits never have to be pushed.
Merge conflicts still have to be resolved, but they can be handled once per conflict.
All commits from the feature branch are now represented by a single, useful commit.
This approach maintains a very useful guarantee of trunk-based development: every commit on main is a known good commit that can be used in production.
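Locally, `git merge --squash` gives a similar end result and can serve as a sketch of the idea. This is an approximation under stated assumptions, not a description of how GitHub implements its button: the `commit_file` helper, the file names, and the final commit message are invented, and the "Fixed typo" and "Working now" commits stand in for the noisy commits mentioned earlier.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and the helper are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it.
commit_file() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file README.md "readme" "Added a README"
git checkout -q -b branch1
commit_file file1.txt "one" "Added file 1"

# A feature branch with the usual noisy commits.
git checkout -q -b branch2 main
commit_file file2.txt "tow" "Added file 2"
commit_file file2.txt "two" "Fixed typo"
commit_file notes.txt "done" "Working now"

git checkout -q main
git merge -q --ff-only branch1

# Stage the combined result of branch2, then record it as one commit.
git merge -q --squash branch2
git commit -q -m "Add file 2 and notes"

git log --graph --oneline
```

The log on main is a straight line of three commits: the README, file 1, and the single squashed commit. The "Fixed typo" and "Working now" commits remain only on branch2 until that branch is deleted.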