Git Merge Strategies
git merge --no-ff --edit
git merge --squash
git rebase --interactive
git rebase --onto
Imagine I have a
master branch with one commit:
75eb1cb - (origin/master) README
This is a single
README.md file with the following content:
- A: 1
Now imagine I have a branch from master called
feat/foo and in that branch I've made 3 additional commits:
* 41d4115 - Add C (also revert A)
* 9e5626c - Modify A
* 8e7965e - Add B
The contents of the
README.md file are now:
- A: 1
- B: 2
- C: 3
Just to quickly clarify, you'll notice throughout this post that I use the command
git lg which is actually an alias I have set within my
~/.gitconfig that uses
git log but modifies its behaviour with some additional git flags:
log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr)%Creset' --abbrev-commit --date=relative
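If you'd like to try the alias without touching your ~/.gitconfig, the sketch below stores it as a repository-local alias inside a throwaway repository. The temporary directory and the "Example" identity are assumptions made purely for illustration.

```shell
set -eu

# Throwaway repository so nothing global is modified.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Same flags as the alias above, stored in this repository's config only.
git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr)%Creset' --abbrev-commit --date=relative"

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"

# The alias is now available as a regular subcommand.
lg_output=$(git --no-pager lg)
printf '%s\n' "$lg_output"
```

To make the alias available everywhere, the same value can go under the [alias] section of ~/.gitconfig instead.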
git merge is the standard workhorse for merging branches in git. It'll try to resolve the differences between the two branches the best way it can.
If the source branch
feat/foo (the branch you want to merge from) can be merged cleanly (i.e. the destination branch
master, which is the branch the changes are being merged into, has gained no commits of its own since feat/foo branched off), then git will be able to perform a simple "fast-forward".
What "fast-forward" means is that git will move the destination branch (and with it
HEAD) to point to the new latest commit, and all the other commits from your source branch will also appear in the git log/history of the destination branch.
HEAD is an alias that points to a commit (typically
HEAD is the latest commit in your branch). Even the branch name itself is an alias that refers to a commit (most things in git do simply resolve to commits). This is why when you have a long branch name, instead of
git push origin really-long-branch-name you can just use
git push origin HEAD and git will figure out which branch you're on.
If you check
git lg after doing a
git merge feat/foo, you should see something like:
* 41d4115 - (HEAD -> master, origin/feat/foo, feat/foo) Add C (also revert A)
* 9e5626c - Modify A
* 8e7965e - Add B
* 75eb1cb - (origin/master) README
We can see all the commits from
feat/foo now appear on
master.
Note: you might expect
git merge <source> <destination> to be a shortcut for checking out a branch and then merging another branch into it, but it isn't: that form merges both named branches into whichever branch is currently checked out. Use
git checkout <destination> followed by
git merge <source>
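The fast-forward described above can be reproduced end to end with a short script. This is only a sketch: the temporary repository, the "Example" identity and the intermediate value used for the "Modify A" commit (A: 2) are my own assumptions, as the exact change isn't shown here.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"

# master: a single commit.
printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"

# feat/foo: three additional commits.
git checkout -q -b feat/foo
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"
printf '%s\n' '- A: 2' '- B: 2' > README.md   # assumed intermediate value
git commit -q -am "Modify A"
printf '%s\n' '- A: 1' '- B: 2' '- C: 3' > README.md
git commit -q -am "Add C (also revert A)"

# master has no commits of its own since the branch point, so this fast-forwards.
git checkout -q master
git merge feat/foo

master_tip=$(git rev-parse master)
feature_tip=$(git rev-parse feat/foo)
commit_count=$(git rev-list --count master)
echo "master=$master_tip feat/foo=$feature_tip commits=$commit_count"
```

After the merge both branch names resolve to the same commit, and no new commit was created.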
git merge --no-ff --edit
Merge commits typically only occur when there has been a divergence between the branches, which means git has to reconcile them for you. If you want a "merge commit" regardless, you can force git to create one even when there is no need for it (as is the case for me here).
Using our previous example, which merged cleanly, let's say that a merge commit is what we wanted to have happen. Assuming you've not pushed the branch to a remote, then you can safely go back to before the merge occurred using:
git reset --hard 75eb1cb
75eb1cb being my first commit in master.
It's important to understand how
git reset works, as it has three main flags and, if not used correctly, could have bad side effects. The flags are:
--soft
--mixed (the default)
--hard
The way reset works is that you use one of the above flags, followed by the commit you want to reset the
HEAD back to. So in our case we used the commit
75eb1cb, which was our very first commit.
If I had used the
--soft flag instead, then it would have reset the
HEAD back to the first commit, but any other commits that happened since would have their changes staged together in our git 'index' waiting to be committed.
If I had used the
--mixed flag instead, then it would have reset the
HEAD back to the first commit, but any other commits that happened since would have their changes applied to the working directory, ready for us to choose which changes to be added to the index (i.e. staged) and then finally committed.
With --hard though, any of the changes that came after the commit being reset to are lost. They're not sitting in your staging index, nor are they available within your working directory either.
So be careful whenever using the --hard flag.
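To see the three flags side by side, here is a small sketch that resets the same two-commit history three times and records what git status reports after each. The throwaway repository and the "Example" identity are assumptions for illustration only.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"

first=$(git rev-parse HEAD~1)
tip=$(git rev-parse HEAD)

# --soft: HEAD moves back, the undone changes stay staged in the index.
git reset -q --soft "$first"
soft_status=$(git status --porcelain)    # "M  README.md" (staged)

git reset -q --hard "$tip"               # put the tip back for the next run

# --mixed: HEAD moves back, the undone changes stay in the working directory only.
git reset -q --mixed "$first"
mixed_status=$(git status --porcelain)   # " M README.md" (not staged)

git reset -q --hard "$tip"

# --hard: HEAD moves back, index and working directory are both overwritten.
git reset -q --hard "$first"
hard_status=$(git status --porcelain)    # empty: nothing left to stage or commit

printf 'soft=[%s] mixed=[%s] hard=[%s]\n' "$soft_status" "$mixed_status" "$hard_status"
```

Note the script can only put the tip back because it saved the commit hash beforehand.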
Force the merge commit
Now we're back to where we were originally (a separate
feat/foo branch and a
master branch with a single commit), we can look at how to force a merge commit.
To force a merge commit you'll need to use the
--no-ff flag and then also use the
--edit flag to allow you to modify the default merge commit message (otherwise git will provide its own commit message which is nearly always not useful or descriptive):
git merge --edit --no-ff feat/foo
Note: --edit has no effect without
--no-ff unless git genuinely needs to create a merge commit, because a fast-forward creates no commit whose message could be edited
Now if I look at my
git lg I can see:
*   97f1257 - (HEAD -> master) My custom merge commit message for 'feat/foo'
|\
| * 41d4115 - (origin/feat/foo, feat/foo) Add C (also revert A)
| * 9e5626c - Modify A
| * 8e7965e - Add B
|/
* 75eb1cb - (origin/master) README
We can see all the commits from
feat/foo were merged into
master successfully, but now you're able to more easily distinguish that the three commits came from another branch (if using my
git lg alias). This is one of the main reasons to force a merge commit using
--no-ff, as it keeps the branching visible in your history.
git log will also show in its output for the merge commit
a field listing the merge commit's parents (here the previous tip of master and the tip of feat/foo), like
Merge: 75eb1cb 41d4115
Which helps (at a glance) to know which branch tips were joined by the merge commit
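The forced merge commit and its parents can be checked with a short script. Treat this as a sketch: it passes the message with -m rather than --edit so that it runs without opening an editor, and the repository, identity and the "Modify A" value (A: 2) are assumptions.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"

git checkout -q -b feat/foo
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"
printf '%s\n' '- A: 2' '- B: 2' > README.md   # assumed intermediate value
git commit -q -am "Modify A"
printf '%s\n' '- A: 1' '- B: 2' '- C: 3' > README.md
git commit -q -am "Add C (also revert A)"

git checkout -q master
old_master=$(git rev-parse master)

# --no-ff forces a merge commit even though a fast-forward would be possible.
git merge -q --no-ff -m "My custom merge commit message for 'feat/foo'" feat/foo

# %P prints the full hashes of the parents of the merge commit.
parents=$(git log -1 --format=%P)
parent_count=$(printf '%s\n' "$parents" | wc -w | tr -d ' ')
echo "parents: $parents"
git --no-pager log --graph --oneline
```

The first parent is the previous tip of master and the second is the tip of feat/foo, which is what the Merge: field in git log abbreviates.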
git branch --contains
The following command can be useful in locating where a commit has come from:
git branch --contains 9e5626c
In our case this will indicate that the commit we specified is part of our
master branch. Now when you use
--contains with a commit such as
9e5626c (which was merged in from our feature branch) you'll see that git recognises this commit is part of multiple branches †.
† until you delete the branch (e.g.
git branch -D feat/foo)
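Here's a minimal sketch of --contains before and after the feature branch is deleted. The repository, identity and the single feature commit are simplified assumptions rather than the exact history used in this post.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"

git checkout -q -b feat/foo
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"
feature_commit=$(git rev-parse HEAD)

git checkout -q master
git merge -q --no-ff -m "Merge feat/foo" feat/foo

# While both branches exist, the commit is reachable from each of them.
before_delete=$(git branch --contains "$feature_commit" --format='%(refname:short)')

# Once the feature branch is gone, only master still contains the commit.
git branch -q -D feat/foo
after_delete=$(git branch --contains "$feature_commit" --format='%(refname:short)')

printf 'before:\n%s\nafter:\n%s\n' "$before_delete" "$after_delete"
```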
Losing useful history
It's also worth mentioning, that even after the
feat/foo branch has been deleted, git will still show (via
git log --graph) those commits from our
feat/foo branch as coming from an alternative path/branch history.
This is a useful bit of information that can be lost when using other tools such as
git rebase or
git merge --squash, so you should discuss with your team what type of information you feel is useful to have when you look back at a project's git history before forging ahead with any one of the strategies I cover here.
For example, some teams don't find being able to see that a set of commits actually came from another branch very useful: considering all commits/features should generally come in from separate branches/Pull Requests. So the use of rebase or squash isn't a concern for them. For a team like this, an aesthetically 'cleaner' git commit history is preferred.
Also, in teams where I've worked and they've utilised a 'squash' strategy (see below for more details), we've used the following structure for our commit message so it's clearer what's been squashed:
Closes #123 - New Feature X

Squashed commit of the following:

commit c7e4145f6e95e51fcff79d6b3476bcb19c058071
commit 3275f1805c4f82298676aa3c61db8c65ee9f3428
commit bb50fb69c2d131d0126fa9ae018377e6451678e2
commit 7ceb49c352d812a91db0e87a8ed4c4cf426c0365
commit 86d1de3c5133a403edf45343081353055c02b454
commit 8f48e5b3c43acf71e8abab4b821cfdc66447b732
commit ed857784feff091ece52d906e311ef7f64a49c3d
commit a277e60c39333a55134c3e3ef6d97076f9bc8370
commit dd7e1973fe91f29887928aad9d991be24efb143a
commit ff7e7dabf745ac4d73b52644c3d29ea05d5c318f
commit 36f1c5bc5949f01117c1d57e6ab12f05c2a202f5
git merge --squash
So what if you don't want all those commits in your
master? You could instead "squash" all the commits down into a single commit using the --squash flag:
git merge --squash feat/foo
What this does is take the changes from the source branch
feat/foo and automatically squash those separate commits into a single change that's placed into the staging area of my destination branch.
This collection of changes now appears as a single change to the file. It isn't actually committed yet, so you have the opportunity to write your own commit message:
git commit -m "your own custom commit message"
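The squash flow can be sketched as a script that checks the result is one ordinary commit (a single parent) containing all of the branch's changes. The repository, identity and the two-commit feature branch are assumptions for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"

git checkout -q -b feat/foo
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"
printf '%s\n' '- A: 1' '- B: 2' '- C: 3' > README.md
git commit -q -am "Add C"

git checkout -q master

# Stage the combined changes of feat/foo without creating a commit.
git merge -q --squash feat/foo
staged=$(git diff --cached --name-only)    # README.md is staged, nothing committed yet

# The commit is ours to write.
git commit -q -m "your own custom commit message"

commit_count=$(git rev-list --count master)
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "staged=$staged commits=$commit_count parents=$parent_count"
```

Because the new commit has only one parent, git keeps no record that it came from feat/foo, which is the trade-off discussed in "Losing useful history".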
Git's rebase feature in essence solves the same problem as
git merge (they both integrate a set of changes), but they do so in fundamentally different ways.
With git merge, a merge commit is utilised to reconcile the two branches and so it is considered non-destructive. What this means is that the commits within either branch (destination or source) aren't modified in any way.
With git rebase, the source branch's commits are placed before the current branch's own commits, and those commits of the current branch are recreated (with new hashes) on top of the source branch, so history is rewritten.
Let's look and see what this does for us:
git rebase feat/foo
We can see that as there were no conflicts, git was able to "fast-forward" the commits. So in theory this is no different right now from originally doing
git merge feat/foo.
But what if
master had a new change committed to it, and this change happened after we had branched off with
feat/foo? For example, I'll add a second commit to
master that changes
- A: 1 to
- A: 9.
If I run
git rebase feat/foo I should see we get a merge conflict and one that git doesn't know how to resolve:
First, rewinding head to replay your work on top of it...
Applying: A to 9
Using index info to reconstruct a base tree...
M README.md
Falling back to patching base and 3-way merge...
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
error: Failed to merge in the changes.
Patch failed at 0001 A to 9
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".
We can see from the information git has given us that it first rewound
master back to the first commit
75eb1cb in order for it to place our
feat/foo commits on top of it (as that initial commit is where our branch originally forked from).
From there we can see once git replayed our
feat/foo commits on top of
75eb1cb that it then tried to apply the new commit that
feat/foo didn't have (e.g.
Applying: A to 9) and it failed to do so.
Git tells us that there was a merge conflict:
CONFLICT (content): Merge conflict in README.md
It's up to us to open
README.md and to resolve the conflict ourselves. When I open the file I see:
<<<<<<< 41d411564c1dc3106f03427d1b5920d05d95e037
- A: 1
- B: 2
- C: 3
||||||| merged common ancestors
- A: 1
=======
- A: 9
>>>>>>> A to 9
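So the above shows the file is split into three sections:
<<<<<<< 41d4115... is followed by the content at the tip of feat/foo, which we're rebasing onto
||||||| merged common ancestors is followed by the content both sides started from
======= is followed by the content from the commit being applied (A to 9)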
I know that I'm happy for the line
- A: 1 (which was changed in my
feat/foo branch commit
41d4115) to be changed to
- A: 9 (which was changed in
master after I originally branched from it). So I manually make that change by deleting all the added noise (e.g.
||||||| merged common ancestors etc) so I'm left with just the content the file should be expected to have now.
I update it to look like:
- A: 9
- B: 2
- C: 3
I now must run the following commands:
git add README.md (as I've made a change to the file at this point in time)
git rebase --continue
We see that git is trying again now to apply the commit (but this time there is no merge conflict info inside of the README) and so we see the output:
Applying: A to 9
Now when looking at the output from
git lg I see:
* 7c001cd - (HEAD -> master) A to 9
* 41d4115 - (origin/feat/foo, feat/foo) Add C (also revert A)
* 9e5626c - Modify A
* 8e7965e - Add B
* 75eb1cb - (origin/master) README
This shows that the changes from
feat/foo were replayed directly on top of
75eb1cb. Otherwise if we didn't use git's rebase feature but a standard
git merge, we could've ended up with a git history that looked like the following:
* 41d4115 - (origin/feat/foo, feat/foo) Add C (also revert A)
* 9e5626c - Modify A
* 8e7965e - Add B
* 65553e0 - (HEAD -> master) A to 9
* 75eb1cb - (origin/master) README
Here the feat/foo commits are on top of the
A to 9 commit and that might not necessarily be what we want to have happen.
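The whole rebase walkthrough (conflict, manual fix, continue) can be scripted as a sketch. A few assumptions: the repository and identity are throwaway, "Modify A" is assumed to set A to 2, merge.conflictstyle is set to diff3 because that is the setting which produces the "merged common ancestors" section, and GIT_EDITOR=true just stops git from opening an editor.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config merge.conflictstyle diff3   # include the common ancestor in conflict markers

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"

git checkout -q -b feat/foo
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"
printf '%s\n' '- A: 2' '- B: 2' > README.md   # assumed intermediate value
git commit -q -am "Modify A"
printf '%s\n' '- A: 1' '- B: 2' '- C: 3' > README.md
git commit -q -am "Add C (also revert A)"

# master gains a commit of its own after feat/foo branched off.
git checkout -q master
printf '%s\n' '- A: 9' > README.md
git commit -q -am "A to 9"

# Replay master's extra commit on top of feat/foo.
if git rebase feat/foo; then
  echo "rebase finished without conflicts"
else
  # Resolve by writing the content we want, stage it, then continue.
  printf '%s\n' '- A: 9' '- B: 2' '- C: 3' > README.md
  git add README.md
  GIT_EDITOR=true git rebase --continue
fi

subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

The feat/foo commits keep their original hashes, while "A to 9" is recreated on top of them with a new one.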
git rebase --interactive
The --interactive flag is useful for letting us rewrite our git history. We're able to change the order of our commits as well as squash commits down and change their recorded messages.
So let's assume we want to squash all but the first commit in our
feat/foo branch. By that I mean we currently have:
* b4f9dfd - (HEAD -> feat/foo) Add C (also revert A)
* 7354a41 - Modify A
* c321b40 - Add B
* 75eb1cb - (origin/master) README
Let's say we want "Add B", "Modify A" and "Add C (also revert A)" squashed into one commit. To do this we need to locate the parent commit of the earliest commit we want to squash.
So "Add B" is the earliest commit we want as part of the squash, so the parent commit is "README". To action the rebase let's run the following command:
git rebase --interactive 75eb1cb
This drops us into an editor with the following output:
pick c321b40 Add B
pick 7354a41 Modify A
pick b4f9dfd Add C (also revert A)

# Rebase 75eb1cb..b4f9dfd onto 75eb1cb (3 command(s))
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out
We can modify it like so:
pick c321b40 Add B
squash 7354a41 Modify A
squash b4f9dfd Add C (also revert A)
This will result in the following combined commit details:
# This is a combination of 3 commits.
# The first commit's message is:
Add B
# This is the 2nd commit message:
Modify A
# This is the 3rd commit message:
Add C (also revert A)
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date: Sun May 15 17:29:32 2016 +0100
#
# interactive rebase in progress; onto 75eb1cb
# Last commands done (3 commands done):
#    squash 7354a41 Modify A
#    squash b4f9dfd Add C (also revert A)
# No commands remaining.
# You are currently editing a commit while rebasing branch 'feat/foo' on '75eb1cb'.
#
# Changes to be committed:
#    modified: README.md
#
Now if we run
git lg -p we'll see the new squashed commit does indeed contain all the previous commits' contents:
* b63857d - (HEAD -> feat/foo) Add B (16 minutes ago)
|
| diff --git a/README.md b/README.md
| index 428f59e..f2e26b6 100644
| --- a/README.md
| +++ b/README.md
| @@ -1 +1,3 @@
|  - A: 1
| +- B: 2
| +- C: 3
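If you want to replay this squash without an editor popping up, the todo list can be edited by a script instead. This is a sketch under assumptions: GIT_SEQUENCE_EDITOR is pointed at a tiny shell function that rewrites every "pick" after the first line to "squash", GIT_EDITOR=true accepts the combined commit message as-is, and the repository, identity and "Modify A" value (A: 2) are made up.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' '- A: 1' > README.md
git add README.md
git commit -q -m "README"
base=$(git rev-parse HEAD)   # parent of the earliest commit we want to squash

git checkout -q -b feat/foo
printf '%s\n' '- A: 1' '- B: 2' > README.md
git commit -q -am "Add B"
printf '%s\n' '- A: 2' '- B: 2' > README.md   # assumed intermediate value
git commit -q -am "Modify A"
printf '%s\n' '- A: 1' '- B: 2' '- C: 3' > README.md
git commit -q -am "Add C (also revert A)"

# Keep the first "pick", turn the remaining ones into "squash".
# git appends the path of the todo file, which the function receives as $1.
GIT_SEQUENCE_EDITOR='f() { sed "2,\$s/^pick/squash/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"; }; f' \
GIT_EDITOR=true \
git rebase -q --interactive "$base"

commit_count=$(git rev-list --count feat/foo)
subject=$(git log -1 --format=%s)
echo "commits=$commit_count subject=$subject"
```

The branch ends up with the README commit plus one squashed commit whose subject comes from the first message ("Add B").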
git rebase --onto
Imagine we've merged our
feat/foo branch at this point into master:
git merge --squash feat/foo
Note: you'll need to fix a conflict first for it to be successful
master should now have three commits:
* 19ec1bb - (HEAD -> master) Merge feat/foo
* 3fc460b - A to 9
* 75eb1cb - (origin/master) README
What's the easiest way to delete the middle/second commit
3fc460b? We could use
git rebase --interactive to delete the commit from history, but there is an alternative that's much easier:
git rebase --onto 75eb1cb 3fc460b
Note: in this scenario you'll get a conflict that you'll need to resolve first (e.g. we're removing a commit that sets A to the value 9 but that change was also pulled into the
feat/foo branch so git isn't sure whether you definitely want that change any more or not), but in most cases you'll likely have a clean rebase
The basic structure of this command is:
git rebase --onto <commit_to_become_new_base> <commit_to_delete>
For more information see the documentation for git rebase.
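Here's a sketch of the clean case, where the commit being deleted touches a file no later commit depends on. The file names (one.txt, two.txt, three.txt), the repository and the identity are hypothetical and chosen so that no conflict occurs.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Three commits, each adding its own file.
for name in one two three; do
  printf '%s\n' "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name"
done

new_base=$(git rev-parse HEAD~2)    # "Add one": the commit to become the new base
to_delete=$(git rev-parse HEAD~1)   # "Add two": the commit to delete

# Replay everything after $to_delete on top of $new_base.
git rebase -q --onto "$new_base" "$to_delete"

subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

Only "Add three" is replayed, so the history becomes "Add one" followed by a recreated "Add three", and two.txt is gone from the working directory.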
At this point you're likely using a service such as GitHub or GitLab for creating projects and opening pull requests, as opposed to Git's own native pull request feature which is substantially less feature rich than these commercial abstraction layers.
But sometimes just accepting a 'patch' from someone and being able to apply it quickly and easily is what you want to do. So that's where
git format-patch comes in.
Imagine you have a centralised
master branch and someone has branched off from its
HEAD to a new branch called
cool-new-features and they would like you to merge their changes directly with the centralised repository's master.
This person would need to execute the following command:
git format-patch master
Note: you can swap the branch
master for any valid commit, alias or range
What this will end up doing is generating a 'patch' file for each new commit that isn't available in master. Below is an example patch file generated from a test repo I was messing around with, and which actually generated two patch files for me (this being the first one):
From 64a903d2ed6b4280d4a0914aaf50f014ae05cdd3 Mon Sep 17 00:00:00 2001
From: Integralist <email@example.com>
Date: Tue, 31 May 2016 08:28:56 +0100
Subject: [PATCH 1/2] G

---
 foo.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/foo.txt b/foo.txt
index b1e6722..6f04b1d 100644
--- a/foo.txt
+++ b/foo.txt
@@ -1,3 +1,4 @@
 A
 B
 C
+G
--
2.7.4
Note: if you want a single patch file you can use the
--stdout flag and redirect the output to a file
git format-patch master --stdout > new-feature.patch
The person who generates the patch file(s) will then need to send them to you (which can be done using git send-email):
git send-email --to firstname.lastname@example.org 0001-A.patch
If it's you sending the patch via git, then you may need to configure git to use your mail server details:
git config --global sendemail.smtpserver smtp.my-isp.com
git config --global sendemail.smtpserverport 465
You, as the recipient of the patch file(s), can then review and apply the patch using:
git checkout review-new-feature
cat new-feature.patch | git am # single patch file
cat *.patch | git am # multiple patch files
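The round trip (generate the patch on one branch, apply it on another) can be tried locally without any email involved. A sketch, assuming a throwaway repository, an "Example" identity and a second commit "H" that I've made up to stand in for the second patch.

```shell
set -eu

repo=$(mktemp -d)
patch_dir=$(mktemp -d)   # keep the patch file outside the repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' A B C > foo.txt
git add foo.txt
git commit -q -m "Initial"

# The contributor's side: two commits on a feature branch.
git checkout -q -b cool-new-features
printf '%s\n' A B C G > foo.txt
git commit -q -am "G"
printf '%s\n' A B C G H > foo.txt   # hypothetical second change
git commit -q -am "H"

# One file containing a patch per commit that master doesn't have.
git format-patch master --stdout > "$patch_dir/new-feature.patch"

# The recipient's side: apply the patches onto a review branch cut from master.
git checkout -q -b review-new-feature master
git am -q < "$patch_dir/new-feature.patch"

subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

git am recreates both commits (with their original author and message) on the review branch.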
Also of interest, if using GitHub for Pull Requests, is that you can add a
.patch extension to the end of a PR path or commit path for it to generate a patch for you! So you can utilise GitHub for some of the nice 'review' features, but then utilise classic/traditional communication and application of patches if you so choose (maybe for an older/internal system).
So if you have a GitHub PR URL like
https://github.com/my-org/my-repo/pull/123, then you can convert this into a patch file using https://github.com/my-org/my-repo/pull/123.patch
Git also offers you the
git apply command to use in place of
git am. The difference is that
git am actually commits the changes in the patch, whereas
git apply will only affect your working directory, so you'll have the opportunity to stage and commit the changes however you like. Unless you use the
--index flags (see
man git-apply for details).
git apply also has a
--reverse flag, which applies a patch in reverse (i.e. it undoes the changes the patch describes)
The other difference is that
git am only accepts patch files in the mailbox format produced by git format-patch, whereas
git apply accepts patch files and also output from
git diff. So you have more options available to you that way. For example:
curl https://gist.githubusercontent.com/anonymous/x/raw/x/test.diff | git apply
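A local sketch of the same idea, using a diff file on disk rather than a download. The repository, identity and the test.diff path are assumptions; the point is that git apply changes the working directory without committing, and --reverse takes the change back out.

```shell
set -eu

repo=$(mktemp -d)
patch_dir=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf '%s\n' A B C > foo.txt
git add foo.txt
git commit -q -m "Initial"

# Capture an uncommitted change as plain git diff output, then discard the change.
printf '%s\n' A B C G > foo.txt
git diff > "$patch_dir/test.diff"
git checkout -q -- foo.txt

# git apply modifies the working directory only: nothing is staged or committed.
git apply "$patch_dir/test.diff"
after_apply=$(git status --porcelain)              # " M foo.txt"
commits_after_apply=$(git rev-list --count HEAD)   # still 1

# --reverse undoes the same patch.
git apply --reverse "$patch_dir/test.diff"
after_reverse=$(git status --porcelain)            # empty again

printf 'apply=[%s] commits=%s reverse=[%s]\n' "$after_apply" "$commits_after_apply" "$after_reverse"
```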
There are so many aspects to merging commits and dealing with git's commit history that it's difficult to cover everything without people having to mentally store too much information that most of the time they won't utilise.
For example, I've not covered anything to do with pulling commits:
git pull --strategy,
git pull --squash,
git pull --rebase,
git pull --ff-only and
git pull --no-commit. Each has its use cases, but I think sometimes you're better off picking a single strategy and defining it as a standard within your development team.
If you're interested in one git workflow approach that utilises git's rebasing feature, and which I've used with success in the past at the BBC, then I recommend you have a read of this blog post I wrote a few years ago: integralist.co.uk/posts/github-workflow
I've also written about other types of git "workflows" as part of BBC News' "Coding Best Practices" working group: github.com/bbc/news-coding-best-practices/git-workflow