Git is a version control system widely used by organizations and millions of developers worldwide to securely and consistently manage their private codebases. In many cases, businesses adopt open-source projects for their infrastructure and library requirements. Given the convenience, it’s not an overstatement to say that we stand on the shoulders of giants.
Often, though, relying on an open-source library as it ships is not enough. The people who maintain those projects don’t always have time to fix existing bugs or add new features right when we want them.
Waiting on the maintainers can cause friction in both the development and the incident management lifecycle. Eventually, we may find a bug in something that’s critical to our business and decide to fix it ourselves: we clone the project and add the fixes or missing features on top of a branch that we control.
This is the point where we could potentially end up introducing more problems – let’s dive into it.
The problem with keeping local and GitHub repositories in sync
The main problem we have when we clone an external open-source repository is that we have to maintain our own commit history when we add extra functionality. The commits we add are part of our separate history, so if the original repository ever merges different commits into the branch we’re using, then we’ll end up with a different lineage. So, there could be conflicts if we wanted to merge any updates or changes from the remote branch into our local branch.
To showcase what we mean, let’s simulate this problem by creating a repo and adding some changes:
1) Create a repo on GitHub using the UI – I named mine sync-example.
2) Clone the repo locally:
$ git clone git@github.com:theodesp/sync-example.git
3) Add a README.md file and add some text:
$ touch README.md
$ echo "example" >> README.md
$ git add .
$ git commit -m "example"
$ git push origin master
4) Once the remote repo is in sync in GitHub, go to the Project page and modify the README.md in place using the UI, and then click the Commit Changes button.
5) Now go back to the README.md locally and add a different modification to the same file, then try to push to master:
$ vi README.md
$ git add .
$ git commit -m "feature"
$ git push origin master
To github.com:theodesp/sync-example.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'git@github.com:theodesp/sync-example.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
Typically, the remote branch would be an external repo which we don’t own, so we wouldn’t be able to push to that repo. The external repo can be updated at any time. In our local repo, we want to have a set of changes (bug fixes or features) that are not yet available in the remote repo. Thus, our local git history would be like this:
$ git log
commit __7801fa64f7d03719e0b39ac977015dfc24a33903__ (HEAD -> master)
Author: Theo
Date: Mon Sep 16 16:52:29 2019 +0100

    feature

commit c2a6a4b4ec0da69a2e3d5c6351def0cccb3730a0 (origin/master)
Author: Theo
Date: Mon Sep 16 16:47:53 2019 +0100
But the remote history would be:
$ git fetch
$ git log origin/master
commit __404aeebe02d46da222b83848a26c9e4b432a7035__ (origin/master)
Author: Theofanis Despoudis
Date: Mon Sep 16 16:48:45 2019 +0100

    Update README.md

commit c2a6a4b4ec0da69a2e3d5c6351def0cccb3730a0
Author: Theo
Date: Mon Sep 16 16:47:53 2019 +0100

    Example
I’ve highlighted the hashes of the two commits that conflict with each other: 7801fa6 exists only in our local history and 404aeeb exists only on the remote. Both build on the same parent commit (c2a6a4b), so the two histories have diverged and Git rejects the push because it would not be a fast-forward. We have to tell Git how to reconcile them.
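Before picking a fix, it can help to measure the divergence. The sketch below is one possible way to do that, not the only one. It assumes a bare repository in a temporary directory as a stand-in for GitHub, and the directory names (remote.git, local, other) and the demo identity are hypothetical.

```shell
#!/bin/sh
# Sketch: count how far local master and origin/master have diverged.
# Assumption: a bare repo in a temp directory stands in for GitHub.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in remote plus our local repo with one shared commit.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "example" > README.md
git add README.md
git commit -q -m "example"
git remote add origin "$work/remote.git"
git push -q -u origin master

# A second clone plays the role of the edit made in the GitHub UI.
git clone -q "$work/remote.git" "$work/other"
cd "$work/other"
echo "- Made some modifications here" >> README.md
git commit -q -am "Update README.md"
git push -q origin master

# Meanwhile, we commit locally without pulling first.
cd "$work/local"
echo "- This is my local changes" >> README.md
git commit -q -am "feature"
git fetch -q origin

# Left count = commits only on master, right count = only on origin/master.
counts=$(git rev-list --left-right --count master...origin/master)
ahead=$(printf '%s' "$counts" | cut -f1)
behind=$(printf '%s' "$counts" | cut -f2)
base=$(git merge-base master origin/master)
echo "ahead=$ahead behind=$behind common ancestor=$base"
```

When both counts are non-zero, the branches have diverged and a plain push will be rejected, which is the situation shown in the logs above.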
Let’s now take a peek at how we can solve those out of sync problems in the best way possible.
The solution(s) to out of sync GitHub and local repos
Let’s see how we can overcome those issues and keep our local changes while staying up to date with the remote. We have a few options, each with its own pros and cons. Let’s start with the less favourable one:
Technique 1: Manually updating the new commits on top of the latest changes:
When the remote repo moves ahead with bug fixes that we need in our local repo, we can save our own commits in a patch file and reapply them on top of a fresh clone. For example:
1) Create a patch with our latest changes:
$ git format-patch -1 HEAD
0001-feature.patch
$ cat 0001-feature.patch
From 7801fa64f7d03719e0b39ac977015dfc24a33903 Mon Sep 17 00:00:00 2001
From: Theo
Date: Mon, 16 Sep 2019 16:52:29 +0100
Subject: [PATCH] feature

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 33a9488..912f505 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,4 @@
 example
+
+- This is my local changes
+- I added new features as well
--
2.20.1 (Apple Git-117)
2) Delete our repo and clone the remote repo again:
$ mv 0001-feature.patch ../
$ cd ..
$ rm -rf sync-example
$ git clone git@github.com:theodesp/sync-example.git
$ cd sync-example
3) Check if we can apply the patch with our changes:
$ git apply --check ../0001-feature.patch
error: patch failed: README.md:1
error: README.md: patch does not apply
4) Uh oh! We cannot apply the patch automatically since there are conflicts. Let’s use the --reject flag to let Git apply only the hunks that have no conflicts and save the rejected ones in a .rej file:
$ git apply --reject ../0001-feature.patch
Checking patch README.md...
error: while searching for:
example
error: patch failed: README.md:1
Applying patch README.md with 1 reject...
Rejected hunk #1.
5) Finally, inspect the README.md.rej file and apply the other changes manually:
$ cat README.md.rej
diff a/README.md b/README.md (rejected hunks)
@@ -1 +1,4 @@
 example
+
+- This is my local changes
+- I added new features as well
$ vi README.md
// … after fixes
$ cat README.md
example
- Made some modifications here
- This is my local changes
- I added new features as well
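The five steps above can be sketched as a single script. This is a minimal sketch under stated assumptions, not a definitive workflow: a bare repository in a temporary directory stands in for GitHub, and the directory names (remote.git, local, other, fresh, patches) and the demo identity are hypothetical.

```shell
#!/bin/sh
# Sketch of Technique 1: export a commit as a patch, re-clone, reapply.
# Assumption: a bare repo in a temp directory stands in for GitHub.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Shared starting point: one commit pushed to the stand-in remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "example" > README.md
git add README.md
git commit -q -m "example"
git remote add origin "$work/remote.git"
git push -q -u origin master

# The remote moves ahead (the GitHub UI edit in the walkthrough).
git clone -q "$work/remote.git" "$work/other"
cd "$work/other"
echo "- Made some modifications here" >> README.md
git commit -q -am "Update README.md"
git push -q origin master

# Our own local commit, made without pulling.
cd "$work/local"
echo "- This is my local changes" >> README.md
git commit -q -am "feature"

# 1) Save our commit as a patch file outside the repository.
patch=$(git format-patch -1 HEAD -o "$work/patches")

# 2) Start over from a fresh clone of the remote.
git clone -q "$work/remote.git" "$work/fresh"
cd "$work/fresh"

# 3) Dry run first; 4) fall back to --reject if it does not apply cleanly.
if git apply --check "$patch" 2>/dev/null; then
  git apply "$patch"
  status=clean
else
  git apply --reject "$patch" >/dev/null 2>&1 || true
  status=rejected
fi
echo "patch status: $status"

# 5) Any *.rej file lists the hunks that still need a manual merge.
ls
```

Running the dry run first means the working tree is only touched by the --reject fallback when the clean path is known to fail.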
Technique 2: Using rebase
As you can imagine, performing all of those steps every time we need to sync with the remote branch would be very cumbersome and prone to errors. Luckily for us, we can do it more quickly and with more guidance using the rebase command. Let’s follow the steps again:
1) Reset our current branch to our previous changes:
$ git reset HEAD~1
Unstaged changes after reset:
M README.md
$ git apply --reject ../0001-feature.patch
Checking patch README.md...
Applied patch README.md cleanly.
$ git add README.md
$ git commit -m "Our local Features"
2) Rebase our local branch onto the remote branch:
$ git rebase origin/master
First, rewinding head to replay your work on top of it...
Applying: Our local Features
Using index info to reconstruct a base tree...
M README.md
Falling back to patching base and 3-way merge...
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
error: Failed to merge in the changes.
Patch failed at 0001 Our local Features
hint: Use 'git am --show-current-patch' to see the failed patch

Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".
3) Good news! We’re now at step 5 of our previous way of handling conflicts. Now, we just fix them and run git rebase --continue:
$ cat README.md
example
<<<<<<< HEAD
- Made some modifications here
=======
- This is my local changes
- I added new features as well
>>>>>>> Our local Features
$ vi README.md
// … after fixes
$ cat README.md
example
- Made some modifications here
- This is my local changes
- I added new features as well
$ git add README.md
$ git rebase --continue
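The rebase steps above can be replayed end to end in a sandbox. The following is a minimal sketch, assuming a bare repository in a temporary directory as a stand-in for GitHub. The directory names and demo identity are hypothetical, and the conflict is resolved by simply keeping both sides, which is a choice made for this demo rather than a general rule.

```shell
#!/bin/sh
# Sketch of Technique 2: rebase local work onto origin/master and
# resolve the conflict by keeping both sides.
# Assumption: a bare repo in a temp directory stands in for GitHub.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Shared starting point: one commit pushed to the stand-in remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "example" > README.md
git add README.md
git commit -q -m "example"
git remote add origin "$work/remote.git"
git push -q -u origin master

# The remote moves ahead.
git clone -q "$work/remote.git" "$work/other"
cd "$work/other"
echo "- Made some modifications here" >> README.md
git commit -q -am "Update README.md"
git push -q origin master

# Our own local commit, made without pulling.
cd "$work/local"
echo "- This is my local changes" >> README.md
git commit -q -am "Our local Features"

# Replay our commit on top of the updated remote branch.
git fetch -q origin
if git rebase origin/master >/dev/null 2>&1; then
  echo "rebased cleanly"
else
  # Conflict: write the merged file by hand (both sides kept here).
  printf '%s\n' "example" \
    "- Made some modifications here" \
    "- This is my local changes" > README.md
  git add README.md
  # GIT_EDITOR=true accepts the existing commit message unchanged.
  GIT_EDITOR=true git rebase --continue >/dev/null
fi

# After the rebase we are strictly ahead, so the push is a fast-forward.
ahead=$(git rev-list --count origin/master..master)
behind=$(git rev-list --count master..origin/master)
echo "ahead=$ahead behind=$behind"
git push -q origin master
```

Once the rebase finishes, our commit sits directly on top of the remote history, so the push that was rejected earlier now goes through.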
In the end, rebasing does the same thing as before, but with a more streamlined workflow and better guidance. We could also have used cherry-picking if we wanted to apply one specific commit to our history:
$ git cherry-pick 404aeebe02d46da222b83848a26c9e4b432a7035
error: could not apply 404aeeb... Update README.md
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
$ cat README.md
example
<<<<<<< HEAD
- This is my local changes
- I added new features as well
=======
- Made some modifications here
>>>>>>> 404aeeb... Update README.md
Rebasing is suitable if we have two or more commits to apply on top of the remote branch and we want to keep the history in that order.
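For contrast, here is a sketch of cherry-picking when we only want one commit from the remote and it does not conflict with our work. It assumes a bare repository in a temporary directory as a stand-in for GitHub. The file fix.txt, the commit message "Fix bug", the directory names, and the demo identity are all hypothetical.

```shell
#!/bin/sh
# Sketch: take a single fix from the remote branch with cherry-pick,
# leaving the other remote commits out of our local history.
# Assumption: a bare repo in a temp directory stands in for GitHub.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Shared starting point: one commit pushed to the stand-in remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "example" > README.md
git add README.md
git commit -q -m "example"
git remote add origin "$work/remote.git"
git push -q -u origin master

# The remote gains two commits: a bug fix and an unrelated README edit.
git clone -q "$work/remote.git" "$work/other"
cd "$work/other"
echo "patched" > fix.txt
git add fix.txt
git commit -q -m "Fix bug"
fix_commit=$(git rev-parse HEAD)
echo "- Made some modifications here" >> README.md
git commit -q -am "Update README.md"
git push -q origin master

# Locally, fetch and apply only the fix. The -x flag appends a
# "(cherry picked from commit ...)" line to the new commit message.
cd "$work/local"
git fetch -q origin
git cherry-pick -x "$fix_commit" >/dev/null
git log --oneline
```

The cherry-picked commit gets a new hash in our history, so the branches still count as diverged afterwards, and a later rebase or merge is still needed to pick up the rest of the remote work.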
The Commit Strip
(Image Source: CommitStrip)
Having to maintain a local Git repository in sync with a public open-source project is not ideal, since it substantially increases the technical debt of our codebase. In some cases where we have to offer extra functionality not provided by the original library, we have a few options that we can implement, as we described in this tutorial.
However, it doesn’t have to be that way – we can always contribute to the community by fixing bugs and adding those features that we implemented in our repository. Ultimately, actively participating in popular open-source projects will not only give us more credibility and respect in the community, but also a chance to share our values with the world.