Git Prune: Check Out this Housekeeping Utility

We commit after every small change to the source code so that all changes are preserved. However, by the time we make the final push to master, there may be several commits that we need to discard or detach. This often leaves unreachable or orphaned Git objects that need to be cleaned up. Git prune helps with that.

As we know, git is very careful about deleting data and does not lose commits easily. The downside is that a lot of stale data is preserved even if it is never going to be used. One way to clean up this data is by using prune.

Git prune
The git prune command is a housekeeping utility that is used to clean up unreachable or orphaned git objects. Unreachable objects are those that cannot be reached from any ref. As an example, say you have made two commits and then moved the branch head back to the earlier one by calling the git reset <commit id> command. Though the git log command will no longer show the later commit, git still stores it internally as a dangling object.

Users usually do not need to call the git prune command directly; they can call git gc, the git garbage collection command, instead. This prunes the data and also performs many other housekeeping tasks.
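As a rough sketch of that route, the snippet below lets git gc do the pruning in a throwaway repository. The temporary directory, the sample text, and the variable names are assumptions made for illustration. The --prune=now option is passed because gc otherwise keeps recently created unreachable objects for a grace period.

```shell
# Throwaway repository so that nothing real is touched (hypothetical location).
repo=$(mktemp -d)
git init --quiet "$repo"

# Write a blob that no ref, index entry, or reflog points at.
orphan=$(echo "never referenced" | git -C "$repo" hash-object -w --stdin)

# gc runs prune as one of its housekeeping steps; --prune=now skips the grace period.
git -C "$repo" gc --quiet --prune=now

# Check whether the orphaned blob is still in the object store.
if git -C "$repo" cat-file -e "$orphan" 2>/dev/null; then
  status="still present"
else
  status="pruned"
fi
echo "orphan blob: $status"
```

Without --prune=now, the same blob would normally survive this run, since it was created only moments ago.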

Usage
The following command:

git prune --dry-run

…will not carry out the prune operation. Instead, it will report which objects would be removed if the operation were performed.

The command:

git prune --verbose

…will report every action taken and the objects associated with it.

The command given below:

git prune --progress

…will show the progress made by git prune.

The following command:

git prune --expire <time>

…will remove only the objects that are older than the stipulated time.
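To see how the expiry cutoff interacts with a dry run, here is a minimal sketch in a throwaway repository. The temporary directory, the sample blob, and the 2.weeks.ago cutoff are illustrative assumptions and not values taken from this article.

```shell
# Throwaway repository with one unreachable blob (hypothetical location and content).
repo=$(mktemp -d)
git init --quiet "$repo"
orphan=$(echo "orphaned text" | git -C "$repo" hash-object -w --stdin)

# Dry run with a cutoff in the past: the blob was written just now, so it is too new to be listed.
recent=$(git -C "$repo" prune --dry-run --expire=2.weeks.ago)

# Dry run with the cutoff set to now: every unreachable object qualifies.
all=$(git -C "$repo" prune --dry-run --expire=now)

echo "older than two weeks: ${recent:-<none>}"
echo "older than now:       $all"
```

Because both calls use --dry-run, the blob is still in the object store afterwards; only the reported list differs.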
We will use a simple example to see how this works. To begin with, clone one of your git repositories, or create your own repository using git init. I will be cloning a repository to explain the usage of the prune command:

% git clone https://github.com/hasura/graphql-engine.git

Cloning into 'graphql-engine'...
remote: Enumerating objects: 66417, done.
remote: Counting objects: 100% (5997/5997), done.
remote: Compressing objects: 100% (2118/2118), done.
remote: Total 66417 (delta 3840), reused 5775 (delta 3718), pack-reused 60420
Receiving objects: 100% (66417/66417), 77.20 MiB | 3.11 MiB/s, done.
Resolving deltas: 100% (44018/44018), done.
Updating files: 100% (5104/5104), done.
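If cloning a large repository is more than you need, a throwaway repository created with git init should work just as well for following along. This is a minimal sketch; the temporary directory and the placeholder identity are assumptions, so substitute your own.

```shell
# Create an empty repository in a temporary directory (hypothetical location).
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Set a repository-local identity so that git commit works (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

# Confirm that we are inside a work tree before continuing.
inside=$(git rev-parse --is-inside-work-tree)
echo "inside work tree: $inside"
```

Commit hashes in such a repository will differ from the ones shown in this article.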

We will now create a sample file containing one line of text. You can create this file in any text editor of your choice, or from the shell as shown here.

% cat >> hello.txt

This is the first line

Add the text file and commit:

% git add hello.txt
% git commit -m "add the file hello"

[master bec274f52] add the file hello
1 file changed, 1 insertion(+)
create mode 100644 hello.txt

Update the hello.txt file by adding a second line to it:

% cat >> hello.txt

This is the second line

% cat hello.txt

This is the first line
This is the second line

Add and commit the same file again:

% git add hello.txt
% git commit -m "added second line in the file hello"

[master 3af6fea8d] added second line in the file hello
1 file changed, 1 insertion(+)

Check the log; it should show the two commits:

% git log

commit 3af6fea8dbb44973ddeed856036d0792d15a4ad7 (HEAD -> master)
Author: xxxxxxx <xxxxxxx@xxxxxxx>
Date: Sun Jun 13 20:29:00 2021 +0530
added second line in the file hello

commit bec274f5264140c9ff1f9b8b6f956c69295a29d4
Author: xxxxxxx <xxxxxxx@xxxxxxx>
Date: Sun Jun 13 20:27:39 2021 +0530
add the file hello

Let's now reset the branch head to the previous commit:

% git reset --hard bec274f5264140c9ff1f9b8b6f956c69295a29d4
HEAD is now at bec274f52 add the file hello

Check the git log again; you will find the last created commit has vanished.

% git log

commit bec274f5264140c9ff1f9b8b6f956c69295a29d4 (HEAD -> master)
Author: xxxxxxx <xxxxxxx@xxxxxxx>
Date: Sun Jun 13 20:27:39 2021 +0530
add the file hello

Next, we will try to find the dangling commit.

% git fsck --lost-found

Checking object directories: 100% (256/256), done.
Checking objects: 100% (66417/66417), done.
dangling commit 3af6fea8dbb44973ddeed856036d0792d15a4ad7

Inspect the dangling commit and confirm that it is the same commit we just reset away:

% git show 3af6fea8dbb44973ddeed856036d0792d15a4ad7

commit 3af6fea8dbb44973ddeed856036d0792d15a4ad7
Author: xxxxxxx <xxxxxxx@xxxxxxx>
Date: Sun Jun 13 20:29:00 2021 +0530
added second line in the file hello
diff --git a/hello.txt b/hello.txt
index d3e2104b9..81bc58de5 100644
--- a/hello.txt
+++ b/hello.txt
@@ -1 +1,2 @@
This is the first line
+This is the second line

If you run prune now, it may have no effect, because git still holds a reference to the commit in the reflog, so the commit is not fully detached. Expire the reflog entries first, using now as the cutoff. For everyday use, it is also advisable to run git gc rather than git prune.

% git reflog expire --expire=now --expire-unreachable=now --all
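The effect of the reflog can be checked in isolation. The sketch below repeats the reset in a throwaway repository and records whether the second commit survives git prune before and after the reflog is expired. The temporary directory, the placeholder identity, and the variable names are assumptions for illustration.

```shell
# Throwaway repository with two commits (hypothetical location and identity).
repo=$(mktemp -d)
git init --quiet "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

echo "This is the first line" > "$repo/hello.txt"
git -C "$repo" add hello.txt
git -C "$repo" commit --quiet -m "add the file hello"
echo "This is the second line" >> "$repo/hello.txt"
git -C "$repo" commit --quiet -am "added second line in the file hello"

# Remember the second commit, then move the branch back to the first one.
second=$(git -C "$repo" rev-parse HEAD)
git -C "$repo" reset --quiet --hard HEAD~1

# First attempt: the reflog still points at the second commit.
git -C "$repo" prune --expire=now
if git -C "$repo" cat-file -e "$second" 2>/dev/null; then before="kept"; else before="pruned"; fi

# Second attempt: expire the reflog, then prune again.
git -C "$repo" reflog expire --expire=now --expire-unreachable=now --all
git -C "$repo" prune --expire=now
if git -C "$repo" cat-file -e "$second" 2>/dev/null; then after="kept"; else after="pruned"; fi

echo "with reflog entry: $before, after expiring the reflog: $after"
```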

Remember to do a dry run of the prune command before running it for real. This lets you check which objects would be removed:

% git prune --dry-run --verbose --progress --expire=now

142cc07007d0107e8ff91b28e2a7b078a6f56d1c tree
3af6fea8dbb44973ddeed856036d0792d15a4ad7 commit
81bc58de5525b0c16f8cb74b89799017b6986981 blob

If the output matches your expectations, you can run the actual prune command:

% git prune --verbose --progress --expire=now

142cc07007d0107e8ff91b28e2a7b078a6f56d1c tree
3af6fea8dbb44973ddeed856036d0792d15a4ad7 commit
81bc58de5525b0c16f8cb74b89799017b6986981 blob

Finally, use fsck to check for the dangling commits. You will find that all of them have been cleared:

% git fsck --lost-found

Checking object directories: 100% (256/256), done.
Checking objects: 100% (66417/66417), done.

Alternate choices
Another way to clean up is to use the prune option with the git fetch and git remote commands. Here, the stale data is not unreachable objects but remote-tracking branches whose counterparts have been deleted on the remote and which have been left unused. It is a good practice to prune branches that are no longer in use. The simplest way to do this is with the command git fetch --prune. This fetches all the remote branch refs and deletes the remote-tracking refs for branches that no longer exist in the remote repository. It's advisable to do a dry run before execution, such as:

% git fetch --prune --dry-run

Thangaselvi Arichandrapandian

Next, let's look at git remote prune <name>. Just like git fetch --prune, this command removes the refs to branches that no longer exist on the remote. We can use git remote prune when we want to prune without fetching new data from the remote. Git maintains both local branches and remote-tracking refs (such as origin/<branch>). This command prunes only the stale remote-tracking refs and leaves the local branches untouched. To delete only locally unreachable commits, we can use plain git prune, as seen in the example above.
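The difference between a dry run and an actual prune of remote-tracking refs can be tried out without touching a real server. In this sketch, a bare repository in a temporary directory stands in for the remote; the paths, the branch names main and feature, and the placeholder identity are all assumptions.

```shell
# A bare repository stands in for the remote; a clone of it is the working copy.
base=$(mktemp -d)
git init --quiet --bare "$base/remote.git"
git clone --quiet "$base/remote.git" "$base/work" 2>/dev/null
git -C "$base/work" config user.name "Example User"
git -C "$base/work" config user.email "user@example.com"

# Publish one commit as two branches, then refresh the remote-tracking refs.
echo "hello" > "$base/work/hello.txt"
git -C "$base/work" add hello.txt
git -C "$base/work" commit --quiet -m "initial commit"
git -C "$base/work" push --quiet origin HEAD:refs/heads/main HEAD:refs/heads/feature
git -C "$base/work" fetch --quiet origin

# Delete the feature branch on the stand-in remote, which leaves origin/feature stale.
git -C "$base/remote.git" update-ref -d refs/heads/feature

# Dry run: report what would be pruned without changing anything.
git -C "$base/work" remote prune --dry-run origin
if git -C "$base/work" show-ref --verify --quiet refs/remotes/origin/feature; then before="present"; else before="gone"; fi

# Actual prune, done as part of a fetch.
git -C "$base/work" fetch --quiet --prune origin
if git -C "$base/work" show-ref --verify --quiet refs/remotes/origin/feature; then after="present"; else after="gone"; fi

echo "origin/feature after the dry run: $before, after fetch --prune: $after"
```

Running git remote prune origin without --dry-run in place of the final fetch should remove the same stale ref without downloading anything new.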

Now that you have seen the usage and the different options of git prune, try using it to keep your git repositories clean. Happy housekeeping!
