# git-sizer

**Repository Path**: vcs-all-in-one/git-sizer

## Basic Information

- **Project Name**: git-sizer
- **Description**: Compute various size metrics for a Git repository, flagging those that might cause problems
- **Primary Language**: Go
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 5
- **Forks**: 0
- **Created**: 2020-06-15
- **Last Updated**: 2025-05-10

## Categories & Tags

**Categories**: vcs

**Tags**: None

## README

_Happy Git repositories are all alike; every unhappy Git repository is unhappy in its own way._ —Linus Tolstoy

# git-sizer

Is your Git repository bursting at the seams?

`git-sizer` computes various size metrics for a local Git repository, flagging those that might cause you problems or inconvenience. For example:

*   Is the repository too big overall? Ideally, Git repositories should be under 1 GiB, and (without special handling) they start to get unwieldy over 5 GiB. Big repositories take a long time to clone and repack, and take a lot of disk space. Suggestions:

    * Avoid storing generated files (e.g., compiler output, JAR files) in Git. It would be better to regenerate them when necessary, or store them in a package registry or even a fileserver.
    * Avoid storing large media assets in Git. You might want to look into [Git-LFS](https://git-lfs.github.com/) or [git-annex](http://git-annex.branchable.com/), which allow you to version your media assets in Git while actually storing them outside of your repository.
    * Avoid storing file archives (e.g., ZIP files, tarballs) in Git, especially if compressed. Different versions of such files don't delta well against each other, so Git can't store them efficiently. It would be better to store the individual files in your repository, or store the archive elsewhere.

*   Does the repository have too many references (branches and/or tags)? They all have to be transferred to the client for every fetch, even if your clone is up-to-date.
    Try to limit them to a few tens of thousands at most. Suggestions:

    * Delete unneeded tags and branches.
    * Avoid pushing your "remote-tracking" branches to a shared repository.
    * Consider using ["git notes"](https://git-scm.com/docs/git-notes) rather than tags to attach auxiliary information to commits (for example, CI build results).
    * Perhaps store some of your rarely-needed tags and branches in a separate fork of your repository that is not fetched from by normal developers.

*   Does the repository include too many objects? The more objects, the longer it takes for Git to traverse the repository's history, for example when garbage-collecting. Suggestions:

    * Think about whether you are storing very many tiny files that could easily be collected into a few bigger files.
    * Consider breaking your project up into multiple subprojects.

*   Does the repository include gigantic blobs (files)? Git works best with small- to medium-sized files. It's OK to have a few files in the megabyte range, but they should generally be the exception. Suggestions:

    * Consider using [Git-LFS](https://git-lfs.github.com/) for storing your large files, especially those (e.g., media assets) that don't diff and merge usefully.
    * See also the section "Is the repository too big overall?"

*   Does the repository include many, many versions of large text files, each one slightly changed from the one before? Such files delta very well, so they might not cause your repository to grow alarmingly. But it is expensive for Git to reconstruct the full files and to diff them, which it needs to do internally for many operations. Suggestions:

    * Avoid storing log files and database dumps in Git.
    * Avoid storing giant data files (e.g., enormous XML files) in Git, especially if they are modified frequently. Consider using a database instead.

*   Does the repository include gigantic trees (directories)?
    Every time a file is modified, Git has to create a new copy of every tree (i.e., every directory in the path) leading to the file. Huge trees make this expensive. Moreover, it is very expensive to traverse through history that contains huge trees, for example for `git blame`. Suggestions:

    * Avoid creating directories with more than a couple of thousand entries each.
    * If you must store very many files, it is better to shard them into a hierarchy of multiple, smaller directories.

*   Does the repository have the same (or very similar) files repeated over and over again at different paths in a single commit? If so, the repository might have a reasonable overall size, but when you check it out it balloons into an enormous working copy. (Taken to an extreme, this is called a "git bomb"; see below.) Suggestions:

    * Perhaps you can achieve your goals more effectively by using tags and branches or a build-time configuration system.

*   Does the repository include absurdly long path names? That's probably not going to work well with other tools. One or two hundred characters should be enough, even if you're writing Java.

*   Are there other bizarre and questionable things in the repository?

    * Annotated tags pointing at one another in long chains?
    * Octopus merges with dozens of parents?
    * Commits with gigantic log messages?

`git-sizer` computes many size-related statistics about your repository that can help reveal all of the problems described above. These practices are not wrong per se, but the more that you stretch Git beyond its sweet spot, the less you will be able to enjoy Git's legendary speed and performance. Especially if your Git repository statistics seem out of proportion to your project size, you might be able to make your life easier by adjusting how you use Git.

## Getting started

1.  Make sure that you have the [Git command-line client](https://git-scm.com/) installed, **version >= 2.6**.
    NOTE: `git-sizer` invokes `git` commands to examine the contents of your repository, so **it is required that the `git` command be in your `PATH`** when you run `git-sizer`.

2.  Install `git-sizer`. Either:

    a. Install a released version of `git-sizer` (recommended):

       1. Go to [the releases page](https://github.com/github/git-sizer/releases) and download the ZIP file corresponding to your platform.
       2. Unzip the file.
       3. Move the executable file (`git-sizer` or `git-sizer.exe`) into your `PATH`.

    b. Build and install from source. See the instructions in [`docs/BUILDING.md`](docs/BUILDING.md).

3.  Change to the directory containing a full, non-shallow clone of the Git repository that you'd like to analyze. Then run git-sizer [