How can I find and remove an unknown giant object from my git repository?

Somewhere along the line, a very large file was added to my project's git repository. When I clone the project on a new machine, the clone appears to get stuck at 37% for a long time. This project should clone in just a couple of minutes.

How can I find out which object is causing the long clone time?

I know how to "git rm" files. Will that delete the file even if it is an old object that now exists only in the history?

I don't quite understand whether a file is completely removed from the repository as soon as I "git rm" it, or whether it is only removed from that point forward.

Any help is deeply appreciated!

+3
3 answers

The Pro Git book covers this. In short (see the book for details):

  • Run garbage collection so that the objects are packed:

    $ git gc
    
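  • Find the largest objects in the pack. git verify-pack -v lists every object with its size in the third column, so sort on that column and look at the last few lines: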

    # In the following command, replace the pack*.idx filename
    # with whatever filename you find in the .git/objects/pack
    # directory:
    $ git verify-pack -v .git/objects/pack/pack-3f8c0...bb.idx | sort -k 3 -n | tail -3
    e3f094f522629ae358806b17daf78246c27c007b blob   1486 734 4667
    05408d195263d853f09dca71d55116663690c27c blob   12908 3478 1189
    7a9eb2fba2b1811321254ac360970fc169ba2330 blob   2056716 2056872 5401
    
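  • The last line is the big one. To find out which file that blob belongs to, search the object list for its hash: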

    $ git rev-list --objects --all | grep 7a9eb2fb
    7a9eb2fba2b1811321254ac360970fc169ba2330 git.tbz2
    
  • See which commits touched that file:

    $ git log --pretty=oneline -- git.tbz2
    da3f30d019005479c99eb4c3406225613985a1db oops - removed large tarball
    6df764092f3e7c8f5f94cbe08ee5cf42e92a0289 added git tarball
    
  • Rewrite the history with git filter-branch, removing the file from every commit starting with the one that added it:

    $ git filter-branch --index-filter \
       'git rm --cached --ignore-unmatch git.tbz2' -- 6df7640^..
    Rewrite 6df764092f3e7c8f5f94cbe08ee5cf42e92a0289 (1/2)rm 'git.tbz2'
    Rewrite da3f30d019005479c99eb4c3406225613985a1db (2/2)
    Ref 'refs/heads/master' was rewritten
    
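  • The blob is still referenced by the reflog and by the backup refs that filter-branch leaves under .git/refs/original, so gc will not drop it yet. Remove those references, then repack: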

    $ rm -Rf .git/refs/original
    $ rm -Rf .git/logs/
    $ git gc
    Counting objects: 19, done.
    Delta compression using 2 threads.
    Compressing objects: 100% (14/14), done.
    Writing objects: 100% (19/19), done.
    Total 19 (delta 3), reused 16 (delta 1)
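The verify-pack and rev-list steps above can also be combined into one pipeline that prints the largest blobs together with their paths. This is only a sketch, not something from the Pro Git walkthrough: largest_blobs is a hypothetical helper name, and the sizes it reports are uncompressed object sizes rather than the in-pack sizes that verify-pack shows.

```shell
#!/bin/sh
# largest_blobs [N]
# Hypothetical helper: print the N largest blobs reachable from any ref,
# biggest last, as "<sha> <size-in-bytes> <path>". Run it inside a repository.
largest_blobs() {
    count="${1:-10}"
    # rev-list prints "<sha> <path>" for every object in history;
    # cat-file --batch-check adds the type and size, and %(rest) carries
    # the path through unchanged.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |   # keep blobs only, drop the type column
        sort -k 2,2n |           # sort numerically by size
        tail -n "$count"
}
```

For example, `largest_blobs 3` plays the same role as the `sort -k 3 -n | tail -3` step, but it also finds blobs that are only referenced by old commits and names the file directly.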
    
+5

Dump the file list of a commit, with sizes, to a text file:

git ls-tree <first-commit-hash> -r --long > 1.txt

Then grep that file for large sizes, for example anything over 10 MB.
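Instead of grepping the dump by eye, the size column can be filtered directly. A minimal sketch, assuming the usual `--long` layout of mode, type, object, size, then path; big_files_in_commit is a hypothetical helper name and the 10 MiB default is just an illustration.

```shell
#!/bin/sh
# big_files_in_commit <commit> [min_bytes]
# Hypothetical helper: list the files in one commit whose size is at
# least min_bytes (default 10 MiB). Run it inside a repository.
big_files_in_commit() {
    commit="$1"
    min_bytes="${2:-10485760}"
    # With --long the fourth column is the blob size in bytes;
    # submodule entries show "-" there and are skipped by the type check.
    git ls-tree -r --long "$commit" |
        awk -v min="$min_bytes" '$2 == "blob" && $4 + 0 >= min + 0'
}
```

Note that this only inspects the single commit you pass in, so a file that was added and later deleted will show up only if you point it at a commit that still contains the file.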

+2
find / -size +10M -ls

This lists every file larger than 10 MB on the filesystem (the threshold is not exactly 10 MB, since find counts M in units of 1,048,576 bytes).

Here's a great explanation that should help you a bit.

https://askubuntu.com/a/36114
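Since `find /` walks the whole filesystem, it may be more useful to limit the search to the project directory and skip Git's own object store. A rough sketch, with big_worktree_files as a hypothetical helper name; it uses the portable `c` (bytes) suffix rather than `M`. Keep in mind that this only sees files currently on disk, not objects that exist only in history.

```shell
#!/bin/sh
# big_worktree_files [dir] [min_bytes]
# Hypothetical helper: list regular files under dir (default: current
# directory) that are larger than min_bytes (default 10 MiB),
# without descending into any .git directory.
big_worktree_files() {
    dir="${1:-.}"
    min_bytes="${2:-10485760}"
    find "$dir" -name .git -prune -o -type f -size +"${min_bytes}"c -print
}
```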

GitHub has a good write-up on removing a specific file from every commit in a repository's history.

+1