
Using git with large files

I’m using git to keep snapshots of large files. These are the outputs of long tool runs that would take a long time to reconstruct. Using git gives me true version control on the input design files, plus an archive of the large, time-consuming outputs.

The idea here, however, is to save time, not disk space. I’ve found that when I check in, git runs for a long time, likely trying to compress the data. The large files are unlikely to compress much (the input-to-output process is a bit chaotic), so I want git to spend less effort searching for deltas.

The two options that control this are:


git config --add pack.window 3   # default is 10

and

git config --add pack.depth 3   # default is 50
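
Both commands simply write to the repository’s .git/config; after running them, you should end up with a section like this (a sketch of the expected result):

[pack]
	window = 3
	depth = 3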

The first tells git to consider at most 3 “adjacent” objects (close in type, size, and name) as candidates for a delta base. The second tells it not to look more than 3 previous versions back for a delta. If you do a lot of branching and merging, the depth setting is the more valuable one. In my case, however, development is pretty linear, so the best delta for any object is the version I checked in previously, not one 50 revisions back.
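
As an aside: if the outputs truly never delta well, git can also skip delta compression entirely for specific paths via the delta attribute in .gitattributes. A minimal sketch, where *.out is a hypothetical pattern standing in for the tool outputs:

# .gitattributes: don't attempt delta compression on tool outputs
*.out -delta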
