Hi @dsnet. Thanks for all the Go 1.10 archive/tar work. It's really an amazing amount of cleanup, and it's very well done.
The one change I'm uncomfortable with from an API point of view is the sparse hole support.
First, I worry that it's too complex to use. I get lost trying to read Example_sparseAutomatic - 99% of it seems to have nothing to do with sparse files - and I have a hard time believing that we expect clients to write all this code. Despite the name, nothing about the example strikes me as “automatic.”
Second, I worry that much of the functionality here does not belong in archive/tar. Tar files are not the only time that a client might care about where the holes are in a file or about creating a new file with holes, and yet somehow this functionality is expressed in terms of tar.Header and a new tar.SparseHole structure instead of tar-independent operations. Tar should especially not be importing and using such subtle bits of syscall as it is in sparse_windows.go.
It's too late to redesign this for Go 1.10, so I suggest we pull out this new API and revisit for Go 1.11.
For Go 1.11, I would suggest to investigate (1) what an appropriate API in package os would be, and (2) how to make archive/tar take advantage of that more automatically.
For example, perhaps it would make sense for package os to add
// Regions returns the boundaries of data and hole regions in the file.
// The result slice can be read as pairs of offsets indicating the location
// of initialized data in the file or, ignoring the first and last element,
// as pairs of offsets indicating the location of a hole in the file.
// The first element of the result is always 0, and the last element is
// always the size of the file.
// For example, if f is a 4-kilobyte file with data written only to the
// first and last kilobyte (and therefore a 2-kilobyte hole in the middle),
// Regions would return [0, 1024, 3072, 4096].
//
// On operating systems that do not support files with holes or do
// not support querying the location of holes in files,
// Regions returns [0, size].
//
// Regions may temporarily change the file offset, so it should not
// be executed in parallel with Read or Write operations.
func (f *File) Regions() ([]int64, error)
That would avoid archive/tar's current DetectSparseHoles and SparseEntry; tar.Header would only need to add a new field Regions []int64. (Regions is not a great name; better names are welcome.) Note that using a simple slice of offsets avoids the need for a special invertSparseEntries function entirely: you just change whether you read pairs starting at offset 0 or 1.
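To make that concrete, here is a small sketch (Regions itself is still hypothetical; the slice below is the 4-kilobyte example from the doc comment):

// dataRegions reads pairs starting at offset 0; holeRegions reads pairs
// starting at offset 1. No inversion helper is needed.
func dataRegions(rs []int64) [][2]int64 {
	var out [][2]int64
	for i := 0; i+1 < len(rs); i += 2 {
		out = append(out, [2]int64{rs[i], rs[i+1]})
	}
	return out
}

func holeRegions(rs []int64) [][2]int64 {
	var out [][2]int64
	for i := 1; i+1 < len(rs); i += 2 {
		out = append(out, [2]int64{rs[i], rs[i+1]})
	}
	return out
}

// dataRegions([]int64{0, 1024, 3072, 4096}) == [[0 1024] [3072 4096]]
// holeRegions([]int64{0, 1024, 3072, 4096}) == [[1024 3072]]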
As for "punching holes", it suffices on Unix (as you know) to simply truncate the file (which Create does anyway) and then not write to the holes. On Windows it appears to be necessary to set the file type to sparse, but I don't see why the rest of sparsePunchWindows is needed. It seems crazy to me that it could possibly be necessary to pre-declare every hole location in a fresh file. The FSCTL_SET_ZERO_DATA looks like it is for making a hole in an existing file, not a new file. It seems like it should suffice to truncate the target file, mark it as sparse, set the file size, and then write the data. What's left should be automatically inferred as holes. If we were to add a new method SetSparse(bool) to os.File, then I would expect it to work on all systems to do something like:
f = Create(file)
f.SetSparse(true) // no-op on non-Windows systems, FSCTL_SET_SPARSE (only) on Windows
for each data chunk {
    f.WriteAt(data, offset)
}
f.Truncate(targetSize) // in case of final hole, or write last byte of file
Finally, it seems like handling this should not be the responsibility of every client of archive/tar. It seems like it would be better for this to just work automatically.
On the tar.Reader side, WriteTo already takes care of not writing to holes. It could also call SetSparse and use Truncate if present as an alternative to writing the last byte of the file.
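A minimal sketch of that sniffing, assuming the hypothetical SetSparse method (the interface and function names below are illustrative, not existing API):

import "io"

type sparseSetter interface{ SetSparse(sparse bool) error }
type truncater interface{ Truncate(size int64) error }

// prepareSparse would run in WriteTo before copying file data.
func prepareSparse(w io.Writer, size int64) {
	if ss, ok := w.(sparseSetter); ok {
		ss.SetSparse(true) // no-op on systems that don't need it
	}
	if t, ok := w.(truncater); ok {
		t.Truncate(size) // sets the final size, covering any trailing hole
	}
}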
On the tar.Writer side, I think ReadFrom could also take care of this. It would require making WriteHeader compute the header to be written to the file but delay the actual writing until the Write or ReadFrom call. (And that in turn might make Flush worth keeping around not-deprecated.) Then when ReadFrom is called to read from a file with holes, it could find the holes and add that information to the header before writing out the header. Both of those combined would make this actually automatic.
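For illustration, a rough sketch of the delayed-header idea (hypothetical; tar.Writer does not behave this way today): WriteHeader only records the header, and the first Write encodes it, after ReadFrom has had a chance to add any detected holes:

import "archive/tar"

type lazyWriter struct {
	tw      *tar.Writer // does the real encoding
	pending *tar.Header // recorded by WriteHeader, not yet written
}

func (lw *lazyWriter) WriteHeader(hdr *tar.Header) error {
	lw.pending = hdr // defer encoding until data arrives
	return nil
}

func (lw *lazyWriter) Write(p []byte) (int, error) {
	if lw.pending != nil {
		hdr := lw.pending
		lw.pending = nil
		// By now, hole information could have been filled in on hdr.
		if err := lw.tw.WriteHeader(hdr); err != nil {
			return 0, err
		}
	}
	return lw.tw.Write(p)
}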
At the very least, it seems clear that the current API steps beyond what tar should be responsible for. I can easily see developers who need to deal with sparse files but have no need for tar files constructing fake tar headers just to use DetectSparseHoles and PunchSparseHoles. That's a strong signal that this functionality does not belong in tar as the primary implementation. (A weaker but still important signal is that to date the tar.Header fields and methods have not mentioned os.File explicitly, and it should probably stay that way.)
Let's remove this from Go 1.10 and revisit in Go 1.11. Concretely, let's remove tar.SparseEntry, tar.Header.SparseHoles, tar.Header.DetectSparseHoles, tar.Header.PunchSparseHoles, and the deprecation notice for tar.Writer.Flush.
Thanks again. Russ
Comment From: dsnet
(2) how to make archive/tar take advantage of that more automatically.
Automatic creation of sparse tar archives should actually be a non-goal. That is, we should not generate sparse archives with zero changes to user code. Instead, creation of sparse archives should require only one simple change to user code when sparse archives are wanted. In other words, make sparse support an easy opt-in, but not automatic.
The two most common implementations, GNU tar and BSD tar, both understand sparse headers. The problem lies with the long tail of other tar implementations that do not. For example, dpkg does not understand sparse files, and it would be terrible if we started creating sparse archives against user expectations.
On the tar.Writer side, I think ReadFrom could also take care of this. It would require making WriteHeader compute the header to be written to the file but delay the actual writing until the Write or ReadFrom call.
For the reason I stated earlier, I'm opposed to ReadFrom being fully automatic for creating sparse archives.
Also, I feel uncomfortable with WriteHeader being a lazy write. First, it's contrary to the naming of the method. Second, it feels weird to me that the implementation of Writer would make assumptions about OS-specific functionality. Third, I have heard anecdotally of users relying on WriteHeader being non-lazy because they care about which parts of an archive are "headers" and which parts are "data".
On the tar.Reader side, WriteTo already takes care of not writing to holes. It could also call SetSparse and use Truncate if present as an alternative to writing the last byte of the file.
An automatic attempt at creating sparse files on the filesystem sounds fine, since an OS or FS that doesn't support them still writes a valid file (except for NaCl; see #21728), but I do have hesitation about calling OS-specific methods in WriteTo.
(1) what an appropriate API in package os would be
I support having all of the OS-specific sparse logic in package os. I share your concern about users abusing Header.DetectSparseHoles and Header.PunchSparseHoles to get at the OS functionality without caring about tar archives.
you just change whether you read pairs starting at offset 0 or 1.
I like how easy it is to convert between the two semantics, but I would feel more comfortable if there were some bit of type safety. We can discuss further when discussing the change to os.
Note that using a simple slice of offsets avoids the need for a special invertSparseEntries function entirely
invertSparseEntries is an implementation detail. It has the useful property that it normalizes the offsets, so we would still need something like it: if there are two holes adjacent to each other, invertSparseEntries combines them.
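The same normalization stays simple in the flat representation; a hypothetical sketch (normalize is not a real function anywhere): dropping a pair of equal interior boundaries removes a zero-length region, which merges the two regions around it, exactly the adjacent-holes case:

func normalize(rs []int64) []int64 {
	out := []int64{rs[0]}
	for i := 1; i < len(rs); {
		if i+2 < len(rs) && rs[i] == rs[i+1] {
			i += 2 // interior zero-length region: drop both boundaries
			continue
		}
		out = append(out, rs[i])
		i++
	}
	return out
}

// normalize([]int64{0, 50, 100, 100, 200, 300}) == [0 50 200 300]:
// the holes (50,100) and (100,200), separated by a zero-length data
// region at 100, merge into the single hole (50,200). The first and
// last elements are never dropped, preserving the 0-and-size invariant.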
Comment From: gopherbot
Change https://golang.org/cl/78030 mentions this issue: archive/tar: partially revert sparse file support
Comment From: rsc
Thanks for the CL.
I see your point about the Writer not doing it automatically. I'm OK with that.
Reader.WriteTo doesn't seem like it really needs any OS-specific stuff at all. Assuming a new SetSparse method, it just has to sniff for SetSparse+Truncate+Seek. And actually Truncate is basically optional; it's just SetSparse and Seek.
Comment From: dsnet
I still think of SetSparse as OS-specific functionality, while Seek (even though it originates from fseek) is a common enough paradigm from the io package that it's not all that OS-specific.
Are you going to propose changes to os.File, or should I write that up?
Comment From: rasky
Reader.WriteTo doesn't seem like it really needs any OS-specific stuff at all. Assuming a new SetSparse method, it just has to sniff for SetSparse+Truncate+Seek. And actually Truncate is basically optional, it's just SetSparse and Seek.
AFAICT, on Windows, you can't create sparse zero areas by seeking, as the MSDN documentation clearly states:
https://msdn.microsoft.com/it-it/library/windows/desktop/aa365566%28v=vs.85%29.aspx https://blogs.msdn.microsoft.com/oldnewthing/20110922-00/?p=9573/
The blog post hints that you can set the file sparse and immediately create a full-size sparse span, so that later writes and seeks would basically fragment it while leaving sparse areas under the seeks. I have no clue whether this approach has a performance impact, and it also doesn't really belong in an os.File.SetSparse API, I would say.
Note that this was discussed at length in #13548, where your proposal of lazy header writing was also analyzed and discarded.
Comment From: mvo5
I stumbled upon this issue, and looking at the history and the code, I wonder if there is a way to unblock Linux users even without additions to the os package. This would allow Linux users to extract sparse files (which seems to be a relatively common request/issue). What I have in mind is something like this (all names are of course just strawmans):
1. Add an interface that indicates to tar that the target file supports sparseness; implementing this is the job of the user (not yet of the os package).
2. Adjust the logic in tar.Reader.WriteTo() to check for this.
3. Wire up the (already existing) code to support sparse files with that.
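For concreteness, a strawman of what item 1 could look like (all names here are made up for illustration; they are not taken from the attached patch):

import (
	"io"
	"os"
)

// SparseWriter is the user-implemented opt-in; tar.Reader.WriteTo()
// would type-assert its destination against this.
type SparseWriter interface {
	io.Writer
	Truncate(size int64) error
	SupportsSparse() bool // the explicit opt-in signal
}

// A user wraps *os.File (which already has Truncate) to opt in:
type sparseFile struct{ *os.File }

func (sparseFile) SupportsSparse() bool { return true }

Extraction would then be io.Copy(sparseFile{f}, tr), which hands tar's WriteTo a destination it can recognize and leave holes in.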
With (something like) that, users of the library can benefit from the work that already went into supporting sparse files, and for everyone else nothing would change (so no risk of breaking). We could even mark it as a non-stable API (although I don't think such a concept exists today?) if desired. But the risk seems low: once os understands sparse files, this opt-in would just be removed, and no client code would break because what was previously opt-in would simply become the default.
I attached a small PoC that I tested successfully locally with a very limited test case. If there is a chance that this idea could get merged upstream, I can look into a better version and submit it, but I don't want to do this work before checking whether the idea is acceptable :-)
0001-tar-support-sparse-files-for-targets-with-SparseTrun.patch.txt
Comment From: mvo5
Actually, thinking about this, maybe we could make it even easier. It seems we already have the ingredients on Linux in the tar package; maybe we could just extend tar.Reader.WriteTo() on Linux to create sparse files there by default if the writer also supports Truncate()? I attached a PoC for this idea as well and can also work on this (tests, proper upstream PR) if there is a chance it might get accepted :)
0001-tar-support-sparse-file-writing-on-linux.patch.txt
Comment From: glycerine
I've been coming to terms with sparse files on Linux and Darwin lately.
I'd like to see an archive/tar/v2 package that has automatic sparse support.
I'll just note here two hard-won insights/hints from doing research and experiments on Apple File System (APFS) in particular over the last few days, which is a little trickier than Linux/ext4/XFS:
First, APFS determines automagically when to support sparse files at all, and the threshold appears to be at least 20MB in total apparent file size.
I tried in vain for days to get sparse file tests working against Darwin and APFS, only to finally realize that nothing will work unless you start by truncating the file to at least 20MB. A 16MB test file is too small and will not do. You will not be able to consistently create both a sparse and a non-sparse region in the same 16MB file, although you can create an entirely sparse file of any size.
Now, that's not quite true, but it is "true enough" for the purposes of creating large sparse files in an online fashion. You can "go back" and punch sparse holes into a file after you have densely allocated it, and that will work on smaller files. But, of course, if the dense allocation exhausts your disk space first, you are up a creek without a paddle, so that is a suboptimal approach for general-purpose (large) sparse file creation.
To conclude the first hint: in order to effectively test creating a file with sparse and non-sparse regions on Darwin, ensure you are using a big enough total/apparent/maximum file size. You may want to go higher than 20MB, say starting at 32MB or 64MB, because Apple leaves the magic unspecified, and it might differ on ARM (M1/M2, etc.) and between MacOS versions, versus the amd64 Sonoma 14.0 system I'm testing on.
Since APFS has sparse file support, that initial os.File.Truncate call will not actually allocate 64MB; in fact, if you only seek and write 12KB worth of pages, your total disk use for the test will be 12KB. However, when my initial truncate was to only 16MB, my tests always ended up with a 16MB file on disk, densely allocated. Hopefully this saves others from going through the same pain of repeated what-the-heck? moments with Apple's magical, poorly specified filesystem (grrr...).
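In code, the pattern described above is roughly the following (a minimal sketch; the 64MB figure just follows the go-higher-than-20MB advice, and the file name is arbitrary):

f, err := os.Create("big.img")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

const apparentSize = 64 << 20 // 64MB: comfortably above the observed ~20MB threshold
if err := f.Truncate(apparentSize); err != nil {
	log.Fatal(err) // on APFS this allocates (almost) nothing
}

page := make([]byte, 4096)
page[0] = 1 // nonzero, so the filesystem must actually allocate this page
f.WriteAt(page, 0)                             // dense region at the start
f.WriteAt(page, apparentSize-int64(len(page))) // dense region at the end; the middle stays a hole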
Secondly, and very surprisingly: tar, bsdtar, and even cp on a Darwin system can and will create sparse file output automatically, even if the origin is not sparse, because they do runtime detection of zero spans and convert those automatically into sparse holes in the destination file.
I thought, and one might readily assume, that this would be too slow to be the default.
But apparently libarchive ( https://github.com/libarchive/libarchive ) makes it fast enough; the comments therein discuss using block size to optimize the zero detection. I don't fully follow it yet, but that is the approach. See also for example [1].
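A minimal sketch of that writer-side idea (this is not libarchive's code, just the shape of the approach): copy block by block and seek over all-zero blocks instead of writing them:

import "io"

func sparseCopy(dst io.WriteSeeker, src io.Reader, blockSize int) (int64, error) {
	buf := make([]byte, blockSize)
	var written int64
	for {
		n, err := io.ReadFull(src, buf)
		if n > 0 {
			if allZero(buf[:n]) {
				// Seek instead of writing: on a sparse-capable FS this leaves a hole.
				if _, err := dst.Seek(int64(n), io.SeekCurrent); err != nil {
					return written, err
				}
			} else if _, err := dst.Write(buf[:n]); err != nil {
				return written, err
			}
			written += int64(n)
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			// If the data ends in zeros, the caller must still truncate the
			// destination to `written` (or write its last byte) to set the
			// size, per the final-hole discussion earlier in this thread.
			return written, nil
		}
		if err != nil {
			return written, err
		}
	}
}

func allZero(b []byte) bool {
	for _, c := range b {
		if c != 0 {
			return false
		}
	}
	return true
}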
Interestingly, this means that a simple cp of a sparse file on APFS can end up creating a destination file with sparse holes in completely different locations than the original. (And it might even re-write the original to be sparse in a different way, so beware here!)
The insight here is that this is particularly nice for the end user who just wants to save disk space (usually on big VM images) and does not care whether the original file was sparsely encoded or not, or whether it came from a sparse tar file or not. The "reader-makes-right" -> "writer-makes-right" approach supports any origin, such as data arriving over the network or from a tar archive that was not sparse-aware.
Thus I've come to appreciate this approach for that reason: apparently it can be fast enough, and it gives the best user experience in the end: much less disk space consumed for one's large and sparse VM images (such as those used by Docker Desktop on MacOS).
I would recommend a similar approach (writer detects zero spans on the fly) for an archive/tar/v2, if possible.
Natural caveats: I'm not sure this resolves the tension about when and how to write the header. I'm also not at all sure about Windows, having not experimented there yet.
[1] https://github.com/apple-oss-distributions/libarchive/blob/6aace39412e10de73424edc732ee4e4478eb40ef/libarchive/doc/text/archive_write_data.3.txt#L24
edit: I've seen comments above about preferring to generalize sparse file support beyond just tar, and I support such efforts, but I'm not sure of a better place to leave these hints. If you read this and know of issues that deal with the more general approach(es), please leave a pointer to them. Thank you!
Comment From: glycerine
A link to my library (and green tests!) of sparse file handling procedures for Linux and Darwin:
https://github.com/glycerine/rpc25519/tree/master/jsync/sparsified