In BootZipCopyAction and AbstractJarWriter, a SHA-1 hash is calculated for stored entries that require unpacking and is set as the entry comment:

archiveEntry.setComment("UNPACK:" + FileUtils.sha1Hash(details.getFile()));

However, the hash isn't used anywhere; only the marker prefix UNPACK: is checked.

I dug through the history, and the first introduction of UNPACK: + hash seems to have been in https://github.com/spring-projects/spring-boot/commit/f30b962ff9327d3264d519853992aff972cdbe5a

At that point the hash was extracted and used for the filename:

AsciiBytes hash = data.getComment().substring(UNPACK_MARKER.length());
...
File file = new File(getTempUnpackFolder(), hash.toString() + "-" + name);

But later, in https://github.com/spring-projects/spring-boot/commit/7e718cda265b85ad6337a0aa711906c7e1097e34, the hash was removed from the output filename:

-       AsciiBytes hash = data.getComment().substring(UNPACK_MARKER.length());
-       File file = new File(getTempUnpackFolder(), hash.toString() + "-" + name);
+       File file = new File(getTempUnpackFolder(), name);

So now the hash is still calculated and set on the entry, but I don't see it being used anywhere. Could the comment just be left as UNPACK: without any hash?

The hashing reads the file completely into memory because it uses DigestInputStream.readAllBytes(), so in the extreme case the file is fully read three times: once for the CrcAndSize calculation, once for the SHA-1 hash, and once for the actual copy to the ZipArchiveOutputStream.
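
For illustration, here is a minimal sketch of hashing with a fixed-size buffer instead of readAllBytes(), so memory use stays constant regardless of file size. The class and method names (StreamingSha1, sha1Hex) are hypothetical and this is not the actual FileUtils.sha1Hash implementation; it only uses standard JDK APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Spring Boot.
final class StreamingSha1 {

    // Computes the SHA-1 of the stream as lowercase hex using an 8 KiB buffer,
    // so the whole content is never held in memory at once.
    static String sha1Hex(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        }
        catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "abc".getBytes(StandardCharsets.UTF_8);
        // Builds a comment in the same "UNPACK:" + hash form as the existing code.
        System.out.println("UNPACK:" + sha1Hex(new ByteArrayInputStream(sample)));
    }
}
```

This still reads the file one extra time; it only avoids buffering the whole file in memory while doing so.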

Comment From: philwebb

Although it's unlikely, I guess it's possible that existing tooling might be using the hash. I think we should fix the readAllBytes() issue in 3.4 and drop the hash entirely in 4.0.x.

Comment From: deathy

Agreed, there may be unknown users, so it would be good to make the change only in 4.0.x, the next major.

As far as I could tell from the original changes, JRuby was the main user? I could look into some JRuby projects/samples and check whether they still require that behavior.

FYI, I found this while looking into some slow bootJar execution times, and it was the first obvious thing I saw. It's kind of the opposite of my previous investigation into reading from archives in https://github.com/spring-projects/spring-boot/issues/40125. Another improvement would be to read the file into memory once and use that for both the archive entry and the CrcAndSize calculation. And when there are lots of incompressible entries, it would help to have the option of storing all entries (the Gradle jar task has this, but bootJar ignores it and always forces deflate with the default level for non-stored entries).
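
A rough sketch of the read-once idea, assuming the entry is small enough to hold in memory: the bytes are read a single time and then reused for the CRC-32, the size, and the copy into the archive. The class name SinglePassEntry is hypothetical and this is not how CrcAndSize is currently written; it relies only on java.util.zip.CRC32 from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical holder for an entry's content plus the metadata a stored entry needs.
final class SinglePassEntry {

    final byte[] content;
    final long crc;
    final long size;

    private SinglePassEntry(byte[] content, long crc) {
        this.content = content;
        this.crc = crc;
        this.size = content.length;
    }

    // Reads the stream exactly once; CRC and size are derived from the in-memory bytes.
    static SinglePassEntry read(InputStream in) throws IOException {
        byte[] content = in.readAllBytes();
        CRC32 crc = new CRC32();
        crc.update(content, 0, content.length);
        return new SinglePassEntry(content, crc.getValue());
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "123456789".getBytes(StandardCharsets.US_ASCII);
        SinglePassEntry entry = read(new ByteArrayInputStream(sample));
        System.out.println(Long.toHexString(entry.crc) + " " + entry.size);
        // entry.content could now be written to the archive without re-reading the file.
    }
}
```

The trade-off is memory: this holds the whole entry in memory, so a real change would probably need a size threshold above which it falls back to streaming.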

I'm still studying the archive handling code, but I could work on a PR for removing the hash and for the other changes. My first question would be whether significant changes are planned around it soon on the 4.0.x branch (I saw lots of module reorganization).

Comment From: wilkinsona

I've opened https://github.com/spring-projects/spring-boot/issues/46202 to address the inefficiency without changing the form of the UNPACK: comments. We can use this issue to change the comments in 4.0.

Comment From: wilkinsona

Closing in favor of #46520. Thanks for the PR, @academey.