With the use of atomic.Pointer in package sync, this program is now generating the full compiled code for atomic.Pointer into its object file even though it's not being used at all:
% cat x.go
package p
import "sync"
var theMap sync.Map
% go build -gcflags=-S x.go
...
command-line-arguments.theMap SBSS size=32
go:info.command-line-arguments.theMap SDWARFVAR dupok size=46
0x0000 0a 63 6f 6d 6d 61 6e 64 2d 6c 69 6e 65 2d 61 72 .command-line-ar
0x0010 67 75 6d 65 6e 74 73 2e 74 68 65 4d 61 70 00 09 guments.theMap..
0x0020 03 00 00 00 00 00 00 00 00 00 00 00 00 01 ..............
rel 33+8 t=1 command-line-arguments.theMap+0
rel 41+4 t=31 go:info.sync.Map+0
...
%
The part I didn't omit with ... is all that should be here, plus maybe a package init. The omitted part appears to be the entire compilation of four different instantiations of sync/atomic.Pointer, plus all their supporting type declarations:
% grep CompareAndSwap.STEXT out
sync/atomic.(*Pointer[go.shape.interface {}]).CompareAndSwap STEXT dupok size=94 args=0x20 locals=0x20 funcid=0x0 align=0x0
sync/atomic.(*Pointer[go.shape.struct { sync.m map[interface {}]*sync.entry; sync.amended bool }]).CompareAndSwap STEXT dupok size=94 args=0x20 locals=0x20 funcid=0x0 align=0x0
sync/atomic.(*Pointer[sync.readOnly]).CompareAndSwap STEXT dupok size=108 args=0x18 locals=0x20 funcid=0x15 align=0x0
sync/atomic.(*Pointer[interface {}]).CompareAndSwap STEXT dupok size=108 args=0x18 locals=0x20 funcid=0x15 align=0x0
%
Given that there is no code being emitted for this package at all, there's no need to emit all of this. (It's possible they are referenced from the definition of go:info.sync.Map
, but that is not being emitted here either. The generated code correctly assumes that will be provided by package sync's object file.)
Overall the size increase is 33X since Go 1.19:
% go build -o x.a x.go
% go1.19 build -o x19.a x.go
% ls -l x.a x19.a
-rw-r--r-- 1 rsc primarygroup 73834 Nov 12 08:47 x.a
-rw-r--r-- 1 rsc primarygroup 2196 Nov 12 08:47 x19.a
% hoc -e 73834/2196
33.62204007285975
%
Obviously there is a small denominator in that 33X, but there is also very little use of generics yet.
There is a deeper problem than dead code here too.
When we were talking about instantiation in the early days of generics, one of the ways we discussed to avoid code size issues would be for packages to know which instantiations they use are already present in packages they import, so that definitions would not need to be generated repeatedly in multiple packages. For example, since package sync generated the definition for atomic.Pointer[any] any program directly or indirectly importing package sync can use atomic.Pointer[any] “for free” without itself regenerating that code. Of course, there's a search problem for answering the question of whether any package in the transitive import closure has already generated the code. But at the least, if we see a type in the ordinary course of reading export data for imported packages, we should mark it as not needed to be regenerated: anything in export data is handled by the package that wrote the export data. Applying that rule has no search problem.
In the standard library we are barely starting to use generics, yet we have 15 packages whose objects repeat the atomic.Pointer code from sync, 13 that repeat atomic.Pointer code from go/token, 9 that repeat an atomic.Pointer[string] from internal/godebug (in my working tree), and three that repeat an atomic.Pointer[response] from net/http:
% cd $GOROOT
% GODEBUG=installgorootall=1 go install std
% {for (i in `{find pkg/darwin_amd64 -name '*.a'}) go tool nm $i} |
sed 's/^[^ ]*//' | grep -v ' U ' | awk '{$1=""; print}' |
sort | uniq -c | sort -nr | awk '$1>1' >dups
% grep '\.CompareAndSwap$' dups
15 T sync/atomic.(*Pointer[sync.readOnly]).CompareAndSwap
15 T sync/atomic.(*Pointer[interface {}]).CompareAndSwap
15 T sync/atomic.(*Pointer[go.shape.struct { sync.m map[interface {}]*sync.entry; sync.amended bool }]).CompareAndSwap
15 T sync/atomic.(*Pointer[go.shape.interface {}]).CompareAndSwap
13 T sync/atomic.(*Pointer[go/token.File]).CompareAndSwap
13 T sync/atomic.(*Pointer[go.shape.struct { go/token.name string; go/token.base int; go/token.size int; go/token.mutex sync.Mutex; go/token.lines []int; go/token.infos []go/token.lineInfo }]).CompareAndSwap
9 T sync/atomic.(*Pointer[string]).CompareAndSwap
9 T sync/atomic.(*Pointer[go.shape.string]).CompareAndSwap
3 T sync/atomic.(*Pointer[net/http.response]).CompareAndSwap
3 T sync/atomic.(*Pointer[go.shape.struct { net/http.conn *net/http.conn; net/http.req *net/http.Request; net/http.reqBody io.ReadCloser; net/http.cancelCtx context.CancelFunc; net/http.wroteHeader bool; net/http.wroteContinue bool; net/http.wants10KeepAlive bool; net/http.wantsClose bool; net/http.canWriteContinue sync/atomic.Bool; net/http.writeContinueMu sync.Mutex; net/http.w *bufio.Writer; net/http.cw net/http.chunkWriter; net/http.handlerHeader net/http.Header; net/http.calledHeader bool; net/http.written int64; net/http.contentLength int64; net/http.status int; net/http.closeAfterReply bool; net/http.requestBodyLimitHit bool; net/http.trailers []string; net/http.handlerDone sync/atomic.Bool; net/http.dateBuf [29]uint8; net/http.clenBuf [10]uint8; net/http.statusBuf [3]uint8; net/http.closeNotifyCh chan bool; net/http.didCloseNotify sync/atomic.Bool }]).CompareAndSwap
%
For what it's worth, the objects we shipped in Go 1.19 appear to have the same duplication of atomic.Pointer[go/token.File], so the bug is not new.
15 doesn't look so bad, but when compiling Kubernetes we end up with 122 copies each of four different atomic.Pointer instantiations from package sync:
% cd kubernetes
% go build ./cmd/kubelet
% {for (i in `{go list -f '{{.Export}}' -export -deps ./cmd/kubelet}) go tool nm $i} |
sed 's/^[^ ]*//' | grep -v ' U ' | awk '{$1=""; print}' |
sort | uniq -c | sort -nr | awk '$1>1' >dups
% grep '\.CompareAndSwap$' dups
122 T sync/atomic.(*Pointer[sync.readOnly]).CompareAndSwap
122 T sync/atomic.(*Pointer[interface {}]).CompareAndSwap
122 T sync/atomic.(*Pointer[go.shape.struct { sync.m map[interface {}]*sync.entry; sync.amended bool }]).CompareAndSwap
122 T sync/atomic.(*Pointer[go.shape.interface {}]).CompareAndSwap
7 T sync/atomic.(*Pointer[string]).CompareAndSwap
7 T sync/atomic.(*Pointer[net/http.response]).CompareAndSwap
7 T sync/atomic.(*Pointer[go.shape.struct { net/http.conn *net/http.conn; net/http.req *net/http.Request; net/http.reqBody io.ReadCloser; net/http.cancelCtx context.CancelFunc; net/http.wroteHeader bool; net/http.wroteContinue bool; net/http.wants10KeepAlive bool; net/http.wantsClose bool; net/http.canWriteContinue sync/atomic.Bool; net/http.writeContinueMu sync.Mutex; net/http.w *bufio.Writer; net/http.cw net/http.chunkWriter; net/http.handlerHeader net/http.Header; net/http.calledHeader bool; net/http.written int64; net/http.contentLength int64; net/http.status int; net/http.closeAfterReply bool; net/http.requestBodyLimitHit bool; net/http.trailers []string; net/http.handlerDone sync/atomic.Bool; net/http.dateBuf [29]uint8; net/http.clenBuf [10]uint8; net/http.statusBuf [3]uint8; net/http.closeNotifyCh chan bool; net/http.didCloseNotify sync/atomic.Bool }]).CompareAndSwap
7 T sync/atomic.(*Pointer[go.shape.string]).CompareAndSwap
5 T sync/atomic.(*Pointer[go/token.File]).CompareAndSwap
5 T sync/atomic.(*Pointer[go.shape.struct { go/token.name string; go/token.base int; go/token.size int; go/token.mutex sync.Mutex; go/token.lines []int; go/token.infos []go/token.lineInfo }]).CompareAndSwap
%
If this is easy to fix it would be good to do it for Go 1.20. If not, then since Go 1.19 appears to behave the same way, it's probably not strictly a release blocker.
Comment From: mdempsky
I think this is blocked by #47448.
Comment From: mdempsky
@cherrymui points out it should be possible to cross-reference DUPOK symbols from other packages by just not defining them. This appears to demonstrate it working correctly across packages: https://go.dev/play/p/0SheVE56Lhm
Comment From: rsc
Agreed that #47448 is not fundamental to emitting undefined references to things in other packages.
In this case, however, the object file has no need to mention sync/atomic.Pointer[interface {}].CompareAndSwap
at all. It is not emitting any kind of reference at all to that code. It just emitted the code anyway. As far as the object file is concerned, it is dead space. But also any time spent actually generating the x86 instructions for the function is also dead CPU and RAM and build time that could be removed to make the build more efficient. So we should make sure that we don't just fix the problem by not emitting these functions. We should stop compiling them in the first place.
Comment From: stapelberg
Signal boost from Google-internal issue b/340857204:
It seems like this issue has become more pressing after CL 578815 added more generics to the reflect
package.
Every Go Protobuf generated code file (.pb.go
) imports the reflect
package so that it can call reflect.TypeOf
.
Despite not using the newly added symbols at all, the .a files for these Protobuf generated code targets have grown. Because Google programs use a large number of Protobufs, the collective size increase is pronounced enough to push many Google-internal programs over the edge in terms of allowed link action input size (“Forge: Total size of inputs too large”).
Would it be possible to prioritize this issue to get the .a file size down again? Thanks.
Comment From: arvidfm
I was looking into the issues mentioned in #29561 and #50438, which led me to conclude that this is the main culprit behind the bloated build cache and long compilation times that we're seeing. I posted a detailed investigation with numbers from a real-world codebase in #50438, but the short of it is that our codebase (which a year or so ago compiled in seconds) takes several minutes to compile following large-scale refactoring involving generics, and we end up with a 5 GB build cache with many individual ~300 MB package archives, most of which seems to be duplicated generic instantiations. The artifact sizes are further exacerbated by the long type names mentioned in #50438.
You can find a MWE at https://github.com/arvidfm/go-cache-bloat which shows how you can produce a 50 MB package archive from a 15 line package within a 130 line codebase.
This is a major issue for us as the slowdown from the recompilation of the instantiations makes it harder for us to iterate on our code, and the resulting size of the build cache artifacts made our CI environment run out of space so frequently that we had to create a script to prune our build cache before each build.