Go version

go version go1.24.0 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/dottedmag/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/dottedmag/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/95/vzwcv5yd32x369z0c9t4bfr00000gn/T/go-build3662448907=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/dottedmag/tmp/gr-go-run/go.mod'
GOMODCACHE='/Users/dottedmag/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/dottedmag/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/dottedmag/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.0.darwin-arm64'
GOSUMDB='sum.golang.org'
GOTELEMETRY='off'
GOTELEMETRYDIR='/Users/dottedmag/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/dottedmag/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.0.darwin-arm64/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

I have tried running go run and go tool with trivial programs, expecting them to exit nearly instantly.

Instead I'm seeing run times of ~50ms on Mac M1 Max with warm caches.

Here's a repository with a reproducer: https://github.com/dottedmag/gr-go-run

Run go test -bench=. in that repository.

The basis for comparison is a tool I wrote some time ago, before link caching was merged into Go. It uses a cache key computation algorithm that is fairly close to the original one (borrowing some code directly from Go), and it still outperforms go run: about 5ms versus 50ms.

What did you see happen?

% go test -bench=.
goos: darwin
goarch: arm64
pkg: gr-go-run
cpu: Apple M1 Max
BenchmarkGr-10              237   4570610 ns/op
BenchmarkGoRun-10            22  51377985 ns/op
PASS
ok      gr-go-run   3.038s
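
For readers who want to time arbitrary commands outside of a testing.B benchmark, here is a minimal sketch. It is not the code from the linked repository; the helper name measure and the command-line shape are assumptions made for illustration. It reports the mean wall-clock time per invocation, process startup included, which is the quantity the numbers above compare.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"time"
)

// measure runs the command n times and returns the mean wall-clock time per
// run, including process startup. (Hypothetical helper, for illustration.)
func measure(n int, name string, args ...string) (time.Duration, error) {
	if n <= 0 {
		return 0, fmt.Errorf("n must be positive, got %d", n)
	}
	start := time.Now()
	for i := 0; i < n; i++ {
		if err := exec.Command(name, args...).Run(); err != nil {
			return 0, fmt.Errorf("run %d of %s: %w", i+1, name, err)
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	// Usage: measure <runs> <command> [args...]
	if len(os.Args) < 3 {
		fmt.Fprintln(os.Stderr, "usage: measure <runs> <command> [args...]")
		return
	}
	n, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "bad run count:", err)
		return
	}
	mean, err := measure(n, os.Args[2], os.Args[3:]...)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("%v per run over %d runs\n", mean, n)
}
```

Running it once against go run with a trivial package and once against the prebuilt binary of the same package should show the kind of gap reported above, though the exact figures depend on the machine and cache state.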

What did you expect to see?

I expected go run and go tool to be at least on par with an external tool that does not hook into the compilation process, now that linker outputs are cached.

Comment From: seankhliao

Is this a corporate Mac with endpoint security software running on it?

Comment From: dottedmag

Nope, no enterprise-y shenanigans.

Apple's Gatekeeper is enabled, of course, but it should affect all the solutions equally.

Comment From: dmitshur

CC @matloob, @samthanawalla.

Comment From: iwahbe

I'm seeing this as well.

Comment From: matloob

The caching support for go run and go tool caches the output of the link step in the build. We still need to run all the actions in the build graph (looking up the cached output using the action id). We also still need to do module and package loading.
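
To make that cost concrete, here is a toy model of the idea. It is not the go command's implementation: the types and functions (action, actionID, run) are hypothetical, and the real action IDs cover far more inputs. The point it illustrates is that an action's ID depends on the IDs of its dependencies, so even a full cache hit on the link output requires visiting every action in the graph first.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// action is a hypothetical node in a build graph (a compile or link step).
type action struct {
	name  string
	input string // stands in for source contents, flags, toolchain version
	deps  []*action
}

// actionID hashes the action's own input together with the IDs of all its
// dependencies. visited counts how many distinct actions had to be examined.
func actionID(a *action, memo map[*action]string, visited *int) string {
	if id, ok := memo[a]; ok {
		return id
	}
	*visited++
	h := sha256.New()
	h.Write([]byte(a.name + "\x00" + a.input + "\x00"))
	for _, d := range a.deps {
		h.Write([]byte(actionID(d, memo, visited)))
	}
	id := hex.EncodeToString(h.Sum(nil))
	memo[a] = id
	return id
}

// run returns the output of the root action, consulting the cache by action
// ID. It reports whether the cache was hit and how many actions were visited.
func run(root *action, cache map[string]string) (string, bool, int) {
	visited := 0
	id := actionID(root, map[*action]string{}, &visited)
	if cached, ok := cache[id]; ok {
		return cached, true, visited
	}
	out := "binary-for-" + root.name // stands in for compiling and linking
	cache[id] = out
	return out, false, visited
}

func main() {
	rt := &action{name: "compile runtime", input: "runtime sources"}
	fmtPkg := &action{name: "compile fmt", input: "fmt sources", deps: []*action{rt}}
	mainPkg := &action{name: "compile main", input: "package main", deps: []*action{fmtPkg, rt}}
	link := &action{name: "link", input: "ldflags", deps: []*action{mainPkg}}

	cache := map[string]string{}
	_, hit, n := run(link, cache)
	fmt.Println("first run: hit =", hit, "visited =", n)
	_, hit, n = run(link, cache)
	fmt.Println("second run: hit =", hit, "visited =", n)
}
```

In this model the second run hits the cache but still visits all four actions; module and package loading, which the sketch leaves out entirely, come on top of that.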

I'm sure there's a lot we can do to make the go command faster, and I would love to see it become faster, but this level of performance is pretty much what we expect.

I would definitely like to hear about use cases that are adversely impacted by this.

We would also definitely welcome changes that improve the performance of the go command without impacting its complexity.

Comment From: dottedmag

> I would definitely like to hear about use cases that are adversely impacted by this.

With fast caching, Go could be used to write wrappers around CLI tools. 5ms is acceptable, but 50ms is already perceptible for interactive invocation, and if it is a wrapper around an often-called tool (e.g. a wrapper for a compiler) then these milliseconds begin to add up quickly.

Comment From: iwahbe

My use case is around build tooling: wrapping a utility for integrating Go into Makefiles. I need to run the wrapped utility ~30 times per make invocation, so speed is critical (and developers wait in real time). Each invocation runs for ~20ms, and a >100ms overhead is unacceptable.

@dottedmag I've tried to work around this problem by fetching the tool path with go tool -n ${YOUR_TOOL}, and then invoking the binary directly. If you have somewhere to save state, that worked pretty well for me. It hit #72824 in CI though...
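
A minimal sketch of that workaround, assuming go tool -n prints only the path of the tool binary when given no extra arguments. The helper names (toolPath, goToolN) are hypothetical. The path is resolved once through the go command, which is the slow step, and the binary is then executed directly; a saved path can go stale if the tool's version or source changes.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// toolPath asks where the tool's binary lives without running it. lookup is
// injectable; goToolN below is the real implementation. (Hypothetical names.)
func toolPath(name string, lookup func(name string) ([]byte, error)) (string, error) {
	out, err := lookup(name)
	if err != nil {
		return "", fmt.Errorf("resolving tool %q: %w", name, err)
	}
	path := strings.TrimSpace(string(out))
	if path == "" {
		return "", fmt.Errorf("go tool -n %s printed no path", name)
	}
	return path, nil
}

// goToolN shells out to `go tool -n <name>` and returns its standard output.
func goToolN(name string) ([]byte, error) {
	return exec.Command("go", "tool", "-n", name).Output()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: runtool <tool> [args...]")
		return
	}
	// Resolve once: this goes through the go command and pays its overhead.
	path, err := toolPath(os.Args[1], goToolN)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// Run the binary directly: no module loading, no build graph.
	cmd := exec.Command(path, os.Args[2:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

In a Makefile setting the resolved path would be stored in a variable or a file and reused for every invocation within one make run.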

Comment From: dottedmag

@iwahbe

> I've tried to work around this problem by fetching the tool path with go tool -n ${YOUR_TOOL}, and then invoking the binary directly.

The trickiest part is cache revalidation, as usual. I'm striving for a replacement for "go run" — a tool that can be used without thinking about stale caches.

With gr (see above) it's already much faster than running go list. I haven't spent much time optimizing it though, so I guess there's still a lot of performance left on the table. Now that I think about it, I have an idea how to get it down to a small number of syscalls with a follow-up execve() and little logic in between. This still isn't free, but I guess I could try to cut it down to under 1ms.

I understand that in your use case you may be reasonably sure that the source code of your wrapper does not change under you as you're building things. In mine it's a source of frustration when I or somebody else on the team changes branches and then figures out 30 minutes later that the tool was stale.

Comment From: matloob

Thanks for your responses. I think for these use cases, we'd be happy to accept CLs that improve performance without increasing the complexity of the go command.

@iwahbe I'm curious why doing a go build -o <temporary location> <pkgpath> to build the binary to a temporary location, and then executing that binary wouldn't work for your use case. That would be a better way to get the latest cached copy of the tool.

Comment From: iwahbe

I use GOBIN=$(shell pwd)/bin/${HELPMAKEGO_VERSION} go install github.com/iwahbe/helpmakego@${HELPMAKEGO_VERSION} right now, but that requires that I manually manage versions. It works, but go tool would have been simpler if it was fast enough.

I don't want a global install, and running go build -o bin/helpmakego github.com/iwahbe/helpmakego requires updating go.mod, which go mod tidy then reverts:

❯ go build -o bin/v0.1.0/helpmakego github.com/iwahbe/helpmakego
no required module provides package github.com/iwahbe/helpmakego; to add it:
        go get github.com/iwahbe/helpmakego

Comment From: matloob

@iwahbe For the go build -o case, have you already added your tool to the module using go get -tool github.com/iwahbe/helpmakego before running go build -o? If you have a tool directive in your go.mod, go mod tidy won't remove the requirement.

Comment From: iwahbe

Running go get -tool github.com/iwahbe/helpmakego and then go build -o ... works.

Maybe the error message for go build -o bin/v0.1.0/helpmakego github.com/iwahbe/helpmakego should mention -tool when the package path used isn't within the current module.

> That would be a better way to get the latest cached copy of the tool.

If that is true, then I'm not sure why go tool allows adding external tools. I assumed that go tool supports external tools in order to provide a correct and easy-to-use way to run cached tools.

Comment From: matloob

> That would be a better way to get the latest cached copy of the tool.
>
> If that is true, then I'm not sure why go tool allows adding external tools. I assumed that go tool supports external tools in order to provide a correct and easy-to-use way to run cached tools.

Sorry, I should have clarified: I was comparing go build -o with go tool -n as ways to get a binary path that you can execute directly. That's for the case where you need to optimize the performance of a tool that is called many times. go tool itself is the best option to build and run the tool.

Comment From: matloob

I'm going to close this issue for now. If you have suggestions for specific work that can be done on the go command that can improve performance without adding complexity, please file an issue and we can discuss the specific proposals.