Go version
go version go1.24.6 linux/amd64
Output of go env in your module/workspace:
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='0'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/john/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/john/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3904074702=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/john/tmp/gogenericopt/go.mod'
GOMODCACHE='/home/john/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/john/go:/home/john/src/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/john/go-1.24.6'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/john/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/john/go-1.24.6/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.6'
GOWORK=''
PKG_CONFIG='pkg-config'
What did you do?
Given the following code:
package gogenericopt

import (
	"testing"
)

type Adder[T any] interface {
	Add(a, b T) T
}

type Float interface {
	~float32 | ~float64
}

type FloatAdder[T Float] struct{}

func (FloatAdder[T]) Add(a, b T) T { return a + b }

//go:noinline
func SumGenericAdder[A Adder[T], T any](a A, s []T) T {
	var result T
	for _, e := range s {
		result = a.Add(result, e)
	}
	return result
}

//go:noinline
func SumFloatAdder[T Float](a FloatAdder[T], s []T) T {
	var result T
	for _, e := range s {
		result = a.Add(result, e)
	}
	return result
}

func BenchmarkSumGenericAdder_float32(b *testing.B) {
	s := genSlice[float32]()
	for b.Loop() {
		SumGenericAdder(FloatAdder[float32]{}, s)
	}
}

func BenchmarkSumFloatAdder_float32(b *testing.B) {
	s := genSlice[float32]()
	for b.Loop() {
		SumFloatAdder(FloatAdder[float32]{}, s)
	}
}

func genSlice[T Float]() []T {
	const size = 128
	s := make([]T, size)
	for i := range s {
		s[i] = T(1.0 / size)
	}
	return s
}
and running go test -bench=.
...
What did you see happen?
goos: linux
goarch: amd64
pkg: example.com/gogenericopt
cpu: Intel(R) Core(TM) Ultra 7 155U
BenchmarkSumGenericAdder_float32-14 9139267 133.5 ns/op
BenchmarkSumFloatAdder_float32-14 28473896 41.75 ns/op
PASS
ok example.com/gogenericopt 2.414s
What did you expect to see?
I would have expected similar benchmark results for the two functions, since in both cases the type of the adder is known at compile time to be FloatAdder[float32].
It seems that when the adder's type is a type parameter constrained by Adder[T], the compiler does not inline the Add method. Yet when the adder is taken as a FloatAdder[T], the compiler does inline the method call.
A type parameter T is involved in both cases; the difference seems to be whether the adder type is constrained by an interface or is an instantiation of a generic type with the type parameter T.
Is this just a missed case in the optimizer, or is there some technical difficulty supporting the first case?
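For reference, the comparison can be reduced to a standalone program that also includes a hand-specialized baseline. This is only a sketch: sumFloat32 is a hypothetical helper that is not part of the benchmark above, and the program checks only that the three variants compute the same sum, not how fast they run. The //go:noinline directives are omitted here because they matter only for the benchmark.

```go
package main

import "fmt"

type Adder[T any] interface {
	Add(a, b T) T
}

type Float interface {
	~float32 | ~float64
}

type FloatAdder[T Float] struct{}

func (FloatAdder[T]) Add(a, b T) T { return a + b }

// SumGenericAdder takes the adder as a type parameter constrained by Adder[T].
func SumGenericAdder[A Adder[T], T any](a A, s []T) T {
	var result T
	for _, e := range s {
		result = a.Add(result, e)
	}
	return result
}

// SumFloatAdder takes the adder as an instantiation of the generic type FloatAdder[T].
func SumFloatAdder[T Float](a FloatAdder[T], s []T) T {
	var result T
	for _, e := range s {
		result = a.Add(result, e)
	}
	return result
}

// sumFloat32 is a hypothetical non-generic baseline (not in the original report).
func sumFloat32(s []float32) float32 {
	var result float32
	for _, e := range s {
		result += e
	}
	return result
}

func main() {
	// Same input as genSlice[float32]: 128 elements, each 1/128.
	s := make([]float32, 128)
	for i := range s {
		s[i] = 1.0 / 128
	}
	fmt.Println(
		SumGenericAdder(FloatAdder[float32]{}, s),
		SumFloatAdder(FloatAdder[float32]{}, s),
		sumFloat32(s),
	)
}
```

Wrapping sumFloat32 in a third benchmark alongside the two above would give a non-generic reference point for the ns/op figures.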
From go test -bench=. -gcflags=-m:
# example.com/gogenericopt [example.com/gogenericopt.test]
./gogenericopt_test.go:51:6: can inline genSlice[go.shape.float32]
./gogenericopt_test.go:17:6: can inline FloatAdder[go.shape.float32].Add
./gogenericopt_test.go:17:6: can inline FloatAdder[float32].Add
./gogenericopt_test.go:51:6: can inline genSlice[float32]
./gogenericopt_test.go:39:2: skip inlining within testing.B.loop for for loop
./gogenericopt_test.go:46:2: skip inlining within testing.B.loop for for loop
./gogenericopt_test.go:38:24: inlining call to genSlice[go.shape.float32]
./gogenericopt_test.go:39:12: inlining call to testing.(*B).Loop
./gogenericopt_test.go:32:17: inlining call to FloatAdder[go.shape.float32].Add
./gogenericopt_test.go:45:24: inlining call to genSlice[go.shape.float32]
./gogenericopt_test.go:46:12: inlining call to testing.(*B).Loop
./gogenericopt_test.go:17:6: inlining call to FloatAdder[go.shape.float32].Add
./gogenericopt_test.go:51:6: inlining call to genSlice[go.shape.float32]
<autogenerated>:1: inlining call to FloatAdder[float32].Add
<autogenerated>:1: inlining call to FloatAdder[go.shape.float32].Add
<autogenerated>:1: inlining call to FloatAdder[go.shape.float32].Add
./gogenericopt_test.go:37:39: leaking param: b
./gogenericopt_test.go:38:24: make([]go.shape.float32, 128) does not escape
./gogenericopt_test.go:44:37: leaking param: b
./gogenericopt_test.go:45:24: make([]go.shape.float32, 128) does not escape
./gogenericopt_test.go:53:11: make([]go.shape.float32, 128) escapes to heap
./gogenericopt_test.go:51:6: make([]go.shape.float32, 128) escapes to heap
# example.com/gogenericopt.test
_testmain.go:39:6: can inline init.0
<autogenerated>:1: inlining call to reflect.flag.kind
<autogenerated>:1: inlining call to reflect.flag.kind
<autogenerated>:1: inlining call to reflect.flag.mustBe
<autogenerated>:1: inlining call to reflect.flag.kind
<autogenerated>:1: inlining call to reflect.flag.mustBe
<autogenerated>:1: inlining call to reflect.flag.kind
<autogenerated>:1: inlining call to reflect.flag.mustBeAssignable
<autogenerated>:1: inlining call to reflect.flag.mustBeAssignable
<autogenerated>:1: inlining call to reflect.flag.mustBeExported
<autogenerated>:1: inlining call to reflect.flag.mustBeExported
<autogenerated>:1: inlining call to reflect.flag.ro
<autogenerated>:1: inlining call to reflect.flag.ro
_testmain.go:45:42: testdeps.TestDeps{} escapes to heap
<autogenerated>:1: &reflect.ValueError{...} escapes to heap
<autogenerated>:1: &reflect.ValueError{...} escapes to heap
The assembly from -gcflags=-S also makes it clear that the Add call is not inlined in the SumGenericAdder instantiation but is inlined in the SumFloatAdder instantiation. I can attach it if it would be useful.
Comment From: gabyhelp
Related Issues
- cmd/compile: does not inline method of generic type across packages when there are multiple instantiations #59070
- cmd/compile: combining two inlined functions with interfaces produces non-inlineable code #61036 (closed)
- cmd/compile: simple generics are not inlined #54497 (closed)
- cmd/compile: Go 1.19 might make generic types slower #54238 (closed)
- cmd/compile: generic functions are significantly slower than identical non-generic functions in some cases #50182 (closed)
- cmd/compile: functions with type parameters cannot inline multiple levels deep across packages #56280 (closed)
- cmd/compile: code generated by generics seems inefficient #64699 (closed)
- inline: the wiki says compiler optimization wouldn't inline a function containing panic's #46062 (closed)
- cmd/compile: get struct member when inlining #69935 (closed)
- cmd/compile: (dev.typeparams) assembly generated for a simple generic function is massive #46998 (closed)