Go version
go version go1.25.0 linux/amd64
Output of go env in your module/workspace:
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/antoni/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/antoni/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3767597687=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/antoni/Dokumenty/Programowanie/Go/checked-go/go.mod'
GOMODCACHE='/home/antoni/go/pkg/mod'
GONOPROXY='github.com/antoniszymanski/*'
GONOSUMDB='github.com/antoniszymanski/*'
GOOS='linux'
GOPATH='/home/antoni/go'
GOPRIVATE='github.com/antoniszymanski/*'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/antoni/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.25.0'
GOWORK=''
PKG_CONFIG='pkg-config'
What did you do?
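I was implementing checked multiplication using generics. I wanted to expose a single generic Mul function instead of eight type-specific multiplication functions, so Mul dispatches to the matching helper for each integer type. Here is the Go Playground link: https://go.dev/play/p/neO_wjvQoO3.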
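For readers without access to the playground, the shape of the code can be sketched roughly as follows. This is a reduced reconstruction, not the original source: it covers only two of the eight integer types, the `Integer` constraint and the type-switch dispatch are assumptions, and the helpers are written as plain functions even though the compiler output suggests the originals are themselves generic. Only the body of `mulUint8` follows the source lines visible in the disassembly.

```go
package main

import (
	"fmt"
	"math"
)

// Integer is a hypothetical constraint, reduced to two types for brevity.
type Integer interface {
	int8 | uint8
}

// Mul is the single exported entry point. The dispatch mechanism shown
// here (a type switch on any(a)) is an assumption.
func Mul[T Integer](a, b T) (T, bool) {
	switch x := any(a).(type) {
	case uint8:
		r, ok := mulUint8(x, any(b).(uint8))
		return T(r), ok
	case int8:
		r, ok := mulInt8(x, any(b).(int8))
		return T(r), ok
	}
	panic("unreachable")
}

// mulUint8 multiplies in a wider type and reports whether the result fits.
func mulUint8(a, b uint8) (uint8, bool) {
	res := uint16(a) * uint16(b)
	if res <= math.MaxUint8 {
		return uint8(res), true
	}
	return 0, false
}

// mulInt8 is the signed counterpart (assumed to mirror mulUint8).
func mulInt8(a, b int8) (int8, bool) {
	res := int16(a) * int16(b)
	if res >= math.MinInt8 && res <= math.MaxInt8 {
		return int8(res), true
	}
	return 0, false
}

func main() {
	fmt.Println(Mul(uint8(16), uint8(15))) // fits in uint8
	fmt.Println(Mul(uint8(16), uint8(16))) // overflows uint8
}
```

The point of this layout is that callers see only `Mul`, and each helper is small enough to be an inlining candidate on its own.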
What did you see happen?
1. go build -o bin main.go
2. go tool objdump -S -s main\.main bin
TEXT main.main(SB) /home/antoni/Dokumenty/Programowanie/Go/checked-go/main/main.go
func main() {
0x46fa80 55 PUSHQ BP
0x46fa81 4889e5 MOVQ SP, BP
0x46fa84 4883ec18 SUBQ $0x18, SP
println(Mul(a, b))
0x46fa88 0fb60df2070900 MOVZX main.b(SB), CX
0x46fa8f 0fb61dea070900 MOVZX main.a(SB), BX
0x46fa96 488d0543330300 LEAQ main..dict.Mul[uint8](SB), AX
0x46fa9d 0f1f00 NOPL 0(AX)
0x46faa0 e8bb000000 CALL main.Mul[go.shape.uint8](SB)
0x46faa5 88442415 MOVB AL, 0x15(SP)
0x46faa9 885c2414 MOVB BL, 0x14(SP)
0x46faad e8ee8cfcff CALL runtime.printlock(SB)
0x46fab2 0fb6542415 MOVZX 0x15(SP), DX
0x46fab7 0fb6c2 MOVZX DL, AX
0x46faba e8a192fcff CALL runtime.printuint(SB)
0x46fabf 90 NOPL
0x46fac0 e8db8efcff CALL runtime.printsp(SB)
0x46fac5 0fb6442414 MOVZX 0x14(SP), AX
0x46faca e8518ffcff CALL runtime.printbool(SB)
0x46facf e80c8ffcff CALL runtime.printnl(SB)
0x46fad4 e8278dfcff CALL runtime.printunlock(SB)
println(mulUint8(a, b))
0x46fad9 0fb615a1070900 MOVZX main.b(SB), DX
0x46fae0 0fb63599070900 MOVZX main.a(SB), SI
res := uint16(a) * uint16(b)
0x46fae7 0faff2 IMULL DX, SI
0x46faea 6689742416 MOVW SI, 0x16(SP)
if res <= math.MaxUint8 {
0x46faef 6681feff00 CMPW SI, $0xff
0x46faf4 7602 JBE 0x46faf8
0x46faf6 31f6 XORL SI, SI
println(mulUint8(a, b))
0x46faf8 4088742413 MOVB SI, 0x13(SP)
0x46fafd 0f1f00 NOPL 0(AX)
0x46fb00 e89b8cfcff CALL runtime.printlock(SB)
if res <= math.MaxUint8 {
0x46fb05 0fb7442416 MOVZX 0x16(SP), AX
0x46fb0a 663dff00 CMPW AX, $0xff
0x46fb0e 0f96c0 SETBE AL
0x46fb11 88442412 MOVB AL, 0x12(SP)
println(mulUint8(a, b))
0x46fb15 0fb6442413 MOVZX 0x13(SP), AX
0x46fb1a 0fb6c0 MOVZX AL, AX
0x46fb1d 0f1f00 NOPL 0(AX)
0x46fb20 e83b92fcff CALL runtime.printuint(SB)
0x46fb25 e8768efcff CALL runtime.printsp(SB)
0x46fb2a 0fb6442412 MOVZX 0x12(SP), AX
0x46fb2f e8ec8efcff CALL runtime.printbool(SB)
0x46fb34 e8a78efcff CALL runtime.printnl(SB)
0x46fb39 e8c28cfcff CALL runtime.printunlock(SB)
}
0x46fb3e 4883c418 ADDQ $0x18, SP
0x46fb42 5d POPQ BP
0x46fb43 c3 RET
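The call to Mul[uint8] is not inlined: main still contains CALL main.Mul[go.shape.uint8](SB).
The call to mulUint8 is inlined: its body (IMULL, CMPW, JBE) appears directly in main.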
3. go build -gcflags=m
# command-line-arguments
./main.go:56:6: can inline mulUint8[go.shape.uint8]
./main.go:65:6: can inline mulInt8[go.shape.int8]
./main.go:74:6: can inline mulUint16[go.shape.uint16]
./main.go:83:6: can inline mulInt16[go.shape.int16]
./main.go:92:6: can inline mulUint32[go.shape.uint32]
./main.go:101:6: can inline mulInt32[go.shape.int32]
./main.go:110:6: can inline mulUint64[go.shape.uint64]
./main.go:119:6: can inline mulInt64[go.shape.int64]
./main.go:56:6: can inline mulUint8[uint8]
./main.go:110:6: can inline mulUint64[uint64]
./main.go:101:6: can inline mulInt32[int32]
./main.go:92:6: can inline mulUint32[uint32]
./main.go:83:6: can inline mulInt16[int16]
./main.go:74:6: can inline mulUint16[uint16]
./main.go:65:6: can inline mulInt8[int8]
./main.go:16:6: can inline Mul[uint8]
./main.go:21:23: inlining call to mulUint8[go.shape.uint8]
./main.go:24:22: inlining call to mulInt8[go.shape.int8]
./main.go:29:24: inlining call to mulUint16[go.shape.uint16]
./main.go:32:23: inlining call to mulInt16[go.shape.int16]
./main.go:37:24: inlining call to mulUint32[go.shape.uint32]
./main.go:40:23: inlining call to mulInt32[go.shape.int32]
./main.go:45:24: inlining call to mulUint64[go.shape.uint64]
./main.go:48:23: inlining call to mulInt64[go.shape.int64]
./main.go:148:18: inlining call to mulUint8[go.shape.uint8]
./main.go:56:6: inlining call to mulUint8[go.shape.uint8]
./main.go:119:6: inlining call to mulInt64[go.shape.int64]
./main.go:110:6: inlining call to mulUint64[go.shape.uint64]
./main.go:101:6: inlining call to mulInt32[go.shape.int32]
./main.go:92:6: inlining call to mulUint32[go.shape.uint32]
./main.go:83:6: inlining call to mulInt16[go.shape.int16]
./main.go:74:6: inlining call to mulUint16[go.shape.uint16]
./main.go:65:6: inlining call to mulInt8[go.shape.int8]
./main.go:52:9: "unreachable" escapes to heap
The compiler says it can inline Mul[uint8] (./main.go:16:6), but the output contains no "inlining call to Mul" line for the call in main.
What did you expect to see?
The call to Mul[uint8] should be inlined. This issue is similar to #75056.
Comment From: gabyhelp
Related Issues
- cmd/compile: unneccesary 0 check in division #45928 (closed)
- cmd/compile: double zeroing and unnecessary copying/stack use #67957 (closed)
- Signed non-negative divide by power of 2. #44530 (closed)
- cmd/compile: a constant expression is moved into a loop #71443
- cmd/compile: (dev.typeparams) assembly generated for a simple generic function is massive #46998 (closed)
- cmd/compile: optimize variables in function calls as equivalent to constant-derived variables #29166
- cmd/compile: loads combining regression #18946 (closed)
- cmd/compile: eliminate more bounds checks #17370
- cmd/compile: stringtoslicebytetmp optimization on unescaped slice #38501 (closed)
- cmd/compile: regression: unnecessary spilling to the stack #61356 (closed)