Go version
go1.23.7
Output of go env in your module/workspace:
GO111MODULE='on'
GOARCH='arm64'
GOBIN='/usr/local/bin/'
GOCACHE='/Users/user/Library/Caches/go-build'
GOENV='/Users/user/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/user/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/user/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.23.7'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/user/Library/Application Support/go/telemetry'
GCCGO='gccgo'
GOARM64='v8.0'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/user/go/projects/github/cyolosecurity/cyolo-core/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/sk/l0sxrtq14h9ck0kv43dns5_h0000gn/T/go-build346811353=/tmp/go-build -gno-record-gcc-switches -fno-common'
What did you do?
Following this discussion, I ran a program with a blocking finalizer and compared the output of the goroutine profile at debug=1 and debug=2.
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
	"runtime"
	"sync"
	"time"
)

func main() {
	go func() {
		addr := fmt.Sprintf("0.0.0.0:%d", 6060)
		fmt.Printf("Starting PPROF: http://%s/debug/pprof/\n", addr)
		fmt.Printf("PPROF stopped: %s\n", http.ListenAndServe(addr, nil).Error())
	}()

	fmt.Println("Blocking finalizer forever...")
	mu := sync.Mutex{}
	mu.Lock()
	x := new([16]byte)
	runtime.SetFinalizer(x, func(_ *[16]byte) {
		mu.Lock() // blocks forever: mu is already held and never unlocked
	})
	runtime.KeepAlive(x)
	x = nil
	runtime.GC()
	for {
		time.Sleep(time.Second)
	}
}
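(The debug=1 and debug=2 dumps below are the goroutine profile that net/http/pprof serves, i.e. http://localhost:6060/debug/pprof/goroutine?debug=1 and ?debug=2.)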
What did you see happen?
debug=1
1 @ 0x102ed7c58 0x102eb5728 0x102eb5705 0x102ed90b8 0x102ef4744 0x1030af354 0x1030af301 0x102e7ce94 0x102edffc4
# 0x102ed90b7 sync.runtime_SemacquireMutex+0x27 /Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/runtime/sema.go:95
# 0x102ef4743 sync.(*Mutex).lockSlow+0x173 /Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/sync/mutex.go:173
# 0x1030af353 sync.(*Mutex).Lock+0x73 /Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/sync/mutex.go:92
# 0x1030af300 main.main.func2+0x20 /Users/user/Library/Application Support/JetBrains/GoLand2024.3/scratches/scratch_71.go:24
# 0x102e7ce93 runtime.runfinq+0x3e3 /Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/runtime/mfinal.go:255
debug=2
goroutine 33 [sync.Mutex.Lock]:
sync.runtime_SemacquireMutex(0x0?, 0x0?, 0x0?)
	/Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/runtime/sema.go:95 +0x28
sync.(*Mutex).lockSlow(0x14000114108)
	/Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/sync/mutex.go:173 +0x174
sync.(*Mutex).Lock(...)
	/Users/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.darwin-arm64/src/sync/mutex.go:92
main.main.func2(0x0?)
	/Users/user/Library/Application Support/JetBrains/GoLand2024.3/scratches/scratch_71.go:24 +0x74
What did you expect to see?
We anticipated that the debug=2 goroutine trace would include runtime.runfinq, showing the path to the finalizer goroutine just as the debug=1 output does. Its absence made identifying the root cause of our blocked finalizer significantly harder: we found no occurrences of mfinal.go anywhere in the trace.
Comment From: gabyhelp
Related Issues
- runtime/pprof: block and mutex profile stacks sometimes have "gowrap" root frames #69294
- runtime/pprof: collecting mutex or block profiles add ~100ms of wall time #63078 (closed)
- net/http/pprof: coroutines + pprof makes the program panic #69998 (closed)
- runtime/pprof: collecting a CPU profile takes ~200ms of wall time #63043
- runtime/pprof: TestGoroutineProfileConcurrency failures with `profile includes finalizer` #56875
- runtime/pprof: TestBlockProfile fails on windows #6975 (closed)
- runtime: async preemption causes short sleep in tickspersecond #63103 (closed)
- cgo: GOMAXPROCS deadlock #2001 (closed)
- Visiting goroutine stacks by pprof can block all goroutines #35664 (closed)
- runtime: some kind deadlock issue with the new scheduler #4998 (closed)
Comment From: prattmic
Thanks, I agree this is misleading and I think it should be fixed.
Both pprof and debug=1 profiles get the stacks from runtime.pprof_goroutineProfileWithLabels, which does no post-processing of the stacks at all.

debug=2 profiles use runtime.Stack(b, true), which formats the traceback the same way panics do when GOTRACEBACK=all, i.e. excluding runtime frames, generated wrapper functions, etc.
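For concreteness, here is a minimal sketch (mine, not from the issue) contrasting the two paths from within a program: runtime.GoroutineProfile is the exported relative of pprof_goroutineProfileWithLabels and returns raw, unfiltered PC records, while runtime.Stack returns the panic-style filtered text.

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// debug=1-style path: fixed-size records of raw PCs, no frame elision.
	records := make([]runtime.StackRecord, 64)
	if n, ok := runtime.GoroutineProfile(records); ok {
		fmt.Printf("GoroutineProfile: %d goroutines, raw PCs, runtime frames included\n", n)
	}

	// debug=2-style path: formatted text with runtime-internal frames elided,
	// the same way a GOTRACEBACK=all panic traceback elides them.
	buf := make([]byte, 1<<20)
	fmt.Printf("%s", buf[:runtime.Stack(buf, true)])
}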
The runtime.Stack implementation is actually much less efficient than runtime.pprof_goroutineProfileWithLabels, as the former is a full STW while stacks are collected, while the latter has been concurrent since https://go.dev/cl/387415. IMO, runtime.Stack should switch to a concurrent implementation as well.
But that is an aside. While that would fix the immediate problem, I think this problem affects normal panics as well and should be fixed there too.

I think what we want to do is avoid hiding a runtime frame if it is the first frame in the goroutine, as that can make the apparent first frame change across different traces of the same goroutine, which shouldn't be possible and is confusing.
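As a conceptual sketch of that rule (hypothetical names and a crude runtime-frame test; the runtime's real elision logic in runtime/traceback.go is structured differently):

package main

import (
	"fmt"
	"strings"
)

// showFrame is a hypothetical sketch of the proposed rule, not the
// runtime's actual API: elide runtime-internal frames as GOTRACEBACK=all
// does today, except never elide the outermost frame, so a goroutine's
// origin (such as runtime.runfinq) stays visible in every traceback.
func showFrame(funcName string, isOutermost bool) bool {
	if strings.HasPrefix(funcName, "runtime.") && !isOutermost {
		return false
	}
	return true
}

func main() {
	fmt.Println(showFrame("runtime.runfinq", true)) // true: outermost frame is always shown
	fmt.Println(showFrame("runtime.goexit", false)) // false: interior runtime frame is elided
}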
cc @golang/runtime @rhysh
Comment From: prattmic
> I think this problem affects normal panics as well and should be fixed there as well.
Indeed.
package main

import (
	"runtime"
	"sync"
)

func main() {
	var mu sync.Mutex
	mu.Lock()
	b := new([16]byte)
	runtime.SetFinalizer(b, func(*[16]byte) {
		mu.Lock()
	})
	b = nil
	runtime.GC()

	panic("boo")
}
$ GOTRACEBACK=all go run .
panic: boo

goroutine 1 [running]:
main.main()
	/tmp/fintrace/main.go:18 +0xa5

goroutine 5 [sync.Mutex.Lock]:
internal/sync.runtime_SemacquireMutex(0x7ff5c99f0108?, 0x5?, 0x0?)
	/usr/lib/google-golang/src/runtime/sema.go:95 +0x25
internal/sync.(*Mutex).lockSlow(0xc0000100c0)
	/usr/lib/google-golang/src/internal/sync/mutex.go:149 +0x15d
internal/sync.(*Mutex).Lock(...)
	/usr/lib/google-golang/src/internal/sync/mutex.go:70
sync.(*Mutex).Lock(...)
	/usr/lib/google-golang/src/sync/mutex.go:46
main.main.func1(0x0?)
	/tmp/fintrace/main.go:13 +0x2d
exit status 2
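Note that runtime.runfinq is again absent from goroutine 5's traceback, even though that goroutine is the finalizer runner.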
Comment From: gopherbot
Change https://go.dev/cl/660295 mentions this issue: runtime: always show runfinq in traceback
Comment From: gopherbot
Change https://go.dev/cl/660296 mentions this issue: runtime: rename runfinq to runFinalizersAndCleanups
Comment From: rhysh
runtime.pprof_goroutineProfileWithLabels is both concurrent and parallel; the records are of fixed size, so it's not so hard to split the work across multiple Ps as they move goroutines across the profile's snapshot instant. It would be good for runtime.Stack to not have an STW duration proportional to the number of goroutines, yes. It could be made concurrent, but I expect that the variable-size records (text in a []byte) would make parallelism tricky.
> I think what we want to do is avoid hiding a runtime frame if it is the first frame in the goroutine, as that can make the apparent first frame change across different traces of the same goroutine, which shouldn't be possible and is confusing.

I don't follow how the apparent first frame would change; can you say more about that?
Comment From: prattmic
> > I think what we want to do is avoid hiding a runtime frame if it is the first frame in the goroutine, as that can make the apparent first frame change across different traces of the same goroutine, which shouldn't be possible and is confusing.
>
> I don't follow how the apparent first frame would change, can you say more about that?
By "first frame", I meant outermost frame (which is not the terminology the traceback code itself uses, apologies).
So you may take a traceback and see
goroutine 5 [running]:
main.callee()
	/tmp/fintrace/main.go:33 +0x2d
main.finalizer1(0x0?)
	/tmp/fintrace/main.go:13 +0x2d
Then later take another traceback and see
goroutine 5 [running]:
main.callee()
	/tmp/fintrace/main.go:33 +0x2d
main.finalizer2(0x0?)
	/tmp/fintrace/main.go:23 +0x2d
The fact that main.finalizer1 is gone implies that it must have returned. But normally the outermost frame is the function started by the go keyword, so if that function returned, the goroutine should have exited.