Golang proposal: runtime: allow access to stopTheWorld/startTheWorld via go:linkname

Proposal Details

My module github.com/fjl/memsize walks a data structure and computes its total referred memory size in bytes, as well as producing a report of per-type memory usage. This is meant to be used for debugging memory leaks in live programs in a similar manner to taking a profile over HTTP with pprof, and to my knowledge there exists no other comparable tool.

memsize requires access to runtime.stopTheWorld and runtime.startTheWorld in order to safely walk the object graph without racing with the program that is being debugged. Up until Go version 1.22, I was able to call these functions via go:linkname. Go 1.23 disallows this, and as per https://github.com/golang/go/issues/67401 only specifically approved entry points can be linkname'd.

Please allow access to runtime.stopTheWorld and runtime.startTheWorld by adding a go:linkname directive for Go 1.23.

Alternatively, we could consider adding an exposed API, perhaps in package runtime/debug, to perform a similar task. I could envision something like:

func WithWorldStopped(reason string, f func())

Perhaps access to stopTheWorld/startTheWorld could be granted temporarily during the Go 1.23 cycle while we work out a replacement API.
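
To make the shape of WithWorldStopped a bit more concrete, here is a rough user-space sketch. It is not how the runtime stops the world and cannot replace it: it only pauses goroutines that cooperate by wrapping their writes in Mutate. The names world, Mutate and snapshotConsistent are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// world is a hypothetical, cooperative stand-in for the runtime's
// stop-the-world. Mutators hold the read side, the inspector the write side.
var world sync.RWMutex

// a and b are always incremented together, but not as one atomic step.
var a, b atomic.Int64

// Mutate runs f while holding the read lock, so mutators still run
// concurrently with each other.
func Mutate(f func()) {
	world.RLock()
	defer world.RUnlock()
	f()
}

// WithWorldStopped mirrors the proposed signature. In this sketch it only
// waits for cooperating mutators; reason is accepted but unused.
func WithWorldStopped(reason string, f func()) {
	world.Lock()
	defer world.Unlock()
	f()
}

// snapshotConsistent starts workers that update a and b, and reports whether
// every snapshot taken under WithWorldStopped saw a == b.
func snapshotConsistent(workers, iters int) bool {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < iters; j++ {
				Mutate(func() { a.Add(1); b.Add(1) })
			}
		}()
	}
	consistent := true
	for k := 0; k < 100; k++ {
		WithWorldStopped("snapshot", func() {
			if a.Load() != b.Load() {
				consistent = false
			}
		})
	}
	wg.Wait()
	return consistent
}

func main() {
	fmt.Println("consistent:", snapshotConsistent(4, 1000))
}
```

The point of the callback form is that the world is always restarted when f returns, so callers cannot forget the matching startTheWorld.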

xref https://github.com/fjl/memsize/issues/4
xref https://github.com/golang/go/issues/67401

Comment From: fjl

BTW, I wasn't sure which issue category was appropriate here. Originally went with Bug, but it didn't really fit either. Proposal seems kind of strongly-worded, I really just want to get my request in.

Comment From: seankhliao

Given it is for debugging use, I don't see an issue with requiring the checklinkname flag during a build.

Comment From: fjl

memsize is meant to be integrated into production executables, i.e. you typically set up the memsize web interface handler on the pprof endpoint and then access it the same way you'd access that. It's not like you'd specifically build a debugging executable to use pprof. Most people just have it always enabled so they can go in and debug their service when it is in a state of failure.
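
As a minimal sketch of what "always enabled" looks like for pprof itself, the standard net/http/pprof handlers can be mounted on a mux in the production binary and queried on demand. The helper names newDebugMux and fetchHeapProfile are made up for this example, and the memsize handler is not shown because its API is not described in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux mounts the standard pprof index handler, which also serves
// the named profiles such as /debug/pprof/heap.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	return mux
}

// fetchHeapProfile requests the text form of the heap profile from h and
// returns the HTTP status code and the body length.
func fetchHeapProfile(h http.Handler) (int, int) {
	req := httptest.NewRequest(http.MethodGet, "/debug/pprof/heap?debug=1", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.Len()
}

func main() {
	code, n := fetchHeapProfile(newDebugMux())
	fmt.Println("status:", code, "body bytes:", n)
}
```

In a real service the mux would be served on a private listener rather than exercised through httptest.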

Comment From: ianlancetaylor

CC @golang/runtime

Comment From: navytux

For reference: go123's xruntime also hit this regression when trying to add support for go1.23. There, stopTheWorld is used to serialize modification of program tracepoints with respect to the running program itself:

https://lab.nexedi.com/kirr/go123/-/blob/8299741f/tracing/internal/xruntime/runtime.go#L24-38
https://lab.nexedi.com/kirr/go123/-/blob/8299741f/tracing/tracing.go#L211-225
https://pkg.go.dev/lab.nexedi.com/kirr/go123@v0.0.0-20240626173136-48920809d24c/tracing

lab.nexedi.com/kirr/go123/tracing has been working fine since 2017, and it would be good to preserve support for it.

Thanks in advance, Kirill

Comment From: rsc

stopTheWorld is very sensitive and difficult to use correctly. We are very unlikely to support calling it even using linkname. This is the kind of dangerous, difficult-to-keep-working use that the new linkname restrictions are meant to catch.

Comment From: aktau

Would an enhanced viewcore be an alternative? The desire to improve this is mentioned in https://github.com/golang/go/issues/43930#issuecomment-1852700705 (cc @mknyszek). If I'm not mistaken, the tracking issue is either https://github.com/golang/go/issues/57447 or https://github.com/golang/go/issues/45631.

Enhanced viewcore (gocore) would not support live debugging, but in my experience this is often just a nice-to-have. Most deployments have multiple concurrent tasks, managed by a cluster scheduler (Kubernetes, ...). Tasks go up and down for various reasons (machine reboot, crash, lack of resources, ...) and the job in its entirety should be tolerant to tasks disappearing. Hence, sending a signal to crash and produce a core dump (or heap dump) would be a decent way to identify the exact nature of leaks, for us.

Until this type of functionality is available, we're relegated to reading the code from the allocation stack of the leaked object (likely identified by diffing two heap profiles). This is (far) less productive for identifying leaks than the proposal or an improved core viewer.

Comment From: fjl

Enhanced viewcore would be nice, but it's kind of orthogonal and doesn't cover some use cases. I often use memsize to take snapshot reports over time, checking for trends in the number of reachable objects. Can't do that if the program has to crash to get the heap.

Comment From: navytux

@rsc, thanks for the feedback.

While I completely understand the Go team's desire to close off access to private functionality, my reading of https://github.com/golang/go/issues/67401 is that access will not be closed for symbols that are already used externally, so as not to break existing applications and libraries. In other words, for practical reasons, the set of symbols already used from outside will continue to be provided via go:linkname; only new symbols, and symbols that were not used previously, become inaccessible, to keep the set of private API that is exported in practice from growing.

I agree that in an ideal world the set of private API that is exported in practice would be empty, and that reaching that empty set is the long-term goal.

However, we are not starting from scratch here. Over the years people have relied on go:linkname working and have come to depend on functionality for which there is usually no public equivalent. For example, with Go 1.23 dropping access to stopTheWorld, I'll have a hard time figuring out what to do in my tracing library.

So please reconsider providing access to stopTheWorld via go:linkname.

It has been only a few days since go1.23rc1, and already two projects have reported breakage here. I'm sure there will be more over time.

Kirill

Comment From: rsc

@fjl I don't fully understand why pprof isn't enough to catch memory leaks. Usually when there's a memory leak it shows up as the dominant memory source in a pprof heap profile.

Comment From: fjl

It's more for finding memory leaks where a large object graph is kept alive by a forgotten reference stuck in a slice backing array or channel buffer. AFAIK pprof heap profiles track new allocations of objects by location and type, which is useful, but it doesn't really give an absolute indication of what's still live. You can have lots of allocation activity which is all quickly reclaimed, and that would show up the same way, right?

Comment From: aktau

> Enhanced viewcore would be nice, but it's kind of orthogonal and doesn't cover some use cases. I often use memsize to take snapshot reports over time, checking for trends in the number of reachable objects. Can't do that if the program has to crash to get the heap.

I don't think that's true. You could use a combination of heap profiles and viewcore to find it:

1. Take heap profile at time X
2. Take heap profile at time X+10 (say, with 2x RSS)
3. Diff both profiles.
4. Observe largest remaining stack(s)
5. Kill a process that's big in a way that leaves a core/heapdump.
6. Using enhanced viewcore(1), find what's keeping the objects identified in step 4 alive.

> It's more for finding memory leaks where a large object graph is kept alive by a forgotten reference stuck in a slice backing array or channel buffer. AFAIK pprof heap profiles track new allocations of objects by location and type, which is useful, but it doesn't really give an absolute indication of what's still live. You can have lots of allocation activity which is all quickly reclaimed, and that would show up the same way, right?

No, the heap profile contains four views:

inuse_space, inuse_objects: objects live as of the end of the last completed GC
alloc_space, alloc_objects: objects allocated, as of the end of the last completed GC, since start of process (this is what you were referring to)

Using inuse_space and inuse_objects, you could fulfill step 4 above.
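
The first four steps above can also be approximated in-process. The following is only a sketch, assuming the standard runtime.MemProfile API; the helper names (stackKey, growth, inUseByStack, diffInUse) are invented for illustration, and in practice one would more likely diff two saved profiles with the pprof tool.

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
)

// stackKey identifies an allocation site by its recorded call stack.
type stackKey [32]uintptr

// growth is the increase in in-use bytes for one allocation stack.
type growth struct {
	Stack stackKey
	Delta int64
}

// inUseByStack forces a GC and returns the sampled in-use bytes per
// allocation stack. Values are sampled, so they are estimates.
func inUseByStack() map[stackKey]int64 {
	runtime.GC()
	n, _ := runtime.MemProfile(nil, false)
	var recs []runtime.MemProfileRecord
	for {
		// Leave headroom: the profile can grow between the two calls.
		recs = make([]runtime.MemProfileRecord, n+50)
		var ok bool
		if n, ok = runtime.MemProfile(recs, false); ok {
			recs = recs[:n]
			break
		}
	}
	out := make(map[stackKey]int64, len(recs))
	for _, r := range recs {
		out[stackKey(r.Stack0)] += r.InUseBytes()
	}
	return out
}

// diffInUse returns the stacks whose in-use bytes grew, largest first.
func diffInUse(before, after map[stackKey]int64) []growth {
	var out []growth
	for k, v := range after {
		if d := v - before[k]; d > 0 {
			out = append(out, growth{Stack: k, Delta: d})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Delta > out[j].Delta })
	return out
}

// leak keeps the allocations below reachable, simulating a leak.
var leak [][]byte

func main() {
	before := inUseByStack()
	for i := 0; i < 64; i++ {
		leak = append(leak, make([]byte, 1<<20))
	}
	after := inUseByStack()
	for _, g := range diffInUse(before, after) {
		fmt.Printf("+%d bytes in use, first PC %#x\n", g.Delta, g.Stack[0])
	}
}
```

This shows which allocation stacks grew, not what is retaining the objects, which is the part steps 5 and 6 are meant to answer.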

Comment From: rsc

The pprof profile samples one allocation per half megabyte and records (number of bytes, number of objects) x (allocated, freed). Based on those numbers, there are four profiles you can access: -inuse_space, -inuse_objects, -alloc_space, -alloc_objects. The inuse ones show only live data [allocated minus freed] as of the last GC (useful for identifying leaks), while the alloc ones show total allocation ever (useful for identifying sources of garbage). At this moment I believe the default that pprof shows is -inuse_space but I may be wrong about that.

Comment From: fjl

Yeah, it's fine, leak debugging could probably be done in a different way. I'm not really here to argue my tool is the best. I just brought it up because it got broken by Go 1.23rc, and I think I won't be the only one missing access to STW functionality in the runtime. It's an important facility.

Do you think there is any chance we can find a way to expose access to STW, possibly with the new API I suggested?

Comment From: dominikh

> Can't do that if the program has to crash to get the heap.

It doesn't. You can use tools such as gdb or gcore to get a coredump of a running process.

Comment From: fjl

Yeah, that's certainly a possibility.

Regarding viewcore, it's a bit funny, I wrote memsize originally as a workaround because the heapdump viewer tool (https://github.com/randall77/heapdump14) got broken by runtime changes in Go 1.6 (https://github.com/golang/go/issues/16410#issuecomment-233445705). heapdump14 itself was an attempt to fix an even earlier tool for changes made by Go 1.4 (removal of full type information in dumps). Seems like my workaround was good enough for 6 whole releases :).

Comment From: aclements

With memsize, what's the problem if you just don't stop the world? (Our best guess is that this prevents races that shear interface values, but wanted to ask.)

Comment From: aclements

Our policy with exposing linknames has been to look at transitive imports as a proxy for "importance". It seems like memsize doesn't meet this bar, but I can also imagine it's the type of package you import temporarily when debugging, so it wouldn't show up in an ecosystem analysis anyway.

Comment From: ianlancetaylor

Note that if you only import the package while debugging, it's possible to use the go build or go test option -ldflags=-checklinkname=0 while debugging.

Comment From: fjl

> With memsize, what's the problem if you just don't stop the world? (Our best guess is that this prevents races that shear interface values, but wanted to ask.)

It would just race with everything (including map updates, slice growth, interfaces...), and would give an inconsistent view. The idea was to have a tool that reports on the memory usage of a snapshot of a data structure. And it can also be used to compare such snapshots.

We use it in go-ethereum by registering some of the 'root objects' that hold references to most things in the system. So when it looks like there is a memory-related problem, we can go to the debugging HTTP endpoint, click the button, and check the object counts. There is no special 'debug build', it's just good to have this capability built in.

If this proposal turns out rejected, we'll disable this and move on. It's not the end of the world for me :). But I do think STW is a useful primitive for debugging tools, and the runtime should either expose it in some way, or at least not prevent access to it.

Comment From: DemiMarie

> Enhanced viewcore (gocore) would not support live debugging, but in my experience this is often just a nice-to-have. Most deployments have multiple concurrent tasks, managed by a cluster scheduler (Kubernetes, ...). Tasks go up and down for various reasons (machine reboot, crash, lack of resources, ...) and the job in its entirety should be tolerant to tasks disappearing. Hence, sending a signal to crash and produce a core dump (or heap dump) would be a decent way to identify the exact nature of leaks, for us.

I don’t think it is reasonable to assume that everyone is using a cluster scheduler, or even that most people are. For small-scale deployments, it adds pointless complexity.

Comment From: aktau

> I don’t think it is reasonable to assume that everyone is using a cluster scheduler, or even that most people are. For small-scale deployments, it adds pointless complexity.

I shouldn't have said cluster scheduler. What I meant is that we're talking about debugging a task that has a memory leak. It's going to fail at some point. Making it fail in a way such that it produces a core dump may not be all that different. Additionally, while I haven't done this myself, @dominikh mentions it's possible to extract a core from a running process in https://github.com/golang/go/issues/68167#issuecomment-2194543530.

> We use it in go-ethereum by registering some of the 'root objects' that hold references to most things in the system. So when it looks like there is a memory-related problem, we can go to the debugging HTTP endpoint, click the button, and check the object counts. There is no special 'debug build', it's just good to have this capability built in.

It does sound useful: one can view objects by retention stack instead of allocation stack (which is what normal heap profiles show). It reminds me of the difference between mutex profiles and blocking profiles. It'd be nice to have a "retention profile". Perhaps it could be a feature request? However, I don't see a situation where a core/heap viewer couldn't yield the same information. Importantly though, such a viewer doesn't currently exist, so it's not a real alternative yet. Perhaps -ldflags=-checklinkname=0 isn't a bad workaround for tools/servers you are running until viewcore is fixed.

Comment From: rsc

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

Comment From: rsc

I wrote on June 26 that:

> stopTheWorld is very sensitive and difficult to use correctly. We are very unlikely to support calling it even using linkname. This is the kind of dangerous, difficult-to-keep-working use that the new linkname restrictions are meant to catch.

The discussion so far has not convinced me otherwise. If you have a memory leak, pprof still seems like the right first step. That's always been enough for programs I've debugged. And if you really need to reach for memsize, you can import it and then build with -ldflags=-checklinkname=0.

The downsides of exposing stop-the-world to user code just seem too risky.

Comment From: rsc

Based on the discussion above, this proposal seems like a likely decline. — rsc for the proposal review group

Comment From: rsc

No change in consensus, so declined. — rsc for the proposal review group