proposal: new Recorder types in runtime/pprof
Written primarily by me and @prattmic, but with input from the Go team. Consider this as a first take. We're hoping to shake out the details with the community.
Background
The runtime/pprof package provides foundational Go APIs to collect runtime diagnostic data from Go programs. It is critical to Go's production stack, with 22,000+ public package imports.
For the most part this package serves us well, and most profiles are neatly captured by a global Profile value. However, configuring each profile is messy. The sampling and collection configuration of each profile generally consists of some bespoke functions and/or global variables in the runtime and runtime/pprof packages. Meanwhile, the profile format is selected by an opaque and slightly mysterious debug integer. Things get worse in that other data overlaps with the profile data. (For the worst case of this complexity, see Felix Geisendörfer's goroutine profile feature matrix.)
On top of all this, CPU profiles have a completely different API, and yet another new API has an accepted proposal but stagnated before being implemented.
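To make the fragmentation concrete, here is a minimal sketch of how a program configures and collects profiles with the API as it exists today. The helper names configureProfiles and goroutineDump are made up for illustration; the runtime and runtime/pprof calls are the existing ones.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime"
	"runtime/pprof"
)

// configureProfiles (hypothetical helper) sets sampling knobs the way
// programs do today: one package-level variable and two functions in
// package runtime, each with its own unit.
func configureProfiles() {
	runtime.MemProfileRate = 512 * 1024  // about one sample per this many bytes allocated
	runtime.SetBlockProfileRate(10_000)  // about one sample per 10,000ns spent blocked
	runtime.SetMutexProfileFraction(100) // about 1 in 100 contention events
}

// goroutineDump (hypothetical helper) writes the goroutine profile.
// The output format is chosen by the debug integer:
// 0 is gzipped protobuf, 1 is legacy text, 2 is traceback style.
func goroutineDump(debug int) (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, debug); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	configureProfiles()
	out, err := goroutineDump(1)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("goroutine profile: %d bytes\n", len(out))
}
```

Note that the three sampling knobs live in package runtime while collection lives in runtime/pprof, and that nothing in the types says what debug means.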
Goal
The existing API surface has three main problems:
- The documentation for configuration APIs is fragmented and confusing because it's split across multiple packages.
- The configuration APIs directly set global values without any clear ownership, which runs counter to our long-term goal of composing multiple profile consumers.
- It's unclear where new configuration options (such as the proposal to disable compression) should go.
The goal of this proposal is then to create a template for runtime diagnostics APIs that is clear, composable, and extensible.
Proposal
The core idea behind this proposal is to create a suite of recorder types which closely resemble the new FlightRecorder API in the runtime/trace package. Each recorder value is configured in its constructor. Most recorders have a Start and Stop method where representing a window of time is appropriate; the others implement the WriterTo interface, for profiles representing an instant or cumulative data since program start (such as goroutine profiles and heap profiles [1], respectively). We propose having one recorder type for each kind of profile and a separate recorder type for custom profiles. These recorder types will accept a bespoke set of configuration options in their constructors.
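As a rough illustration of the Start/Stop shape, the following toy sketch records the delta of a cumulative counter over a window. Everything in it (deltaRecorder, events, recordWindow) is hypothetical and stands in for runtime internals; it is not the proposed implementation, only the calling convention and error behavior described here.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
	"sync/atomic"
)

// events is a stand-in for cumulative profile data kept by the runtime.
var events atomic.Int64

// deltaRecorder is a hypothetical recorder covering a window of time.
type deltaRecorder struct {
	mu      sync.Mutex
	w       io.Writer
	started bool
	base    int64 // snapshot taken at Start
}

// Start takes the first snapshot and remembers where to write the result.
func (r *deltaRecorder) Start(w io.Writer) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.started {
		return errors.New("recorder already started")
	}
	r.w, r.base, r.started = w, events.Load(), true
	return nil
}

// Stop takes the second snapshot and writes the difference.
func (r *deltaRecorder) Stop() error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !r.started {
		return errors.New("recorder not started")
	}
	r.started = false
	_, err := fmt.Fprintf(r.w, "events: %d\n", events.Load()-r.base)
	return err
}

// recordWindow records a window during which n events occur.
func recordWindow(n int64) (string, error) {
	var sb strings.Builder
	r := &deltaRecorder{}
	if err := r.Start(&sb); err != nil {
		return "", err
	}
	events.Add(n)
	if err := r.Stop(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	events.Add(5) // activity before the window is excluded from the delta
	out, err := recordWindow(3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out) // prints "events: 3"
}
```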
This proposal is intended to supersede https://github.com/golang/go/issues/42502.
In this proposal, we take the stance that these new recorder types should prefer to expose the Start and Stop methods over the WriterTo interface where possible. That is, we choose to focus on delta profiles over snapshot profiles. (Collecting delta profiles is already supported by the net/http/pprof package, but is not supported by any Go API.)
API
Given the core idea, the actual proposed API changes are relatively simple and straightforward, although the API surface is somewhat large. Below is a full listing of all the different recorders we propose. We omit the thread creation profile because it's not useful. (See https://github.com/golang/go/issues/6104.)
package runtime/pprof
type CPURecorder struct { ... }
type CPURecorderConfig struct {
	// Period sets the duration between profile samples.
	//
	// If no value is set, the sample period is implementation-defined.
	// This implementation-defined value is independent of [runtime.SetCPUProfileRate],
	// but the rate set here may be visible to consumers of [StartCPUProfile], so callers
	// are discouraged from using both CPURecorder and StartCPUProfile in the same
	// program.
	Period time.Duration
}
func NewCPURecorder(CPURecorderConfig) (*CPURecorder, error)
// Start applies the recorder's configuration and begins global collection of
// CPU samples.
//
// Returns an error if this recorder had already been started.
func (*CPURecorder) Start(io.Writer) error
// Stop completes collection of the CPU profile.
//
// Returns an error on any failure to write to the io.Writer provided to Start,
// or if the recorder had not been started.
func (*CPURecorder) Stop() error
type AllocRecorder struct { ... }
type AllocRecorderConfig struct {
	// BytesPerSample sets the maximum number of bytes allocated between samples.
	//
	// If no value is set, the sampling rate is implementation-defined.
	// This implementation-defined value is independent of [runtime.MemProfileRate],
	// but the rate set here may be visible to consumers of [Profile.WriteTo].
	BytesPerSample int64
}
func NewAllocRecorder(AllocRecorderConfig) *AllocRecorder
// Start applies the recorder's configuration and takes a snapshot of the profile.
//
// Returns an error if this recorder had already been started.
func (*AllocRecorder) Start(io.Writer) error
// Stop takes a second snapshot and computes the difference with the profile
// taken at start. The resulting profile is written to the io.Writer provided
// to Start. In addition to a profile of memory allocations over this time window,
// the resulting profile also contains a delta of the contents of the live heap
// between the two snapshots, which may be useful for identifying memory leaks.
//
// Returns an error on any failure to write to the io.Writer provided to Start,
// or if the recorder had not been started.
func (*AllocRecorder) Stop() error
type HeapRecorder struct { ... }
type HeapRecorderConfig struct {
	// None for now.
}
func NewHeapRecorder(HeapRecorderConfig) *HeapRecorder
// WriteTo writes a sampled profile of the live heap to the provided io.Writer.
//
// This profile also includes information on all allocations sampled since the program
// started. The sampling rate is controlled by [runtime.MemProfileRate].
func (*HeapRecorder) WriteTo(io.Writer) (int64, error)
type BlockRecorder struct { ... }
type BlockRecorderConfig struct {
	// EventsPerSample sets the number of goroutine block events between samples.
	//
	// If no value is set, the sampling rate is implementation-defined.
	// This implementation-defined value is independent of [runtime.SetBlockProfileRate],
	// but the rate set here may be visible to consumers of [Profile.WriteTo].
	EventsPerSample int
}
func NewBlockRecorder(BlockRecorderConfig) (*BlockRecorder, error)
// Start applies the recorder's configuration and takes a snapshot of the profile.
//
// Returns an error if this recorder had already been started.
func (*BlockRecorder) Start(io.Writer) error
// Stop takes a second snapshot and computes the difference with the profile
// taken at start. The resulting profile is written to the io.Writer provided
// to Start.
//
// Returns an error on any failure to write to the io.Writer provided to Start,
// or if the recorder had not been started.
func (*BlockRecorder) Stop() error
type MutexRecorder struct { ... }
type MutexRecorderConfig struct {
	// EventsPerSample sets the maximum number of unlock events between samples.
	//
	// If no value is set, the sampling rate is implementation-defined.
	// This implementation-defined value is independent of [runtime.SetMutexProfileFraction],
	// but the rate set here may be visible to consumers of [Profile.WriteTo].
	EventsPerSample int
}
func NewMutexRecorder(MutexRecorderConfig) (*MutexRecorder, error)
// Start applies the recorder's configuration and takes a snapshot of the profile.
//
// Returns an error if this recorder had already been started.
func (*MutexRecorder) Start(io.Writer) error
// Stop takes a second snapshot and computes the difference with the profile
// taken at start. The resulting profile is written to the io.Writer provided
// to Start.
//
// Returns an error on any failure to write to the io.Writer provided to Start,
// or if the recorder had not been started.
func (*MutexRecorder) Stop() error
type GoroutineRecorder struct { ... }
type GoroutineRecorderConfig struct {
	Format GoroutineProfileFormat
}
// GoroutineProfileFormat is an enumeration of available formats for writing
// out the goroutine profile.
type GoroutineProfileFormat int
const (
	PprofGoroutineProfile     GoroutineProfileFormat = iota // Default gzipped protobuf.
	TextGoroutineProfile                                    // Legacy text profile.
	TracebackGoroutineProfile                               // Matches default traceback format.
)
func NewGoroutineRecorder(GoroutineRecorderConfig) (*GoroutineRecorder, error)
// WriteTo snapshots the state of all goroutines, assembles it into a profile,
// and writes the result to the provided io.Writer.
func (*GoroutineRecorder) WriteTo(io.Writer) (int64, error)
// ProfileRecorder is a generic profile recorder that works for any profile.
//
// It does not provide as much customizability as the more specific types,
// but works with any Profile, including custom Profiles.
type ProfileRecorder struct { ... }
type ProfileRecorderConfig struct {
	// None for now.
}
func NewProfileRecorder(*Profile, ProfileRecorderConfig) (*ProfileRecorder, error)
// Start applies the recorder's configuration and takes a snapshot of the profile.
//
// Returns an error if this recorder had already been started.
func (*ProfileRecorder) Start(io.Writer) error
// Stop takes a second snapshot and computes the difference with the profile
// taken at start. The resulting profile is written to the io.Writer provided
// to Start.
//
// Returns an error on any failure to write to the io.Writer provided to Start,
// or if the recorder had not been started.
func (*ProfileRecorder) Stop() error
// WriteTo emits all profile data collected since program start until this point.
func (*ProfileRecorder) WriteTo(io.Writer) (int64, error)
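For the snapshot-style recorders, a sketch of how a typed format option could sit on top of today's API may help. The types below (goroutineFormat, goroutineSnapshot, countingWriter) and the helper snapshotSize are hypothetical; the sketch assumes the three formats correspond to the existing debug values 0, 1, and 2 of the goroutine profile, and it uses the (int64, error) result required by io.WriterTo.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"runtime/pprof"
)

// goroutineFormat is a hypothetical typed replacement for the debug integer.
type goroutineFormat int

const (
	formatPprof     goroutineFormat = iota // debug=0: gzipped protobuf
	formatText                             // debug=1: legacy text
	formatTraceback                        // debug=2: traceback format
)

// goroutineSnapshot is a hypothetical snapshot recorder.
type goroutineSnapshot struct{ format goroutineFormat }

// countingWriter tracks how many bytes were written through it.
type countingWriter struct {
	w io.Writer
	n int64
}

func (c *countingWriter) Write(p []byte) (int, error) {
	n, err := c.w.Write(p)
	c.n += int64(n)
	return n, err
}

// WriteTo snapshots all goroutines and writes the profile in the chosen format.
func (g *goroutineSnapshot) WriteTo(w io.Writer) (int64, error) {
	if g.format < formatPprof || g.format > formatTraceback {
		return 0, fmt.Errorf("unknown goroutine profile format %d", g.format)
	}
	cw := &countingWriter{w: w}
	err := pprof.Lookup("goroutine").WriteTo(cw, int(g.format))
	return cw.n, err
}

// Compile-time check that the sketch satisfies io.WriterTo.
var _ io.WriterTo = (*goroutineSnapshot)(nil)

// snapshotSize reports how many bytes a snapshot in format f occupies.
func snapshotSize(f goroutineFormat) (int64, error) {
	return (&goroutineSnapshot{format: f}).WriteTo(io.Discard)
}

func main() {
	g := &goroutineSnapshot{format: formatText}
	if _, err := g.WriteTo(os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```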
Rationale
For the general structure of the API, with bespoke types representing some configuration, the rationale can be found in the lengthy discussion around the FlightRecorder proposal. In short, we want to support multiple consumers with different configurations (hence configurations are values) and we want to give room for configuration options to grow (hence the many recorder types we propose, instead of just one to rule them all). In some ways this proposal is just applying the insights and lessons from the FlightRecorder proposal to the runtime/pprof package.
What is new in this proposal is our focus on delta profiles. We have two reasons for this.
First, delta profiles compose much more cleanly. Long-term, we want to move toward being able to compose multiple profile consumers, but specifically by having configuration options specify a minimum requirement on how much detail is present in the profile. This is much harder to do for profile data that has been collected since program start. Although such profile data is useful, the API that already exists is about as good as we can do anyway.
Second, delta profiles match the models of CPU profiling and runtime execution traces far more closely. This means a more uniform API surface across all our diagnostics, leading to better discoverability of diagnostics, diagnostic configuration options, and a space to grow additional configuration options for each profile type.
As mentioned earlier, we still have to make some exceptions to delta profiles. Certain profiles represent a single instant in time, like goroutine profiles. The heap profile also represents an instant, specifically with the live heap part of the profile. (The heap profile and alloc profile are really the same thing, so this part can get a bit tricky. We choose to continue to have separate types for them, and when taking one of these profiles, the other one simply comes along for the ride.)
Composing consumers
Note that with some of the API above, each consumer may set its own desired sampling rate. For this to compose, the runtime must adjust its internal sampling to the maximum of all requested sampling rates.
This seems simple at first glance, but comes with a significant complication. Namely, the pprof format only supports a single global sampling rate, so there's no way to indicate that some samples have different weights. As a result, all of the tooling makes the same assumption.
Long-term, we believe the fix to this will be random downsampling to the desired rate. For this to work correctly we will need to change the implementation to track the rate that each sample was collected under, as currently all this data is aggregated away. With this information, randomly decimating samples in each sample rate group to the requested sampling rate should be sufficient. Implementing this will likely require some significant restructuring to the sampling bucket infrastructure in the runtime.
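A minimal sketch of the decimation step, under loudly stated assumptions: each sample carries the rate it was collected under (the runtime does not track this today), a higher rate means more samples per unit of work, and a sample collected at rate R is kept with probability r/R for a requested rate r. The names sample, decimate, and decimateSeeded are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sample is hypothetical: it remembers the sampling rate in effect when
// it was collected, which the runtime currently aggregates away.
type sample struct {
	stack string
	rate  int // samples per unit of work; higher means more detail
}

// decimate thins samples down to the requested rate. A sample collected
// at or below the requested rate is kept as is; a sample collected at a
// higher rate is kept with probability requested/rate.
func decimate(samples []sample, requested int, rng *rand.Rand) []sample {
	var kept []sample
	for _, s := range samples {
		if s.rate <= requested || rng.Float64() < float64(requested)/float64(s.rate) {
			kept = append(kept, s)
		}
	}
	return kept
}

// decimateSeeded is a convenience wrapper with a fixed seed, so that
// runs are reproducible.
func decimateSeeded(samples []sample, requested int, seed int64) []sample {
	return decimate(samples, requested, rand.New(rand.NewSource(seed)))
}

func main() {
	samples := make([]sample, 1000)
	for i := range samples {
		samples[i] = sample{stack: "main.work", rate: 100}
	}
	// A consumer that asked for rate 25 should see roughly a quarter of them.
	kept := decimateSeeded(samples, 25, 1)
	fmt.Printf("kept %d of %d samples\n", len(kept), len(samples))
}
```

After decimation, every consumer sees samples at a single effective rate, which is what the pprof format and its tooling assume.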
However, we need not block the new APIs on this work. For one, we take note that, by and large, diagnostics consumers do not adjust the sampling rate, or only do so infrequently. And it's already the case today that the sampling rate parameters should not (or simply cannot) change while profiling is active.
Therefore, I propose the following near-term compromise: if a consumer requests a sampling rate that is identical to the current sampling rate, new consumers are allowed to subscribe. Otherwise, Start returns a descriptive error explaining the issue and the current sampling rate. This compromise is already a step forward because it would allow multiple concurrent profiling consumers at all, and supports the common case without much additional effort required.
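The compromise amounts to a small piece of bookkeeping, sketched below. The rateArbiter type and its methods are hypothetical and only model the rule stated above: the first subscriber fixes the rate, later subscribers must match it, and the rate is free to change again once nobody is subscribed.

```go
package main

import (
	"fmt"
	"sync"
)

// rateArbiter is a hypothetical model of the near-term compromise.
type rateArbiter struct {
	mu          sync.Mutex
	rate        int // sampling rate currently in effect
	subscribers int // number of active consumers
}

// subscribe is what a recorder's Start would call. It fails with a
// descriptive error if profiling is active at a different rate.
func (a *rateArbiter) subscribe(rate int) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.subscribers > 0 && rate != a.rate {
		return fmt.Errorf("profiling already active with sampling rate %d; requested %d", a.rate, rate)
	}
	a.rate = rate
	a.subscribers++
	return nil
}

// unsubscribe is what a recorder's Stop would call.
func (a *rateArbiter) unsubscribe() {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.subscribers > 0 {
		a.subscribers--
	}
}

func main() {
	a := &rateArbiter{}
	fmt.Println(a.subscribe(100)) // first consumer fixes the rate
	fmt.Println(a.subscribe(100)) // same rate: allowed
	fmt.Println(a.subscribe(50))  // different rate: descriptive error
}
```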
Comment From: gabyhelp
Related Issues
- runtime/pprof,net/http/pprof: improve delta profiles efficiency and correctness #67942
- proposal: new net/http/pprof/v2 package #74544
- proposal: runtime/pprof: extend profile configuration to support perf events #53286
Related Code Changes
- runtime/pprof: write profiles in protobuf format.
- runtime: add CPU samples to execution trace
- runtime/pprof: add counting profile and sampling
- runtime/pprof: note different between go test -memprofile and WriteHeapProfile
- runtime/pprof: introduce "allocs" profile
- runtime/pprof: use new profile buffers for CPU profiling
Comment From: prattmic
We discussed this proposal briefly at the Go Contributor Summit at GopherCon Europe (https://go.dev/s/gceu25-summit). Working from memory, I believe the discussion included @felixge @kakkoyun @seankhliao @qmuntal (apologies to those I forgot).
We mostly discussed how to handle composing consumers that have different sample rates. One idea was to allow multiple subscribers but only one "owner" of the sample rate. At the time, I imagined this as an explicit API. What you have proposed is actually quite similar, but it happens implicitly in the existing API.
"if a consumer requests a sampling rate that is identical to the current sampling rate, new consumers are allowed to subscribe. Otherwise, Start returns a descriptive error explaining the issue and the current sampling rate."
Comment From: prattmic
"(The heap profile and alloc profile are really the same thing, so this part can get a bit tricky. We choose to continue to have separate types for them, and when taking one of these profiles, the other one simply comes along for the ride.)"
This feels a bit awkward to me. AllocRecorder seems fine. You get a delta allocation profile plus a bonus snapshot of the live heap.
HeapRecorder is the awkward one since it isn't a delta profile at all. You get a snapshot heap profile plus a bonus allocation profile since program start. Why since program start?
I acknowledge that folks probably do want a way to get heap+allocs together since that is how things have always worked, but I think it would be OK to limit that to AllocRecorder (it could also be an option).
[1] We distinguish "heap" profiles from "alloc" profiles, as runtime/pprof does, even though the data is usually exported together in the same profile.