Go version
go version go1.25.0 darwin/arm64
Output of go env in your module/workspace:
AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/tamird/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/tamird/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/1b/gkj6r3fx7tl0r4sr35x26tkh0000gp/T/go-build796780316=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/tamird/src/synctestrepro/go.mod'
GOMODCACHE='/Users/tamird/go/1.25.0/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/tamird/go/1.25.0'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/tamird/.goenv/versions/1.25.0'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/tamird/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/tamird/.goenv/versions/1.25.0/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.25.0'
GOWORK=''
PKG_CONFIG='pkg-config'
What did you do?
I ported some existing tests to testing/synctest. The code under test uses runtime.AddCleanup to allow my object to manage its own goroutine; this cleanup is akin to time.Timer having learned to clean itself up in Go 1.23.
```go
package repro

import (
	"runtime"
	"testing"
	"testing/synctest"
	"time"
)

// NewReloadingCounter mirrors a typical "reloader" pattern: a background
// goroutine increments a counter until a done channel is closed. The done
// channel is closed by a cleanup registered on the lifetime of the returned
// getter function.
func NewReloadingCounter(period time.Duration) (get func() int64, stopped <-chan struct{}) {
	var count int64
	// Channel created inside the bubble. Under synctest, it is “bubbled”.
	done := make(chan struct{})
	stoppedCh := make(chan struct{})
	get = func() int64 { return count }
	// Cleanup tied to 'get'. When GC collects 'get', close the bubbled channel.
	// Under synctest, cleanup runs outside any bubble, so this close panics:
	// “close of synctest channel from outside bubble”.
	runtime.AddCleanup(&get, func(done chan<- struct{}) { close(done) }, done)
	go func() {
		defer close(stoppedCh)
		t := time.NewTicker(period)
		defer t.Stop()
		for {
			select {
			case <-done:
				return
			case <-t.C:
				count++
			}
		}
	}()
	return get, stoppedCh
}

func TestCleanupPanicsInBubble(t *testing.T) {
	synctest.Test(t, func(t *testing.T) {
		get, _ := NewReloadingCounter(10 * time.Millisecond)
		// Demonstrate useful behavior happens in-bubble (a tick can occur).
		time.Sleep(10 * time.Millisecond)
		synctest.Wait()
		_ = get()
		// Drop the last reference so cleanup can run, then trigger GC.
		get = nil
		runtime.GC()
		// Allow cleanup a moment to run while bubble is alive; this triggers:
		// panic: close of synctest channel from outside bubble
		time.Sleep(1 * time.Millisecond)
	})
}
```
Full repro here.
The actual code I was testing uses this pattern to vend a getter for a file that is frequently read but infrequently written, with the background goroutine refreshing from the filesystem. The ergonomics of the API preclude the usual defer-based cleanup.
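For comparison, here is a minimal sketch of one possible way to sidestep the panic without changing the API shape: the cleanup stores to an atomic flag instead of closing a channel, so it performs no channel operation at all. The names `Counter`, `counterState`, `NewCounter`, and `tickedAtLeastOnce` are hypothetical and not part of the repro. It assumes atomic operations are not associated with a bubble, and I have not verified its behavior under synctest; in particular, Test would still wait for the goroutine to exit.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// counterState is shared between the handle and the background goroutine.
// It holds no channels, so the cleanup never touches a bubbled channel.
type counterState struct {
	count   atomic.Int64
	stopped atomic.Bool
}

// Counter is a hypothetical handle; the cleanup is tied to its lifetime.
type Counter struct {
	state *counterState
}

// Get returns the current count.
func (c *Counter) Get() int64 { return c.state.count.Load() }

// NewCounter starts a goroutine that increments the count every period.
// The goroutine exits on the first tick after the handle's cleanup has run.
func NewCounter(period time.Duration) *Counter {
	st := &counterState{}
	c := &Counter{state: st}
	// The cleanup receives st, not c, so it does not keep the handle alive.
	runtime.AddCleanup(c, func(st *counterState) { st.stopped.Store(true) }, st)
	go func() {
		t := time.NewTicker(period)
		defer t.Stop()
		for range t.C {
			if st.stopped.Load() {
				return
			}
			st.count.Add(1)
		}
	}()
	return c
}

// tickedAtLeastOnce reports whether a fresh counter advanced within 100ms.
func tickedAtLeastOnce() bool {
	c := NewCounter(time.Millisecond)
	time.Sleep(100 * time.Millisecond)
	return c.Get() > 0
}

func main() {
	fmt.Println("ticked at least once:", tickedAtLeastOnce())
}
```

The trade-off is latency: the goroutine only notices the flag on its next tick, whereas closing a channel wakes it immediately.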
What did you see happen?
panic: close of synctest channel from outside bubble
goroutine 36 [running]:
github.com/tamird/synctestrepro.NewReloadingCounter.func2(0x102a8eee0?)
/Users/tamird/src/synctestrepro/counter.go:29 +0x1c
runtime.runCleanups()
/Users/tamird/.goenv/versions/1.25.0/src/runtime/mcleanup.go:665 +0x210
goroutine 1 [chan receive]:
testing.(*T).Run(0x1400010a380, {0x1029224df?, 0x14000120b38?}, 0x102999408)
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:2005 +0x378
testing.runTests.func1(0x1400010a380)
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:2477 +0x38
testing.tRunner(0x1400010a380, 0x14000120c68)
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:1934 +0xc8
testing.runTests(0x14000134018, {0x102a85fe0, 0x2, 0x2}, {0x14000138160?, 0x7?, 0x102a8eac0?})
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:2475 +0x3b8
testing.(*M).Run(0x14000124140)
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:2337 +0x530
main.main()
_testmain.go:47 +0x80
goroutine 18 [runnable]:
fmt.(*pp).free(0x140001360d0?)
/Users/tamird/.goenv/versions/1.25.0/src/fmt/print.go:161 +0xdc
fmt.Sprintf({0x10291890b, 0x5}, {0x14000121b28, 0x1, 0x1})
/Users/tamird/.goenv/versions/1.25.0/src/fmt/print.go:241 +0x6c
testing.fmtDuration(0x140000828c0?)
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:877 +0x8c
testing.tRunner.func1.2({0x102981bc0, 0x14000134060})
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:1866 +0xd0
testing.tRunner.func1()
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:1875 +0x31c
panic({0x102981bc0?, 0x14000134060?})
/Users/tamird/.goenv/versions/1.25.0/src/runtime/panic.go:783 +0x120
internal/synctest.Run(0x14000138020)
/Users/tamird/.goenv/versions/1.25.0/src/runtime/synctest.go:251 +0x2c4
testing/synctest.Test(0x140000828c0, 0x1029994a0)
/Users/tamird/.goenv/versions/1.25.0/src/testing/synctest/synctest.go:282 +0x88
github.com/tamird/synctestrepro.TestReloadingCounter_CleanupPanicsInBubble(0x140000828c0?)
/Users/tamird/src/synctestrepro/counter_test.go:33 +0x24
testing.tRunner(0x140000828c0, 0x102999408)
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:1934 +0xc8
created by testing.(*T).Run in goroutine 1
/Users/tamird/.goenv/versions/1.25.0/src/testing/testing.go:1997 +0x364
goroutine 21 [select (durable), synctest bubble 1]:
github.com/tamird/synctestrepro.NewReloadingCounter.func3()
/Users/tamird/src/synctestrepro/counter.go:37 +0xb4
created by github.com/tamird/synctestrepro.NewReloadingCounter in goroutine 20
/Users/tamird/src/synctestrepro/counter.go:32 +0x128
exit status 2
FAIL github.com/tamird/synctestrepro 0.205s
What did you expect to see?
I think it would be useful to loosen this documented rule:
> Cleanup functions and finalizers registered with runtime.AddCleanup and runtime.SetFinalizer run outside of any bubble.
such that if the cleanup function or finalizer runs while the creating bubble still exists, it is considered to be running in that bubble.
Comment From: gabyhelp
Related Issues
- testing/synctest: bubble not terminating #74837 (closed)
- testing/synctest: be more explicit about goroutine leaks #75052
- testing/synctest: receive on synctest channel from outside bubble when not using synctest #73648 (closed)
- proposal: testing/synctest: create bubbles with Start rather than Run #73062 (closed)
- testing/synctest: Repeated sync.WaitGroup.Add appears flaky under synctest #74386 (closed)
- testing: race condition #74944 (closed)
- runtime: deleted timer was not cleaned up in time, causing the context chain saved in timerCtx to not be GCed in time #60144 (closed)
- runtime: scheduler sometimes starves a runnable goroutine on wasm platforms #65178 (closed)
- testing: "panic: Log in goroutine after Test..." is unreliable due to lack of synchronization on t.done #67701
Comment From: neild
A key concept in synctest is that of the bubble becoming quiescent--that is, being in a state where every goroutine in the bubble is blocked and can only be unblocked by some event arising from within the bubble. When a bubble becomes quiescent, synctest.Wait calls return or (if there is no Wait call) time advances.
If a cleanup function runs inside a bubble, it can unblock a goroutine within that bubble. That means the GC can wake a bubble, which requires adjusting our definition of quiescence. What would that new definition be? How would we implement it?
Comment From: tamird
> If a cleanup function runs inside a bubble, it can unblock a goroutine within that bubble. That means the GC can wake a bubble, which requires adjusting our definition of quiescence. What would that new definition be? How would we implement it?
I suppose that definition would need to be widened to include the GC; events inside the bubble which enable the GC to unblock goroutines are nevertheless inside the bubble. The new definition, I suppose, would be that cleanup functions and finalizers are scoped to the bubble in which they are created; the bubble's quiescence includes a guarantee that objects to which cleanup functions and finalizers were attached in the bubble have been freed if all references to them have been released.
I'd propose adding to Test's documentation which currently reads
> Test waits for all goroutines in the bubble to exit before returning. If the goroutines in the bubble become deadlocked, the test fails.
that it also waits for all cleanup functions and finalizers in the bubble to have run.
As for how to implement it, I really don't know; I've never looked at the guts of the garbage collector.
Comment From: neild
> events inside the bubble which enable the GC to unblock goroutines are nevertheless inside the bubble
What about events outside the bubble?
Attach a cleanup function to a pointer and pass that pointer to a goroutine outside the bubble. Drop the reference inside the bubble. Now that unbubbled goroutine can potentially cause a cleanup function to run inside the bubble by dropping its reference.
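The scenario above can be sketched outside synctest, as a rough illustration only. The names `resource`, `newResource`, and `lastReferenceElsewhere` are hypothetical. The sketch assumes Go 1.24 or later (for runtime.AddCleanup) and relies on runtime.GC plus a short wait to observe the cleanup, since the exact moment a cleanup runs is up to the runtime.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// resource is a hypothetical object with a cleanup attached. It is larger
// than 16 bytes so it is not batched with other tiny allocations.
type resource struct {
	payload [64]byte
}

// newResource allocates a resource whose cleanup closes ran.
//
//go:noinline
func newResource(ran chan struct{}) *resource {
	r := new(resource)
	runtime.AddCleanup(r, func(ch chan struct{}) { close(ch) }, ran)
	return r
}

// lastReferenceElsewhere hands a reference to a second goroutine, drops the
// creator's reference, and reports whether the cleanup ran while the second
// goroutine still held the object and whether it ran after the release.
func lastReferenceElsewhere() (ranWhileHeld, ranAfterRelease bool) {
	ran := make(chan struct{})
	release := make(chan struct{})

	r := newResource(ran)
	go func(held *resource) {
		<-release
		runtime.KeepAlive(held) // held stays reachable until this point
	}(r)
	r = nil // the creator drops its reference

	runtime.GC()
	select {
	case <-ran:
		return true, false
	case <-time.After(50 * time.Millisecond):
	}

	// Only the other goroutine can make the cleanup runnable now.
	close(release)
	deadline := time.Now().Add(5 * time.Second)
	for time.Now().Before(deadline) {
		runtime.GC()
		select {
		case <-ran:
			return false, true
		case <-time.After(10 * time.Millisecond):
		}
	}
	return false, false
}

func main() {
	early, late := lastReferenceElsewhere()
	fmt.Println("ran while held:", early, "ran after release:", late)
}
```

Here the creating goroutine has no say in when the cleanup becomes runnable; the goroutine holding the last reference does, which is the difficulty if the creator is in a bubble and the holder is not.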
I struggle to see a way to make cleanups run inside a bubble that lets us identify when a bubble is quiescent (fundamental to synctest's operation) and doesn't make cleanups in a bubble behave completely differently from ones outside a bubble. Maybe we could have bubbled cleanups executed by the bubble's coordinating goroutine, for example, (so a cleanup which becomes runnable while a bubble is in the process of quiescing gets deferred until after the bubble wakes) but that's a substantial change to cleanup behavior.
There's also the question of what happens when a cleanup executes after the bubble has been destroyed. Do we skip running the cleanup? I suspect there are many use cases for cleanups that would find that a problem.
Comment From: tamird
> What about events outside the bubble?
> Attach a cleanup function to a pointer and pass that pointer to a goroutine outside the bubble. Drop the reference inside the bubble. Now that unbubbled goroutine can potentially cause a cleanup function to run inside the bubble by dropping its reference.
Right, that's analogous to the case of a channel created and passed to a goroutine outside the bubble and interacted with inside the bubble. In the case of the cleanup function, I suppose the cleanup would have to be suspended during the execution of Test.
> I struggle to see a way to make cleanups run inside a bubble that lets us identify when a bubble is quiescent (fundamental to synctest's operation) and doesn't make cleanups in a bubble behave completely differently from ones outside a bubble. Maybe we could have bubbled cleanups executed by the bubble's coordinating goroutine, for example, (so a cleanup which becomes runnable while a bubble is in the process of quiescing gets deferred until after the bubble wakes) but that's a substantial change to cleanup behavior.
This seems parallel to what is already done for goroutines/timers: ordinarily durable quiescence isn’t observable, but synctest does observe it. If we define “quiescent” to include “no runnable goroutines and no runnable bubbled cleanups,” then executing bubbled cleanups via the coordinator preserves determinism without observable behavior changes outside synctest.
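To make that definition concrete, here is a toy model, not a description of how the runtime or synctest is implemented. The type `bubble` and its methods are hypothetical; the model only tracks a count of runnable goroutines and a queue of cleanups that the GC has made runnable, and lets a coordinator drain the queue.

```go
package main

import "fmt"

// bubble is a toy stand-in for a synctest bubble (hypothetical, illustration only).
type bubble struct {
	runnable int      // goroutines currently able to run
	pending  []func() // bubbled cleanups that are runnable but have not run yet
}

// quiescent applies the widened definition: no runnable goroutines and no
// runnable bubbled cleanups.
func (b *bubble) quiescent() bool {
	return b.runnable == 0 && len(b.pending) == 0
}

// queueCleanup records a cleanup that has become runnable.
func (b *bubble) queueCleanup(f func()) {
	b.pending = append(b.pending, f)
}

// drain runs the queued cleanups on the coordinator in FIFO order and
// returns how many ran.
func (b *bubble) drain() int {
	n := 0
	for len(b.pending) > 0 {
		f := b.pending[0]
		b.pending = b.pending[1:]
		f()
		n++
	}
	return n
}

// simulate walks through one cleanup that wakes a blocked goroutine and
// records quiescence after each step.
func simulate() []bool {
	b := &bubble{}
	var states []bool

	// A cleanup becomes runnable; running it will wake one goroutine.
	b.queueCleanup(func() { b.runnable++ })
	states = append(states, b.quiescent()) // cleanup still pending

	b.drain()
	states = append(states, b.quiescent()) // woken goroutine is runnable

	b.runnable-- // the goroutine blocks again or exits
	states = append(states, b.quiescent())
	return states
}

func main() {
	fmt.Println(simulate()) // prints [false false true]
}
```

In this model the bubble never appears quiescent while a cleanup is waiting to run, so a Wait call or a clock advance cannot slip in between the cleanup becoming runnable and its effects.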
> There's also the question of what happens when a cleanup executes after the bubble has been destroyed. Do we skip running the cleanup? I suspect there are many use cases for cleanups that would find that a problem.
You mean a bubbled cleanup executing after the bubble has been destroyed? I would expect that to be considered illegal by making the waiting behavior of Test extend to bubbled cleanups.