#!watchflakes
post <- pkg == "runtime" && test ~ `TestSegv` && `schedule: holding locks`

Issue created automatically to collect these failures.

Example (log):

--- FAIL: TestSegv (0.00s)
    --- FAIL: TestSegv/SegvInCgo (0.01s)
        crash_cgo_test.go:656: /tmp/workdir/tmp/go-build2656727872/testprogcgo.exe SegvInCgo: exit status 2
        crash_cgo_test.go:657: fatal error: schedule: holding locks
            SIGSEGV: segmentation violation
            PC=0x3e4d5f m=4 sigcode=65537 addr=0x0

            goroutine 1 gp=0xc0000061c0 m=4 mp=0xc000088008 [syscall]:
            syscall.Syscall(0x25, 0x2af3, 0xb, 0x0)
                /tmp/workdir/go/src/syscall/asm_unix_amd64.s:19 +0x9 fp=0xc000065e20 sp=0xc000065e10 pc=0x3e4d49
...
            r13    0x1
            r14    0xc0000061c0
            r15    0x3
            rip    0x3e4d5f
            rflags 0x206
            cs     0x43
            fs     0x13
            gs     0x1b

        crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output

watchflakes

Comment From: gopherbot

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime" && test == "TestSegv"
2024-01-31 22:40 freebsd-amd64-12_3 go@e069587c runtime.TestSegv (log) --- FAIL: TestSegv (0.00s) --- FAIL: TestSegv/SegvInCgo (0.01s) crash_cgo_test.go:656: /tmp/workdir/tmp/go-build2656727872/testprogcgo.exe SegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x3e4d5f m=4 sigcode=65537 addr=0x0 goroutine 1 gp=0xc0000061c0 m=4 mp=0xc000088008 [syscall]: syscall.Syscall(0x25, 0x2af3, 0xb, 0x0) /tmp/workdir/go/src/syscall/asm_unix_amd64.s:19 +0x9 fp=0xc000065e20 sp=0xc000065e10 pc=0x3e4d49 ... r13 0x1 r14 0xc0000061c0 r15 0x3 rip 0x3e4d5f rflags 0x206 cs 0x43 fs 0x13 gs 0x1b crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output

watchflakes

Comment From: gopherbot

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime" && test == "TestSegv"
2024-01-31 22:40 linux-ppc64le-power9osu go@e069587c runtime.TestSegv (log) --- FAIL: TestSegv (0.00s) --- FAIL: TestSegv/TgkillSegvInCgo (0.03s) crash_cgo_test.go:656: /workdir/tmp/go-build3922838053/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x100071b4 m=3 sigcode=4294967290 addr=0x7fae74000010 goroutine 1 gp=0xc0000021c0 m=3 mp=0xc000043008 [running]: runtime/internal/syscall.Syscall6() /workdir/go/src/runtime/internal/syscall/asm_linux_ppc64x.s:18 +0x24 fp=0xc000051d40 sp=0xc000051d40 pc=0x100071b4 ... r22 0xc0000513a0 r23 0x1 r24 0x0 r25 0x7fae9347f8e0 r26 0x7fae9347ea70 r27 0x7fffc86765c0 r28 0x7fffc867630f r29 0x7fae9347f170 r30 0xc0000021c0 r31 0x100a8a28 pc 0x100071b4 ctr 0x0 link 0x1000717c xer 0x0 ccr 0x24424880 trap 0xc00 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output

watchflakes

Comment From: kolyshkin

Seeing this in 10+ different builds. The oldest failing build seems to be https://ci.chromium.org/ui/b/8757370110283072257

Comment From: kolyshkin

A possible cause is https://go-review.googlesource.com/c/go/+/546135

Comment From: kolyshkin

Cc @cherrymui

Comment From: gopherbot

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime" && test == "TestSegv"
2024-01-31 20:51 freebsd-arm64-dmgk go@5c0d0929 runtime.TestSegv (log) --- FAIL: TestSegv (0.00s) --- FAIL: TestSegv/SegvInCgo (0.05s) crash_cgo_test.go:656: /tmp/workdir-host-freebsd-arm64-dmgk/tmp/go-build2585557007/testprogcgo.exe SegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x495ea4 m=5 sigcode=65537 addr=0x0 goroutine 1 gp=0x40000021c0 m=5 mp=0x4000080008 [syscall]: syscall.Syscall(0x25, 0xee41, 0xb, 0x0) /tmp/workdir-host-freebsd-arm64-dmgk/go/src/syscall/asm_freebsd_arm64.s:15 +0x10 fp=0x400004ce00 sp=0x400004cdf0 pc=0x495e90 ... r26 0x0 r27 0x1 r28 0x40000021c0 r29 0x400004cde8 lr 0x495e90 sp 0x400004cdf0 pc 0x495ea4 fault 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output

watchflakes

Comment From: gopherbot

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime" && test == "TestSegv"
2024-01-31 22:40 linux-mips64le-rtrk go@e069587c runtime.TestSegv (log) --- FAIL: TestSegv (0.00s) --- FAIL: TestSegv/TgkillSegvInCgo (0.06s) crash_cgo_test.go:656: /tmp/gobuilder-mips64le/tmp/go-build3499950964/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x12000675c m=3 sigcode=4294967290 addr=0x2014 goroutine 1 gp=0xc0000021c0 m=3 mp=0xc000041008 [running]: runtime/internal/syscall.Syscall6(0x1469, 0x2014, 0x2019, 0xb, 0x0, 0x0, 0x0) /tmp/gobuilder-mips64le/go/src/runtime/internal/syscall/asm_linux_mips64x.s:20 +0x24 fp=0xc00004fd90 sp=0xc00004fd90 pc=0x12000675c ... r20 0x7bd09b91bb9b8a2a r21 0x17624176b18554d8 r22 0x0 r23 0x120360000 r24 0x8 r25 0xfffc8cd310 r26 0x0 r27 0x0 r28 0x100000000 r29 0xc00004fd90 r30 0xc0000021c0 r31 0x12000670c pc 0x12000675c link 0x12000670c lo 0x120 hi 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output

watchflakes

Comment From: bcmills

We saw this in the LUCI runs on https://go.dev/cl/559796 as well (http://ci.chromium.org/b/8757365114293033025/test-results?sortby=&groupby=).

Comment From: bcmills

More spurious LUCI failures: http://ci.chromium.org/b/8757282631689012161/test-results?sortby=&groupby=

@golang/runtime, is there something we can revert quickly to get the TryBots back into a stable state?

Comment From: gopherbot

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime" && test == "TestSegv"
2024-01-31 22:40 linux-ppc64le-power10osu go@e069587c runtime.TestSegv (log) --- FAIL: TestSegv (0.00s) --- FAIL: TestSegv/TgkillSegvInCgo (0.02s) crash_cgo_test.go:656: /workdir/tmp/go-build2592650088/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x100049b4 m=4 sigcode=4294967290 addr=0x7cc9693df530 goroutine 1 gp=0xc0000021c0 m=4 mp=0xc000043808 [running]: runtime/internal/syscall.Syscall6() /workdir/go/src/runtime/internal/syscall/asm_linux_ppc64x.s:18 +0x24 fp=0xc00004cd40 sp=0xc00004cd40 pc=0x100049b4 ... r22 0xc00004c3a0 r23 0x7cc96acce190 r24 0x7d1d582bd r25 0xa1fb63d07 r26 0x6f8171604 r27 0x7d066e1d r28 0xc579191f r29 0x173b55df r30 0xc0000021c0 r31 0x100a82b8 pc 0x100049b4 ctr 0x0 link 0x1000497c xer 0x0 ccr 0x24424488 trap 0xc00 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output
2024-01-31 23:21 linux-ppc64le-power10osu go@2f6a25f4 runtime.TestSegv (log) --- FAIL: TestSegv (0.00s) --- FAIL: TestSegv/SegvInCgo (0.03s) crash_cgo_test.go:656: /workdir/tmp/go-build315738768/testprogcgo.exe SegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x100821b8 m=7 sigcode=0 addr=0x7e8678000010 goroutine 0 gp=0xc0001021c0 m=7 mp=0xc000100008 [idle]: runtime.rtsigprocmask(0x2, 0x7e86819fe2e0, 0x0, 0x8) /workdir/go/src/runtime/sys_linux_ppc64x.s:404 +0x18 fp=0x7e86819fe2a0 sp=0x7e86819fe2a0 pc=0x100821b8 ... r22 0x7e86811feb38 r23 0x80e8e0 r24 0x4dd50db0 r25 0x10145ad0 r26 0x7e86832edd40 r27 0x0 r28 0x7e86819ff4bc r29 0x0 r30 0xc0001021c0 r31 0x1005e948 pc 0x100821b8 ctr 0x0 link 0x1005ebac xer 0x0 ccr 0x22424422 trap 0xc00 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output

watchflakes

Comment From: prattmic

This definitely looks like a recent regression. I'm looking for a potential cause.

Comment From: prattmic

This is reproducible with `go test -run=TestSegv/TgkillSegvInCgo -count=10000 runtime` (it usually fails in far fewer than 10,000 runs, but I wanted the large count to be sure for the bisect).

Bisect blames https://go.dev/cl/546135.

I think the problem might be that we allow async preemption with locks held (`!canPreemptM(mp) && !noResume`), but `asyncPreempt` calls into the scheduler.
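
For context, here is a minimal, hypothetical Go sketch of the invariant involved, assuming the simplified reading above: the scheduler must never be entered while the current M holds runtime locks, and `canPreemptM` is the guard that normally keeps an async preemption from getting there. The names mirror the real runtime (`canPreemptM`, `asyncPreempt`, `schedule`), but the types and bodies below are illustrative only, not the actual runtime code.

```go
// Hypothetical, simplified sketch; panic() stands in for the runtime's fatal throw.
package main

import "fmt"

type m struct {
	locks int32 // nonzero while this M holds runtime-internal locks
}

// schedule stands in for the scheduler entry point, which must never be
// reached while the current M holds locks ("schedule: holding locks").
func schedule(mp *m) {
	if mp.locks != 0 {
		panic("schedule: holding locks")
	}
	// ... pick a goroutine to run ...
}

// canPreemptM mirrors the guard: async preemption is only safe when the M is
// not holding locks (the real check looks at more than this).
func canPreemptM(mp *m) bool {
	return mp.locks == 0
}

// asyncPreempt is a stand-in for the async preemption path, which eventually
// reenters the scheduler.
func asyncPreempt(mp *m) {
	schedule(mp)
}

func main() {
	mp := &m{locks: 1} // M currently holds a lock

	// Guarded path: preemption is skipped, nothing bad happens.
	if canPreemptM(mp) {
		asyncPreempt(mp)
	}
	fmt.Println("guarded path ok")

	// Unguarded path: preempting anyway reenters the scheduler with locks
	// held and trips the equivalent of the fatal error seen in the logs above.
	defer func() { fmt.Println("recovered:", recover()) }()
	asyncPreempt(mp)
}
```

Under that reading, allowing preemption past the `canPreemptM` guard while panicking would explain the `schedule: holding locks` throws in the logs above.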

cc @cherrymui

Comment From: gopherbot

Change https://go.dev/cl/560455 mentions this issue: Revert "runtime: preempt more aggressively when panicking"

Comment From: gopherbot

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && test ~ `TestSegv` && `schedule: holding locks`
2024-01-31 20:26 gotip-linux-amd64-boringcrypto go@a5fb6560 runtime.TestSegv/TgkillSegvInCgo (log) === RUN TestSegv/TgkillSegvInCgo === PAUSE TestSegv/TgkillSegvInCgo === CONT TestSegv/TgkillSegvInCgo crash_cgo_test.go:656: /home/swarming/.swarming/w/ir/x/t/go-build2174216024/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x404f4e m=4 sigcode=18446744073709551610 addr=0x3e800007fb5 goroutine 1 gp=0xc0000061c0 m=4 mp=0xc00006b808 [running]: runtime/internal/syscall.Syscall6() ... r14 0xc0000061c0 r15 0x1 rip 0x404f4e rflags 0x212 cs 0x33 fs 0x0 gs 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output --- FAIL: TestSegv/TgkillSegvInCgo (0.05s)
2024-01-31 20:51 gotip-linux-amd64 go@5c0d0929 runtime.TestSegv/TgkillSegvInCgo (log) === RUN TestSegv/TgkillSegvInCgo === PAUSE TestSegv/TgkillSegvInCgo === CONT TestSegv/TgkillSegvInCgo crash_cgo_test.go:656: /home/swarming/.swarming/w/ir/x/t/go-build3106397352/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x404f4e m=4 sigcode=18446744073709551610 addr=0xe31a goroutine 1 gp=0xc0000061c0 m=4 mp=0xc00006b808 [running]: runtime/internal/syscall.Syscall6() ... r14 0xc0000061c0 r15 0xd rip 0x404f4e rflags 0x212 cs 0x33 fs 0x0 gs 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output --- FAIL: TestSegv/TgkillSegvInCgo (0.08s)
2024-01-31 20:51 gotip-linux-arm64 go@5c0d0929 runtime.TestSegv/TgkillSegvInCgo (log) === RUN TestSegv/TgkillSegvInCgo === PAUSE TestSegv/TgkillSegvInCgo === CONT TestSegv/TgkillSegvInCgo crash_cgo_test.go:656: /home/swarming/.swarming/w/ir/x/t/go-build2401246298/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x403ec0 m=5 sigcode=18446744073709551610 addr=0x0 goroutine 1 gp=0x40000021c0 m=5 mp=0x4000088008 [running]: runtime/internal/syscall.Syscall6(0x83, 0x6b391, 0x6b3a8, 0xb, 0x0, 0x0, 0x0) ... r27 0x703000 r28 0x40000021c0 r29 0x4000062d68 lr 0x403e7c sp 0x4000062d70 pc 0x403ec0 fault 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output --- FAIL: TestSegv/TgkillSegvInCgo (0.04s)
2024-01-31 22:40 gotip-linux-amd64-longtest-race go@e069587c runtime.TestSegv/TgkillSegvInCgo (log) === RUN TestSegv/TgkillSegvInCgo === PAUSE TestSegv/TgkillSegvInCgo === CONT TestSegv/TgkillSegvInCgo crash_cgo_test.go:656: /home/swarming/.swarming/w/ir/x/t/go-build4182855096/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x404f4e m=3 sigcode=18446744073709551610 addr=0x3e800037666 goroutine 1 gp=0xc0000061c0 m=3 mp=0xc00006b008 [running]: runtime/internal/syscall.Syscall6() ... r14 0xc0000061c0 r15 0x3 rip 0x404f4e rflags 0x212 cs 0x33 fs 0x0 gs 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output --- FAIL: TestSegv/TgkillSegvInCgo (0.02s)
2024-01-31 22:40 gotip-linux-amd64-staticlockranking go@e069587c runtime.TestSegv/TgkillSegvInCgo (log) === RUN TestSegv/TgkillSegvInCgo === PAUSE TestSegv/TgkillSegvInCgo === CONT TestSegv/TgkillSegvInCgo crash_cgo_test.go:656: /home/swarming/.swarming/w/ir/x/t/go-build2594685541/testprogcgo.exe TgkillSegvInCgo: exit status 2 crash_cgo_test.go:657: fatal error: schedule: holding locks SIGSEGV: segmentation violation PC=0x404f4e m=5 sigcode=18446744073709551610 addr=0x3e80000537e goroutine 1 gp=0xc0000061c0 m=5 mp=0xc000100008 [running]: runtime/internal/syscall.Syscall6() ... r14 0xc0000061c0 r15 0x9 rip 0x404f4e rflags 0x212 cs 0x33 fs 0x0 gs 0x0 crash_cgo_test.go:688: unexpectedly saw "fatal error: " in output --- FAIL: TestSegv/TgkillSegvInCgo (0.04s)

watchflakes

Comment From: cherrymui

The culprit CL has been reverted. Removing the "Soon" label.