@rsc says that watchflakes has code to detect broken builders. It would be nice to turn that awareness into issues, especially for breakages on x/ repos, where problems tend to go unnoticed for a long time.
cc @cherrymui, @bcmills
Comment From: bcmills
I suggest:
- If the main repo is broken, watchflakes should automatically file a single issue for all affected builders, and automatically close the issue if/when all of those builders pass again.
- If only a specific repo is broken, watchflakes should automatically file a single issue for all affected builders, and automatically close it if/when they pass again.
- If watchflakes posts more than N (5?) failures for a single builder, GOOS, or GOARCH in a 24h period, it should file an issue with a default rule for the affected builder, GOOS, or GOARCH (to avoid spamming the issue tracker with duplicates). (#59379 is an example of such an issue.)
Comment From: bcmills
I just filed #61891 for a runtime/pprof test failure introduced about a week ago. It would be nice not to have to file issues like that manually.
Comment From: bcmills
Some more examples of test failures that had to be reported manually:
- Subrepo failures in x/mobile and x/pkgsite-metrics due to the changes for #61035.
- Long-standing test failures in x/telemetry.
- #62137
- #62138
As far as I can tell, those failures were only noticed because I happened to look at the dashboard this morning.
Comment From: bcmills
More examples this week:
- https://go.dev/cl/527758 was broken on js and wasip1 since last Wednesday (now fixed).
- The -longtest builders have been broken on x/tools since last Tuesday (#62703).
- The runtime package has been failing two tests on the ios builder since Sep. 8 (#62700, #62671).
- x/telemetry has been persistently broken on js/wasm (now fixed) and wasip1/wasm (#62704) since Aug. 21.
Comment From: bcmills
x/build has been broken on windows/386 since June (#63243).
Comment From: bcmills
- #63794 also had to be reported manually, although fortunately it was only ~1 day after the breakage occurred.
Comment From: cherrymui
I found the failure shortly after I submitted the CL, and sent a CL to fix it. I didn't file an issue.
Comment From: findleyr
Right now, there are several broken builders on the dashboard, some going undetected for over a week. It would be good to prioritize this issue for the next friction fixit.
Comment From: gopherbot
Change https://go.dev/cl/601439 mentions this issue: cmd/watchflakes: report consistent failures at top
Comment From: gopherbot
Change https://go.dev/cl/602036 mentions this issue: cmd/watchflakes: add the ability to query for broken bots
Comment From: dmitshur
There's a recent case where watchflakes doesn't seem to be reporting a consistent failure. Filed #68753 for it.