The hugo binary gets slower, potentially dramatically so, with GOEXPERIMENT=greenteagc. The root cause is page mapping churn. The Green Tea code introduced a new implicit nil check on a value in a freshly allocated span, in the code that clears some new heap metadata. This nil check reads the fresh memory, causing Linux to back that virtual address space with a read-only (RO) page. The page is then written to almost immediately, which can cause Linux to flush the TLB and find memory to replace the RO page (which is likely deduplicated as just the zero page).
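The read-then-write pattern described above can be approximated outside the runtime. The following is a minimal, Linux-only sketch, not the runtime's actual code: `touchPages` and `timeFresh` are hypothetical names, and the explicit load stands in for the implicit nil check. It maps fresh anonymous memory and compares writing each page directly against reading it first and then writing it. Timings will vary with kernel version, transparent huge page settings, and hardware, so treat any difference as indicative only.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"time"
)

// touchPages writes one byte to every page of mem. If readFirst is set, it
// loads the byte before storing to it, standing in for a load (such as an
// implicit nil check) that precedes the first write to fresh memory.
// It returns the sum of the bytes read, which is 0 for zeroed memory.
//
//go:noinline
func touchPages(mem []byte, pageSize int, readFirst bool) (sum int) {
	for off := 0; off < len(mem); off += pageSize {
		if readFirst {
			// Read of an untouched page: the kernel may satisfy this
			// with a read-only mapping of the shared zero page.
			sum += int(mem[off])
		}
		// Write to the page: if it was mapped read-only above, the kernel
		// has to fault again and replace it with a private writable page.
		mem[off] = 1
	}
	return sum
}

// timeFresh maps size bytes of fresh anonymous memory, touches every page,
// and unmaps it again. It reports how long the touching took.
func timeFresh(size int, readFirst bool) (time.Duration, error) {
	mem, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return 0, err
	}
	defer syscall.Munmap(mem)

	start := time.Now()
	touchPages(mem, os.Getpagesize(), readFirst)
	return time.Since(start), nil
}

func main() {
	const size = 64 << 20 // 64 MiB; an arbitrary size chosen for illustration
	for _, readFirst := range []bool{false, true} {
		d, err := timeFresh(size, readFirst)
		if err != nil {
			fmt.Fprintln(os.Stderr, "mmap failed:", err)
			os.Exit(1)
		}
		fmt.Printf("readFirst=%-5v %v\n", readFirst, d)
	}
}
```

Running the sketch under `perf stat -e page-faults` is one way to see whether the `readFirst` variant takes additional faults on a given system.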

The full investigation is in #73581 and culminates with https://github.com/golang/go/issues/73581#issuecomment-2979062004.

Comment From: gopherbot

Change https://go.dev/cl/684015 mentions this issue: runtime: make explicit nil check in (*spanInlineMarkBits).init

Comment From: mknyszek

Big thanks to @prattmic for working with me to figure this out (and showing me a lot about how to use perf for probing the kernel).

Also big thanks to @bep for reporting the issue, and for reporting with a reproducer. That made it much faster to iterate and ultimately get to the bottom of what turned out to be a very subtle and gnarly issue.