Proposal Details

Nominally we treat the frame pointer as a scratch register, yet the assembler and compiler toolchain avoid touching it at all costs: function epilogues restore SP using additions, stack operations are SP-relative, and although we still materialize the frame pointer, we do not use it in register allocation.

All of this is because tracing, profiling and debugging code relies on the frame pointer being correct for backtraces. We make a point of fixing bugs related to frame pointer corruption and of fixing assembly routines that use the frame pointer: https://github.com/golang/go/issues?q=is%3Aissue%20state%3Aclosed%20frame%20pointer%20in%3Atitle

Unless I am missing something, these bugs are only found the hard way, so I propose that we add the checkframepointer debugging option. When it is enabled, the assembler will, as part of every function epilogue, compare the frame pointer restored from the SP addition with the frame pointer used by tracing / profiling / debugging code. If they do not match, the program will throw (irrecoverably crash).
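To make the comparison concrete, here is a minimal toy model in Go. It is only a sketch: the regs type, prologue and checkEpilogue are hypothetical names, and the layout (frame pointer pointing at a saved slot one word below the top of the frame) is an assumption for illustration, not the code the assembler would actually emit.

```go
package main

import "fmt"

// wordSize is the size of one stack slot in this toy model.
const wordSize = 8

// regs is a hypothetical register file holding only the two registers
// that matter for the check.
type regs struct {
	SP, BP uintptr
}

// prologue models frame setup: allocate frameSize bytes, then point BP at
// the slot just below the top of the frame (assumed layout).
func prologue(r regs, frameSize uintptr) regs {
	r.SP -= frameSize
	r.BP = r.SP + frameSize - wordSize
	return r
}

// checkEpilogue models the proposed check: the frame pointer location
// derived from SP must equal the value currently held in BP.
// A real implementation would throw instead of returning an error.
func checkEpilogue(r regs, frameSize uintptr) error {
	want := r.SP + frameSize - wordSize
	if r.BP != want {
		return fmt.Errorf("frame pointer mismatch: BP=%#x, derived from SP=%#x", r.BP, want)
	}
	return nil
}

func main() {
	r := prologue(regs{SP: 0x1000}, 32)
	fmt.Println("intact frame:", checkEpilogue(r, 32))

	// An assembly routine that uses BP as scratch and forgets to restore it.
	r.BP = 0xdead
	fmt.Println("clobbered frame:", checkEpilogue(r, 32))
}
```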

This is somewhat limited in the bugs it can catch: preemptive tools rely on the frame pointer being correct at every instruction in the program, but that is more expensive to check. You can fool the function epilogue check by corrupting the frame pointer and then restoring it before returning.
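The blind spot can be illustrated with another small Go sketch. The step type and run function are hypothetical; the model simply tracks the frame pointer value after each "instruction" of a toy routine and compares what a sampler could observe mid-function with what an epilogue-only check sees.

```go
package main

import "fmt"

// step is one instruction of a toy routine; it returns the new BP value.
type step func(bp uintptr) uintptr

// run executes body starting from a correct frame pointer. It reports
// whether BP was wrong at any instruction boundary (what a preemptive
// profiler could observe) and whether an epilogue-only check would fail.
func run(goodBP uintptr, body []step) (midCorrupt, epilogueFail bool) {
	bp := goodBP
	for _, s := range body {
		bp = s(bp)
		if bp != goodBP {
			midCorrupt = true
		}
	}
	return midCorrupt, bp != goodBP
}

func main() {
	const goodBP = 0x1018
	body := []step{
		func(bp uintptr) uintptr { return 0xdead }, // use BP as scratch
		func(bp uintptr) uintptr { return goodBP }, // restore it before returning
	}
	mid, end := run(goodBP, body)
	fmt.Println("corrupt mid-function:", mid)
	fmt.Println("caught by epilogue check:", end)
}
```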

This would allow us to run some buildbots with this option enabled, which should catch some bugs before they reach users. Third-party Go modules with assembly routines could also use it in their tests to make sure they are not accidentally corrupting the frame pointer.

Comment From: Jorropo

cc @golang/compiler

Comment From: randall77

At one point I had a CL that used (x-framesize)(FP) instead of x(SP) for all local variable accesses. It would fail hard immediately if FP was ever corrupted. I can't seem to find it now, maybe lost in the ether. I used it to fix a bunch of stdlib assembly when we first enabled FPs.

That CL might be more effective in catching mid-function FP corruption. Hard to know for sure. Just another option for what such a check CL would look like.
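As a rough illustration of why FP-relative addressing surfaces corruption right away, here is a toy Go model. The frame type and its methods are hypothetical, the stack is a small slice indexed by slot, and FP == SP + frameSize is assumed only so that the (x-framesize)(FP) formula above lines up with x(SP).

```go
package main

import (
	"errors"
	"fmt"
)

// frame is a toy activation record. Addresses are slot indices into stack.
type frame struct {
	stack     []int64
	SP, FP    int // FP == SP + frameSize while the frame pointer is intact
	frameSize int
}

// loadViaSP models x(SP): it never looks at FP, so FP corruption goes unnoticed.
func (f *frame) loadViaSP(x int) int64 {
	return f.stack[f.SP+x]
}

// loadViaFP models (x-framesize)(FP): a corrupted FP lands on the wrong
// slot or outside the stack, which stands in for an immediate hard failure.
func (f *frame) loadViaFP(x int) (int64, error) {
	i := f.FP + x - f.frameSize
	if i < 0 || i >= len(f.stack) {
		return 0, errors.New("fault: FP-relative access outside the stack")
	}
	return f.stack[i], nil
}

func main() {
	f := &frame{stack: make([]int64, 8), SP: 2, FP: 6, frameSize: 4}
	f.stack[f.SP+1] = 42 // local variable at offset 1

	v, err := f.loadViaFP(1)
	fmt.Println("intact FP:", v, err)

	f.FP = 1000 // simulated corruption
	fmt.Println("via SP:", f.loadViaSP(1))
	_, err = f.loadViaFP(1)
	fmt.Println("via FP:", err)
}
```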

Comment From: Jorropo

I have a CL that uses LEAVE for compiled functions on amd64, and it segfaults all.bash (failures in cmd/cgo/internal/test, runtime, os and internal/trace) on top of today's master, so I guess it's still a problem.

For compiled code it's very unlikely that an epilogue check would catch anything, but for assembly routines it's not uncommon (~~see #63508 for example~~ actually, reading the issue, it's not clear this is the problem).

Maybe this proposal shouldn't focus on how this is implemented, and we should allow ourselves to implement mid-function checks in the future? The promise would then be that -d=checkframepointer MAY throw if at any point the frame pointer is unexpected.


I'll implement the epilogue checks and we can see if they catch something; it's a really short and easy thing to do.