Proposal Details
Currently stack exhaustion in Go is a fatal error triggered by throw in newstack, i.e. non-recoverable. In many (most?) other languages, the equivalent condition is recoverable using standard error handling methods. Go's behavior is a source of many security issues. E.g. just in the standard library we have:
ID | Package | Description |
---|---|---|
Issue #10415 | encoding/gob | stack overflow |
Issue #15879 | path/filepath | Glob with UNC path causes stack overflow |
CVE-2022-1962 | go/parser | stack exhaustion in all Parse* functions |
CVE-2022-24675 | encoding/pem | fix stack overflow in Decode |
CVE-2022-24921 | regexp | stack exhaustion compiling deeply nested expressions |
CVE-2022-28131 | encoding/xml | stack exhaustion in Decoder.Skip |
CVE-2022-30630 | io/fs | stack exhaustion in Glob |
CVE-2022-30631 | compress/gzip | stack exhaustion in Reader.Read |
CVE-2022-30632 | path/filepath | stack exhaustion in Glob |
CVE-2022-30633 | encoding/xml | stack exhaustion in Unmarshal |
CVE-2022-30635 | encoding/gob | stack exhaustion in Decoder.Decode |
CVE-2024-34155 | go/parser | stack exhaustion in all Parse* functions |
CVE-2024-34156 | encoding/gob | stack exhaustion in Decoder.Decode |
CVE-2024-34158 | go/build/constraint | stack exhaustion in Parse |
In similarly complex third-party libraries these issues are likely even more common. Consumers of such libraries are unable to defend against crashes because the error is fatal.
The proposal is to make stack exhaustion a panic instead of a fatal error. Technically this would involve ensuring that enough stack is always available to handle panicking, which is currently not the case.
Comment From: mateusz834
Technically this would involve ensuring that enough stack is always available to handle panicking, which is currently not the case.
Is that possible? What if the deferred function does recursion or an indirect call.
Comment From: jupenur
Is that possible? What if the deferred function does recursion or an indirect call.
By reserving some fixed amount of space on the stack, no, because it's gopanic
itself that calls the deferred functions directly up until recovery. Instead, gopanic
would have to be redesigned so that it unwinds the stack as it goes.
Comment From: Jorropo
I agree that unwinds would solve this for almost all usecases.
We can also do: - Stack overflow during normal operation → panic - Stack overflow during panic → throw
This gives us a way to bound how much memory stacks can use while implementing this proposal.
Comment From: Jorropo
From my memory the default stack limit is 1GiB.
I wonder if recovering from a stack overflow really help against DOS.
Assuming someone used recursion to parse untrusted user input, right now someone can send a highly nested payload causing a throw. If this were recovered panic they could use binary search to tune the size of the payload to use as much stack bellow the limit causing the stack to grow to 1GiB, because of our in-aggressive stack shrinking strategy repeat that with a couple of concurrent requests and you could hold many GiBs of memory with a few hundreds of MiBs payload.
This hypothetical vulnerability is less bad with this proposal implemented but it isn't a golden bullet that let you parse untrusted user input using recursion without a recursion limit or an iterative (non recursive) approach either.
Comment From: jupenur
repeat that with a couple of concurrent requests and you could hold many GiBs of memory with a few hundreds of MiBs payload
Where high memory consumption is expected, this would likely apply equally to the non-recursive approach. So concurrency would have to be limited anyway to mitigate DoS. The typical setup is to use worker goroutines for risky and potentially long-running parsing tasks and cap the number of such goroutines at any given time to some predetermined maximum. The maximum stack size itself is also configurable via runtime/debug.SetMaxStack
.
I do agree that it's not a golden (silver?) bullet, though.
Comment From: jupenur
We wrote a blog post with some additional context and motivation for this proposal here.
Comment From: adonovan
This is an interesting proposal. While Go's growable stacks allow parsers to consume as much stack as they need even for deeply nested valid inputs--a challenging input for parsers on fixed-size stacks--still nothing stops a maliciously crafted string of (((...
from overflowing any finite limit. Defending against this in any parser is a fiddly task as the OP's blog post amply demonstrates, and is typically only done after learning about its absence the hard way.
If stack overflows could be recovered, parsers could also use transient set/restore calls to SetMaxStack to limit the recursion depth in a single code location without widespread depth bookkeeping, thus imposing a smaller limit than the full 1GB (which is usually too generous).
Comment From: mateusz834
If stack overflows could be recovered, parsers could also use transient set/restore calls to SetMaxStack to limit the recursion depth in a single code location without widespread depth bookkeeping
Isn't SetMaxStack
a global setting? It will affect all goroutines. Libraries should not call it. Also i think it would require having access to the current stack size, since you can't just SetMaxStack(1MiB)
, since you don't know the amount of stack used till this call. It would have to be something like: SetMaxStack(CurStackUse()+1MiB)
.
Also it might fall apart if the current stack is already huge at the SetMaxStack
call, it would not shrink it, which will allow more functions to be called than intended.
Comment From: adonovan
Isn't SetMaxStack a global setting?
You're right, I completely misread it. Please interpret my comment instead as a wish for a means of temporarily reducing the current goroutine's additional stack headroom to k (iff larger).
Comment From: randall77
If we do this, I don't want to specify anything in bytes. Different architectures will have different frame sizes for the same function, so a byte limit would mean a different number of frames on different architectures.
Right now that's not a huge deal because the stack overflow detection is outside the language spec. Making it panic
puts it in the language spec.
I'd be happier with a frame count limit. That's easier to reason about from a programmer's point of view. (And we might be able to adjust for things like inlining automatically.)
Comment From: TapirLiu
All allocation failures in Go are unrecoverable now. Any differences between stack exhaustion and heap exhaustion in the situation?
btw, @Jorropo the default stack limit is 512Mib on 64-bit arch. Because the constant for the limit is 1GB, but the stack size must be 2^n
.
Comment From: ianlancetaylor
In general we don't want standard library packages to panic into the calling code. In general code calling into the standard library should not assume that it can recover an unexpected panic. An unexpected panic means that the standard library state may be arbitrarily corrupted.
So I don't see how changing this will affect these security issues. They are security issues whether we use panic or an unrecoverable failure. In any case where the standard library can use up the stack, we should fix the standard library.
So I don't see the point of this change.
Setting that aside, the language requires that deferred functions be run on the stack at the point of the panic. We can't change that. So panicking in general requires an unknown amount of stack space. So I don't see how to implement this safely. Crashing the program if a deferred function needs more stack space than is available just makes the failure case less likely and therefore harder to test for. That seems like a bad choice.
A slightly different idea that might be helpful in some cases, as @adonovan hinted at, would be
package runtime
// SetStackLimit sets the maximum function call depth of this goroutine. A value of 0, which is the default,
// means that there is no limit. If the function call depth is exceeded, the goroutine will panic with a value
// of type [StackLimitError].
//
// This setting does not override the overall stack size limit set by [debug.SetMaxStack]. If that stack size limit
// is exceeded, the program will crash.
//
// This function may be used by, for example, parsers that want to avoid deep recursion while parsing user code.
// Normally a parser using this should recover any panic and convert the [StackLimitError] panic into an ordinary error.
//
// SetStackLimit returns the previous limit.
func SetStackLimit(limit int) int
// StackLimitError is the type of a value passed to panic if the limit set by [SetStackLimit] is exceeded.
// StackLimitError implements [runtime.Error].
type StackLimitError struct { // unexported fields }
But even if this approach seems like a good idea, it has a problem: it's hard to implement efficiently. Maybe if a goroutine calls SetStackLimit we could set the stack guard to always indicate an overflow, and use that to track the function call depth.