As the number of Go implementations continues to increase, the number of cases where the unsafe package is unlikely to work properly also rises. Currently, there are appengine, gopherjs, and possibly wasm, where pointer arithmetic is not allowed.

Currently, protobuf and other packages special-case build tags for appengine and js, and may need to add others in the near future. Blacklisting specific known Go implementations where unsafe does not work does not scale well.
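As a rough illustration of the special-casing described above, a file that relies on unsafe might be excluded from the known restricted environments one tag at a time. This is a minimal sketch, not protobuf's actual code: the function name bytesToString is hypothetical, and the constraint line uses the newer //go:build syntax.

```go
//go:build !appengine && !js

package main

import (
	"fmt"
	"unsafe"
)

// bytesToString (hypothetical helper) converts b to a string without copying
// by reinterpreting the slice header as a string header. This relies on the
// gc toolchain's memory layout, which is why the file is excluded on
// appengine and js by the constraint at the top.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	fmt.Println(bytesToString([]byte("hello")))
}
```

Every new restricted implementation would need its own term added to that constraint line in every such file, which is the scaling problem.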

My proposal is to document safe as a community-agreed-upon tag meaning that unsafe should not be used. It is conceivable that this concept be extended to the compiler rejecting programs that use unsafe when the safe tag is set, but I'm currently more interested as a library owner in knowing whether to avoid unsafe in my own packages.

/cc @zombiezen @dneil @neelance @shurcooL

Comment From: dsnet

The code history of protobuf seems to indicate that this very same concept was discussed but not pursued further. I'd like to push this more since I see this distinction in at least 2 packages I own.

https://github.com/golang/protobuf/issues/154

Comment From: mdempsky

I think standardizing a build tag to indicate whether package unsafe is available makes sense.

It is conceivable that this concept be extended to the compiler rejecting programs that use unsafe when the safe tag is set

I disagree. Currently, build tags are strictly a build-system concept. I'd argue the compiler should remain ignorant of them. cmd/compile already has a -u flag that prevents importing package unsafe, and the build system can arrange to pass -u as appropriate.

Comment From: dmitshur

/cc @bradfitz who also ran into this with go4.org/reflectutil, and seemed to like "safe" at the time.

Comment From: bradfitz

@shurcooL, I'm still fine with "safe", as long as it's defined (i.e. "code that doesn't import unsafe").

But does it also mean no assembly?

Those are the sorts of things that should be clarified, if this is to be blessed somehow. (our own use, wiki page, etc)

Comment From: dmitshur

@dsnet Can you clarify if your proposal is about documenting safe to have a very specific meaning and applied to all packages?

Or is it about documenting the fact that safe is a commonly used build tag for a given purpose, but individual projects still get the final say on the exact meaning of the safe build tag for their own needs?

Comment From: flimzy

Would this proposal be codified in the standard library somehow, perhaps by adding a !safe build tag to the unsafe package? Or would it live purely in documentation?

Comment From: davecb

In a previous life, we had to identify versions of libraries, and rapidly found out that that was too coarse a measure. We eventually attached a label via the linker to each entry point*, and could tell if, for example, a call to memcpy allowed overlap or not.

We also used it in migration work, to identify parts of programs that could not be supported on a different OS or hardware platform.

You arguably should consider labelling parts of the unsafe library with supported and unsupported by target OS, language or whatever, not the whole library if only one operation is unavailable.

--dave [* a description of using per-entry-point labels for a different purpose is at https://leaflessca.wordpress.com/2017/02/12/dll-hell-and-avoiding-an-np-complete-problem/ ]

Comment From: andlabs

Would this unsafe build tag also affect code that uses cgo? SWIG?

How would this build tag interact with the standard library, where both are used a lot? Does no unsafe mean no reflect as well?

What is the unsafe policy on nacl? is there anything in nacl that we could use for this?

@shurcooL it sounds like the latter at minimum, the former ideally.

Comment From: dsnet

I propose that "safe" be a soft signal that a library should have memory safety (i.e., makes no assumptions about how objects are laid out in memory, the architecture endianness, semantics regarding registers, etc).

Thus, the "safe" tag has the following properties:

  • This is just a hint for library authors who want to write code that is highly portable. There is no logic in the compiler or the build tool to enforce this.
  • Thus, the standard library doesn't have to use it. Portability is achieved either by requiring a fork of the standard library (as is the case for gopherjs) or via a series of build tags, as the mainline standard library does for various architectures.
  • Use of reflect is allowed since it doesn't allow you to violate memory safety.
  • Use of cgo is allowed. We already have a build tag for that, which is cgo. In practice, safe implies that cgo is not used since it is difficult to use cgo without pointers (for which you need unsafe).
  • Use of assembly is forbidden. Any reasonable use of assembly makes assumptions about how memory is laid out on the stack and/or heap.

Thus, appengine and gopherjs are example toolchains that would always set the safe tag.
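To make the memory-safety property concrete, here is a sketch of the kind of code a file selected by the safe tag might contain: it decodes an integer without assuming the host's endianness or reinterpreting memory. The function name decodeUint32 is hypothetical, and the constraint line is left out so the sketch builds on its own.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// In a real package this file would carry a build constraint selecting it
// when the safe tag is set; a sibling file with the opposite constraint
// could hold an unsafe-based version of the same function.

// decodeUint32 (hypothetical helper) reads a little-endian uint32 from the
// first four bytes of b. It behaves the same on every architecture because
// it combines the bytes arithmetically instead of casting pointers.
func decodeUint32(b []byte) uint32 {
	return binary.LittleEndian.Uint32(b)
}

func main() {
	fmt.Println(decodeUint32([]byte{0x01, 0x00, 0x00, 0x00})) // prints 1
}
```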

Comment From: rsc

Based on discussion with proposal-review:

  • It seems reasonable for appengine, gopherjs, and wasm, all of which declare their own "restricted build" tag, to agree on a common one.
  • It should mean no asm, no cgo, no unsafe. (It's impossible to use cgo without unsafe.)
  • Nothing in the standard distribution would care; this is really about coordination between these other non-standard environments. Where do you propose to document this?
  • A better name than "safe" would be nice. "purego"?

Comment From: neelance

On WebAssembly: The wasm backend that I'm working on uses a linear memory, so unsafe is fully supported. It also has its own asm instructions, just like other architectures. Cgo is not supported (unless someone wants to do a crazy integration with emscripten).

Comment From: cznic

A better name than "safe" would be nice. "purego"?

:+1: for purego, already using it in my projects for some years.

Comment From: dsnet

I support purego as well. Even if WebAssembly supports unsafe (which is great to hear!), there is always still the use case where someone wants to compile with a pure-Go version for a variety of reasons.

I don't have any great suggestions for where to document this, but perhaps the godoc for go/build?

Comment From: dmitshur

I wanted to point out that in colloquial usage, I've seen "pure Go" most commonly refer to packages that don't use cgo (but can use unsafe, assembly). Seeing it mean "no unsafe and no assembly as well" would require some calibration. But maybe it's fine.

The math/big package contains some precedent on this: it defines a math_big_pure_go tag, which is being used as proposed here (no assembly, no unsafe, no cgo).

Comment From: rsc

OK, purego it is.

Comment From: gopherbot

Change https://golang.org/cl/103239 mentions this issue: go/build: document purego convention

Comment From: cristaloleg

Looks like the patch is still not merged; there is one small suggestion outstanding. Kindly pinging @dsnet 👀

Comment From: cristaloleg

@rsc can this be accepted and merged with #41184? Then the new //go:build syntax would consolidate the community on one safety-oriented tag (which is purego, based on this issue). Thanks.

Comment From: cespare

It's unfortunate that the best reference for this convention is still this issue, since in the intervening four years no documentation change has been merged.

Additionally, I have found during this time that, in practice, I cannot use the purego tag for its intended purpose in our internal codebase at my company. The reason is that we use go-cmp. Some of go-cmp's core functionality is unsafe. That functionality is now behind a purego build tag. Enabling the purego tag makes most of our tests panic. We have packages with asm as well as pure-Go implementations, and we often want to run automated tests of both code paths, but we cannot run the tests with purego, so we end up using a different build tag to indicate "Go rather than assembly".

I'm not even sure what the right fix is. Maybe the ideal outcome would be that something in the Go standard library (testing? reflect?) would allow go-cmp to do what it needs to do without unsafe. But for now, the existence and popularity of go-cmp kind of "infects" the purego tag and makes it a not-very-useful convention.

Comment From: zephyrtronium

I'm not even sure what the right fix is. Maybe the ideal outcome would be that something in the Go standard library (testing? reflect?) would allow go-cmp to do what it needs to do without unsafe.

Perhaps also relevant: #45200

Comment From: kortschak

I'd like to clarify, based on discussion here, whether this should even be a thing. I know that people want it, but it looks like in that discussion it isn't being considered as having any great importance.

Comment From: gopherbot

Change https://go.dev/cl/561935 mentions this issue: crypto: use and test purego tag consistently

Comment From: aykevl

I'm working on TinyGo, and combining these three concepts together (no assembly, no cgo, and no unsafe) seems like a bad idea to me. TinyGo:

  • Supports unsafe, though it has a slightly different memory layout for some types. It should still be possible to write portable unsafe code in most cases (e.g. unsafe.String was a great addition for this).
  • Supports CGo, though it has some features missing compared to the main Go toolchain. These missing features can probably be implemented when needed.
  • Does not support Go assembly. I've tried it, and the only way I got it to work was one giant hack.

Furthermore, there is already a perfectly good tag for CGo support: the cgo build tag. I don't think we need a new build tag that also says something about CGo support. I see there are some systems (like appengine) where unsafe is not allowed, I would suggest using a different build tag for that than the one that controls Go assembly support.

Right now we set the purego build tag by default to get crypto packages to work, but I don't really like it because it's not very clearly defined right now and I'd rather not limit things like unsafe. A build tag like noasm and a separate build tag like nounsafe (or whatever) would be much better in my opinion.

Comment From: gopherbot

Change https://go.dev/cl/660136 mentions this issue: cmd/go: document purego convention

Comment From: FiloSottile

I think we might have decided this wrong. Banning all uses of unsafe under the same build tag as assembly is overly broad.

There are really at least three classes of unsafe: linknames, type conversions (especially now with unsafe.String), and pointer arithmetic. AFAICT only the last one is really non-portable.
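To illustrate two of the classes named above, here is a minimal sketch contrasting a type conversion with pointer arithmetic. It assumes Go 1.20 or later (for unsafe.String and unsafe.SliceData); both function names are hypothetical, and linknames are omitted because they depend on runtime internals.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString (hypothetical) is a type conversion: it views the bytes of b
// as a string without copying. It never computes an address itself.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// secondElem (hypothetical) is pointer arithmetic: it computes the address
// of xs[1] by adding the element size to the base address, which assumes a
// flat, byte-addressable memory model.
func secondElem(xs *[2]uint32) uint32 {
	p := unsafe.Add(unsafe.Pointer(xs), unsafe.Sizeof(xs[0]))
	return *(*uint32)(p)
}

func main() {
	fmt.Println(bytesToString([]byte("go")))
	fmt.Println(secondElem(&[2]uint32{7, 9}))
}
```

The first function expresses intent that an implementation can support however it represents strings; the second only works where raw addresses can be offset and dereferenced.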

On top of https://github.com/golang/go/issues/23172#issuecomment-1000544013 and https://github.com/golang/go/issues/23172#issuecomment-2000390548, which make compelling arguments, CL 657297 made me notice that a low-level package (hash/maphash) depends on crypto/rand under purego because it wanted to avoid a linkname to runtime.rand, while the ask in #47342 was to avoid pointer arithmetic.

Maybe we should rescope purego to only banning assembly and "non-portable" unsafe, leaving the cgo tag and CGO_ENABLED for cgo. What's non-portable unsafe is fuzzy, but after all non-gc implementations are always balancing downstream patches and upstream conveniences, so they will let us know like in #47342. (We certainly can't make the whole standard library build without unsafe.)

Comment From: seankhliao

If it's about documenting convention, there are a lot more examples of: //go:build !purego + import "unsafe": 108: https://github.com/search?q=NOT+is%3Aarchived+NOT+is%3Afork++language%3Ago+%2F%5C%2F%5C%2Fgo%3Abuild+.%21purego%2F+%2F%22unsafe%22%2F&type=code vs //go:build purego + import "unsafe": 14: https://github.com/search?q=NOT+is%3Aarchived+NOT+is%3Afork+language%3Ago+%2F%5C%2F%5C%2Fgo%3Abuild+.%5B%5E%21%5Dpurego%2F+%2F%22unsafe%22%2F&type=code

Related, there seems to be a trend of using some other build tag to guard unsafe, though I don't think there's a strong consensus on safe, unsafe, nounsafe, or something else 71: https://github.com/search?q=NOT+is%3Aarchived+NOT+is%3Afork+language%3Ago+%2F%5C%2F%5C%2Fgo%3Abuild+.safe%2F+%2F%22unsafe%22%2F&type=code however it doesn't seem to combine often with purego: 4: https://github.com/search?q=++NOT+is%3Aarchived+NOT+is%3Afork++language%3Ago+%2F%5C%2F%5C%2Fgo%3Abuild+.safe.purego%2F+%2F%22unsafe%22%2F&type=code + 2: https://github.com/search?q=++NOT+is%3Aarchived+NOT+is%3Afork++language%3Ago+%2F%5C%2F%5C%2Fgo%3Abuild+.purego.*safe%2F+%2F%22unsafe%22%2F&type=code

Total hits for purego (excluding !purego): 345: https://github.com/search?q=NOT+is%3Aarchived+NOT+is%3Afork+language%3Ago+%2F%5C%2F%5C%2Fgo%3Abuild+.*%5B%5E%21%5Dpurego%2F&type=code

Comment From: aclements

Maybe part of the problem here is that we don't have a well-defined boundary of what packages are or are not portable. E.g., this comes up in maphash, which I would argue is tightly coupled with the runtime. Another Go implementation with a different runtime would also have to define its own maphash package. But a lot of the packages in std are portable and only make use of exported APIs. Today we don't draw that boundary.

Comment From: aclements

On top of https://github.com/golang/go/issues/23172#issuecomment-1000544013 and https://github.com/golang/go/issues/23172#issuecomment-2000390548, which make compelling arguments, CL 657297 made me notice that a low-level package (hash/maphash) depends on crypto/rand under purego because it wanted to avoid a linkname to runtime.rand, while the ask in https://github.com/golang/go/issues/47342 was to avoid pointer arithmetic.

IMO, maphash is part of the runtime, and therefore does not need to have a purego implementation, in the same way that the runtime package itself clearly cannot have a purego implementation.

My understanding is that GopherJS depends on the purego implementation of maphash, but I think they could easily provide runtime_rand and runtime_memhash, just as the current purego version of maphash does, and otherwise continue to use the existing maphash package. We could then drop the purego implementation of maphash.

Comment From: rolandshoemaker

It sounds like what we want (correct me if I'm wrong) is for purego to mean "portable Go". That probably means something along the lines of you cannot use assembly, nor probably cgo.

What from unsafe you can use is complicated. Sizeof is maybe fine (although I wonder about host-specific alignment/padding for structs)? I think Alignof and Offsetof are probably out the window. The others I'm not really sure of either way.
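The layout concern can be seen with a small sketch. The struct type record and the function layout are hypothetical; the values printed depend on the target architecture and toolchain, so no expected output is given.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical struct whose padding depends on how the target
// aligns uint64.
type record struct {
	flag  uint8
	count uint64
}

// layout reports the size and alignment of record and the offset of its
// count field, as seen by the toolchain compiling this program.
func layout() (size, align, offset uintptr) {
	var r record
	return unsafe.Sizeof(r), unsafe.Alignof(r), unsafe.Offsetof(r.count)
}

func main() {
	size, align, offset := layout()
	// These numbers may differ between targets (for example, between
	// 32-bit and 64-bit architectures).
	fmt.Println(size, align, offset)
}
```

Code that hard-codes any of these numbers, rather than asking for them, is the kind that stops being portable.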

Comment From: aclements

I filed a separate proposal for dealing with the problems caused by purego maphash: #74285.

Comment From: cherrymui

Based on the discussion above, it still seems unclear what meaning people expect purego to have, and how the tag would be used. Sometimes it is meant for other (non-gc) implementations of the Go distribution (note that a non-gc Go distribution could support cgo and unsafe, e.g. gccgo). Sometimes it might mean "safe"? And sometimes it could mean "portable Go" as @rolandshoemaker mentioned above.

In the standard library, besides hash/maphash, the purego tag is used in crypto packages mostly to provide a generic fallback, to support platforms that don't have the assembly implementation. According to @rolandshoemaker, it is unclear how/whether the tag is used on a platform that does have assembly support, in which case the GOARCH build tag would just do the same thing.

Could someone who actively uses the purego tag in their code comment on what the intention is? Thanks.

Comment From: FiloSottile

The systematic use of purego in crypto serves two purposes: TinyGo (which previously had a number of crypto packages unnecessarily marked as broken), and testing generic fallbacks on dev machines (which are always arm64 or amd64, which generally have assembly).
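The second purpose, exercising a generic fallback on a machine that would normally take the optimized path, can be sketched as below. This is only an illustration under stated assumptions: xorGeneric, xorWords, and agree are hypothetical names, xorWords merely stands in for an assembly routine, and in a real package the two implementations would live in separate files selected by the purego tag rather than side by side.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// xorGeneric is the straightforward fallback: one byte at a time.
// All three slices are assumed to have the same length.
func xorGeneric(dst, a, b []byte) {
	for i := range dst {
		dst[i] = a[i] ^ b[i]
	}
}

// xorWords stands in for an optimized routine: it processes eight bytes at
// a time and then handles the remaining tail byte by byte.
func xorWords(dst, a, b []byte) {
	i := 0
	for ; i+8 <= len(dst); i += 8 {
		x := binary.LittleEndian.Uint64(a[i:]) ^ binary.LittleEndian.Uint64(b[i:])
		binary.LittleEndian.PutUint64(dst[i:], x)
	}
	for ; i < len(dst); i++ {
		dst[i] = a[i] ^ b[i]
	}
}

// agree reports whether both implementations produce identical output for
// equal-length inputs a and b.
func agree(a, b []byte) bool {
	got := make([]byte, len(a))
	want := make([]byte, len(a))
	xorWords(got, a, b)
	xorGeneric(want, a, b)
	return bytes.Equal(got, want)
}

func main() {
	a := []byte("0123456789abcdef!")
	b := []byte("fedcba9876543210?")
	fmt.Println(agree(a, b)) // prints true
}
```

With tag-selected files, the same comparison would be run twice, once with and once without -tags purego, so both code paths get covered on one machine.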