Go applications compiled to WASM with the browser as target are quite inefficient and large. Both the download payload and performance could be improved by implementing support for WASM-GC.

Chrome recently enabled WASM-GC https://developer.chrome.com/blog/wasmgc/

Comment From: dominictobias

Another good article: https://v8.dev/blog/wasm-gc-porting

Comment From: xushiwei

This is what I want most in future Go version iterations.

Comment From: syrusakbary

Once Go ships with Wasm-GC support, the binaries produced should be super small... eager to see how the Go team progresses on this!

Comment From: ianlancetaylor

CC @golang/wasm

Comment From: johanbrandhorst

After looking briefly at various articles about this, I expect this will be an enormous amount of work, almost comparable to writing an entirely separate compiler. There would have to be special cases at many levels within the compiler. I would also love to see this, but I think it will be hard without considerable effort.

Comment From: mknyszek

+1 to what @johanbrandhorst said. Another big blocker to this is the fact that the Wasm GC, IIUC, doesn't yet support interior pointers, which are ubiquitous in Go code.
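
For readers unfamiliar with the term, an interior pointer is a pointer into the middle of an allocated object rather than to its start. A minimal sketch of how ordinary Go code produces them follows; the type and function names are made up for illustration.

```go
package main

import "fmt"

type point struct{ x, y int }

type shape struct {
	id     int
	origin point // stored inline, not behind a pointer
}

// bump increments whatever int p points at. It cannot tell whether p points
// at the start of an allocation or somewhere inside one.
func bump(p *int) { *p++ }

// interiorDemo takes the address of a nested struct field and of a slice
// element. Both are interior pointers.
func interiorDemo() (int, int) {
	s := &shape{id: 1}
	pts := make([]point, 3)

	bump(&s.origin.y) // points into the middle of *s
	bump(&pts[2].x)   // points into the middle of the slice's backing array

	return s.origin.y, pts[2].x
}

func main() {
	fmt.Println(interiorDemo())
}
```

Since `bump` accepts any `*int`, every pointer type in the program has to be able to carry an interior address.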

Comment From: daveshanley

Who do I need to pay to make this happen?

tinygo is great, but cannot handle anything beyond basic go apps.

Please. God. This.

Comment From: evanphx

I think folks are presuming that WASM-GC would mean that the generated programs would change a lot. What changes are folks expecting? From the comments above, there is the perception that it would make the binaries smaller, but that's not the case. The code to run the GC is tiny compared to the rest of the program.

Comment From: daveshanley

I am ignorant, so please excuse that fact, but wouldn't completely removing the 1.5mb+ of additional GC golang bulk, only reduce WASM size?

Comment From: evanphx

Where are you seeing this 1.5MB?

Comment From: Malix-Labs

@daveshanley removing the Go GC from the WASM build will indeed obviously reduce its size, but I have found no information about the "1.5mb+" GC size allegation

Comment From: omar391

bump: any plan for this compiler feature

Comment From: evanphx

@omar391 No plans at present, the model of WASM-GC isn't compatible with the Go language.

Comment From: Malix-Labs

@evanphx could you provide reference links to documentation about that?

Comment From: mihaiav

Wasm is fortunately still a work in progress. If someone is brave enough, feel free to submit a proposal for interior pointers.

Comment From: stephanwesten

I just read this article: https://web.dev/case-studies/google-sheets-wasmgc

It seems that Java, Kotlin, Dart and Flutter are making progress. I am a bit worried that Go is falling behind….

Comment From: mihaiav

Make Java/JVM great again!

Comment From: glycerine

Current compiled Go code assumes that the garbage collector (GC) will not move memory around. However, the proposed WasmGC spec allows for a compacting GC that moves memory around, so this would break most Go code as presently compiled.
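
The non-moving assumption can be observed from Go itself. The sketch below is only an illustration under the current gc toolchain, where heap objects are not expected to move; `sink` and `addressStable` are made-up names. Under a moving collector, an integer copy of an address like `before` could go stale.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// sink keeps the object reachable from a global, so it lives on the heap.
var sink *int

// addressStable records an object's address as an integer, forces a
// collection, and reports whether the address is unchanged afterwards.
func addressStable() bool {
	sink = new(int)
	before := uintptr(unsafe.Pointer(sink))
	runtime.GC()
	after := uintptr(unsafe.Pointer(sink))
	return before == after
}

func main() {
	fmt.Println("address unchanged across GC:", addressStable())
}
```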

In fact, https://v8.dev/blog/trash-talk from 2019 says that the Orinoco garbage collector in the v8 javascript engine (in Chrome) is a moving GC, so this is not a theoretical issue, but a real and present one. Quoting that blog about Orinoco:

"The major GC also chooses to evacuate/compact some pages, based on a fragmentation heuristic. You can think of compaction sort of like hard-disk defragmentation on an old PC. We copy surviving objects into other pages that are not currently being compacted (using the free-list for that page). This way, we can make use of the small and scattered gaps within the memory left behind by dead objects."

Moreover, as mentioned above, I think the current large wasm binaries from Go code are mostly attributable to including fmt and other large standard library packages, not to the Go runtime's GC code.

In addition, there is no interior pointer support at present in the WasmGC Chrome MVP implementation, which would mean that a whole new memory layout strategy would be needed for arrays that contain structs, and structs that contain structs whose addresses are taken.

Current Go code that pins memory using runtime.Pinner is likely to never work on WasmGC, as pinning is not even on the post-MVP feature list for WasmGC ( https://github.com/WebAssembly/gc/blob/main/proposals/gc/Post-MVP.md ).
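
For context, this is roughly how runtime.Pinner (available since Go 1.21) is used today. The `buffer` type and `withPinned` helper are hypothetical names for illustration; only `runtime.Pinner`, `Pin` and `Unpin` come from the standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// buffer is a stand-in for memory that non-Go code would want a stable
// address for.
type buffer struct{ data [64]byte }

// withPinned pins b for the duration of fn, then releases the pin.
func withPinned(b *buffer, fn func(*buffer)) {
	var p runtime.Pinner
	p.Pin(b)        // the collector must not move or free *b while pinned
	defer p.Unpin() // release every pin held by p
	fn(b)
}

func main() {
	b := new(buffer)
	withPinned(b, func(x *buffer) { x.data[0] = 7 })
	fmt.Println(b.data[0])
}
```

A runtime with no pinning primitive has nothing to map `Pin` onto, which is the concern raised above.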

For these reasons, it may be best to focus on using/improving the current Wasm support in the Go toolchain rather than holding your breath for WasmGC support. It looks like a ton of work, and it likely won't shrink binaries by much anyway.

Comment From: Splizard

I don't really understand the concerns about interior pointers. Wouldn't you just use a fat pointer (address + offset) to represent these?

Comment From: ianlancetaylor

@Splizard It's not feasible to use a different type for pointers to the start of an object and interior pointers, so that approach would require that all pointers be fat pointers. That is probably doable in principle, but would be quite a lot of work in practice.
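
A rough model of the fat-pointer idea, written in ordinary Go rather than in anything a WasmGC backend would actually emit. `fatPtr` and `addrOf` are invented names, and a slice stands in for a reference to a whole GC-managed object.

```go
package main

import "fmt"

// fatPtr models a pointer as (reference to the whole object, offset inside it).
type fatPtr[T any] struct {
	base []T // stand-in for a reference to the start of a GC-managed object
	off  int // element offset within that object
}

// addrOf models taking the address of base[i].
func addrOf[T any](base []T, i int) fatPtr[T] {
	return fatPtr[T]{base: base, off: i}
}

// load and store model dereferencing the pointer.
func (p fatPtr[T]) load() T   { return p.base[p.off] }
func (p fatPtr[T]) store(v T) { p.base[p.off] = v }

func main() {
	obj := []int{10, 20, 30}

	start := addrOf(obj, 0)    // pointer to the start of the object
	interior := addrOf(obj, 2) // interior pointer, same representation

	interior.store(start.load() + 5)
	fmt.Println(obj)
}
```

Both `start` and `interior` have the same two-word type, which is the point made above: every pointer pays for the offset, not just the interior ones.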

Comment From: glycerine

I was curious about where the space in the hello world .wasm file does get used. A basic hello world from the 1.22 Go toolchain is about 2MB on my darwin laptop.

So I did the following admittedly very crude analysis. My rough conclusion is that runtime takes up about 75% of the function space, and of that the garbage collection is 10%. So, again very roughly, not including garbage collection routines would save about 7.5% of function space in the .wasm binary. (edit: since function bytes make up only 62% of the file, the real savings would be less than 5% of the binary.)

I'd welcome much more rigorous means of doing this analysis. Obviously I've used some quick and dirty approximations, simply because I don't know what tools are available to do any better. If you'd like to improve on it, please do, so I know how. Here is how I did my crude analysis.

$ go version
go version go1.22.4 darwin/amd64
$ cat main.go
package main

import "fmt"

func main() {
    fmt.Println("Hello wasip1.")
}

$ GOOS=wasip1 GOARCH=wasm go build -o main.wasm main.go

$ ls -al main.wasm 
-rwxr-xr-x  1 me  staff  2112611 Oct 21 11:07 main.wasm
$ ls -alh main.wasm 
-rwxr-xr-x  1 me  staff   2.0M Oct 21 11:07 main.wasm

$ # (install github.com/WebAssembly/wabt for analysis)

$ file ./main.wasm
./main.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)

$ wasmtime ./main.wasm
Hello wasip1.
$ wasm-objdump -x main.wasm > out.objdump

$ cat analyze.py

import re
from collections import defaultdict

# Read the wasm-objdump output from a file
with open('out.objdump') as f:
    lines = f.readlines()

# Pattern to match function lines (example: 'func[0] size=120 <runtime.main>')
func_pattern = re.compile(r'func\[\d+\] size=(\d+)\s+<([^>]+)>')

# Dictionary to store cumulative size per package
package_sizes = defaultdict(int)

# Parse the lines
for line in lines:
    match = func_pattern.search(line)
    if match:
        size = int(match.group(1))
        func_name = match.group(2)

        # Extract the package from the function name (e.g., runtime.main -> runtime)
        package = func_name.split('.')[0]

        # Add the size to the corresponding package
        package_sizes[package] += size

# Print the results
for package, total_size in package_sizes.items():
    print(f"Package {package} uses {total_size} bytes")

$ python3 analyze.py |sort -nk 4
Package go_buildid uses 4 bytes
Package _rt0_wasm_wasip1 uses 21 bytes
Package wasm_pc_f_loop uses 42 bytes
Package cmpbody uses 60 bytes
Package internal_bytealg uses 69 bytes
Package memeqbody uses 69 bytes
Package memcmp uses 77 bytes
Package gcWriteBarrier uses 90 bytes
Package callRet uses 123 bytes
... (omit lots of little stuff)
Package os uses 8309 bytes
Package syscall uses 14588 bytes
Package internal_poll uses 16091 bytes
Package type_ uses 16444 bytes
Package internal_fmtsort uses 17271 bytes
Package sync uses 27732 bytes
Package strconv uses 52924 bytes
Package reflect uses 68515 bytes
Package fmt uses 87892 bytes
Package runtime uses 953200 bytes

$ cat out.objdump |grep func|grep size|grep runtime|sed 's/size=/size\ /'|awk '{sum+= $4} END {print sum}'
978345
$ cat out.objdump |grep func|grep size|grep runtime|grep gc|sed 's/size=/size\ /'|awk '{sum+= $4} END {print sum}'
97699
$ ls -al main.wasm 
-rwxr-xr-x  1 me  staff  2112611 Oct 21 11:07 main.wasm
$ python3 analyze.py |sort -nk 4 | awk '{sum+= $4} END {print sum}'
1310313  ## so the lines with "func" and "size" are accounting for 1310313/2112611 = 62% of the bytes in main.wasm
$

# So very roughly, a crude, back-of-the-envelope estimate suggests:
#
# Of the sized func sections, "runtime" is the largest, at 978345 / 1310313 = 74% of function bytes.
#
# Out of those, the runtime gc func are 97699 / 978345 = 10% of runtime, and 97699 / 1310313 = 7.5% overall.
#
# Hence we can conclude, very roughly, that not including the garbage collection
# routines would reduce the function bytes by about 7.5%, which is about 4.6% of
# the whole .wasm file (97699 / 2112611), since function bytes are 62% of it.
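
The ratios above can be rechecked with a few lines of Go. The byte counts are copied from the transcript; `pct` is a helper name introduced here.

```go
package main

import "fmt"

// Byte counts copied from the transcript above.
const (
	fileBytes    = 2112611 // size of main.wasm
	funcBytes    = 1310313 // sum of all "func ... size=" entries
	runtimeBytes = 978345  // func entries matching "runtime"
	gcBytes      = 97699   // runtime entries also matching "gc"
)

// pct returns part as a percentage of whole.
func pct(part, whole int) float64 {
	return 100 * float64(part) / float64(whole)
}

func main() {
	fmt.Printf("func bytes / file:    %.1f%%\n", pct(funcBytes, fileBytes))
	fmt.Printf("runtime / func bytes: %.1f%%\n", pct(runtimeBytes, funcBytes))
	fmt.Printf("gc / runtime:         %.1f%%\n", pct(gcBytes, runtimeBytes))
	fmt.Printf("gc / func bytes:      %.1f%%\n", pct(gcBytes, funcBytes))
	fmt.Printf("gc / file:            %.1f%%\n", pct(gcBytes, fileBytes))
}
```

The last line is the figure that matters for download size, and it comes out under 5%.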

Comment From: hajimehoshi

https://web.dev/blog/web-platform-12-2024?hl=en#webassembly_garbage_collection_and_tail_call_optimization Now Wasm GC has become baseline.

Comment From: stephanwesten

This all might get a bit unfortunate for Go. Allow me to share my thoughts, admittedly lots of speculation:

Engineers are moving from EC2 to K8s (go blog). K8s is complex and needs expensive SREs to keep it running. At the same time there is a need for inexpensive containers across clouds: a mesh covering everything from EKS/GCP to CDN providers (e.g. CF Workers), to AI clouds, to bare metal or Matter/pi at home. You can’t do this easily with K8s. So a new orchestration engine with built-in networking mesh capabilities will arise. Wasm is a good candidate for the default container, as opposed to Docker-based ones. It’s light, secure, spins up fast, etc.

I recently tried TinyGo to get the binary size down. I added one library and it exploded in size and wouldn’t work, because some (tiny?) piece of reflection was used somewhere. It also compiles far too slowly. So small wasm binaries with the default Go toolchain are pretty important; otherwise Go will remain on the big clouds, and that part will become less important.

Comment From: omar391

I can see 3 possible ways out so far:

- Update the Go compiler to avoid interior pointers specifically for the wasm target (difficulty level: equivalent to writing a new compiler).
- Add an "interior pointer" proposal to wasm itself (difficulty level: it takes time to get anything approved for wasm, which is good).
- Write an entirely new Go compiler focused on wasm (TinyGo would need the same changes as the main Go compiler, so it is not an option; difficulty level: for brave hearts only).

Comment From: glycerine

@stephanwesten @omar391 I'd like to see wasm usable as a plugin mechanism, but wasm itself does not (yet? will it ever?) even have thread support. That is fine for a plugin or a bit of browser code, but should compute clusters (k8s) sacrifice one of the major reasons for using Go in the first place (sane multicore)? This seems a nonsensical move backwards.

Alternatives? You could get a faster (works today; supports threads) secure solution by going back and using the PNaCl/NaCl support from go1.13.15.

I have no insider knowledge, but my impression is that PNaCl/NaCl was pulled solely for political rather than technical reasons. A small piece of evidence for that thesis is that security focused academics continue to deploy it:

https://github.com/Lind-Project/lind-docs / https://github.com/Lind-Project/lind_project is a security-conscious container project from Justin Cappos's Secure Systems Lab (https://ssl.engineering.nyu.edu/) at NYU. It uses NaCl and a Rust-based "micro-visor".

You could also look at gvisor (https://gvisor.dev/ ) which seems to be a "docker++" approach to securing code (run in a container + replace all the linux system calls with Go stubs instead).

Comment From: ianlancetaylor

I don't think the reasons for dropping NaCl in general were either political or technical. My impression is that it was just business. NaCl was a good idea but it didn't catch on. There is a limit to how long a project will be funded when very few people are using it.

Go dropped NaCl because the upstream project was deprecated and maintaining it in Go was painful. #30439.

Comment From: glycerine

Thanks, Ian, for your insight.

Comment From: stephanwesten

@glycerine, you are right. I was reading https://wasmcloud.com/ and the lack of multi-thread support surprised me. I thought this was already implemented, and it was, but apparently some problems got in the way, and now there is a new proposal in the works (shared-everything). But indeed things move slowly.

On the other hand, Cloudflare runs massive numbers of their workers (and they are really fast) without thread support. Perhaps it is not such an issue for them, as each v8 gets its own cpu core assigned (speculation). The problem is more on the side of the developer, who cannot use multi-threading in his/her application. However, you can call another worker, so to a certain extent you have some kind of parallelism. (No idea how hard this would be to debug in complex scenarios.)

I do not know how LLMs will play into this; for sure applications want to call them and still do something meaningful in the meantime. Today there is also pressure to use multiple lightweight models rather than one big I-can-do-everything model, because of cost and speed. So there is definitely a need for multi-threading.

As a counterargument, in the end it is all about money: if a WASI-cloud-like architecture is much easier and cheaper to operate, so that you need a team of 2 SREs instead of 6, who cares about a bunch of ugly workarounds…

Comment From: glycerine

Interesting: in https://github.com/tinygo-org/tinygo/pull/4385#issuecomment-2730454394 Varun Pail writes:

It's just that for wasm specifically, the gc compiler emits a branch for every single function call, including inside tight loops that never actually (need to) yield. This likely is the reason why the wasm size explodes out of control very fast (the size problem is way worse than other platforms) and performance is terrible due to the large branch tables (many of them in the standard library)

This is the first reasonable theory I've heard for why WASM binaries from the main Go toolchain are so much larger than those from TinyGo.
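
To make the quoted claim concrete, here is a conceptual Go model of "a branch for every function call". It is only a sketch of the idea as described in the quote, not the compiler's actual output; `frame`, `callee`, `loop` and `run` are invented names. Every call is followed by a test of an unwind flag, and the function must be able to resume from a saved position.

```go
package main

import "fmt"

// frame holds the state a function needs in order to be resumed later.
type frame struct {
	pc  int // 0 = start fresh, 1 = resume just after the call in the loop
	i   int
	sum int
}

// callee models a call that may ask its caller to unwind, for example so the
// scheduler can run something else. It returns true when the caller must
// return immediately.
func callee(i, yieldAt int) (unwind bool) {
	return i == yieldAt
}

// loop sums 0..n-1 and makes one call per iteration. The check after the
// call and the resume logic are the per-call overhead being modelled.
func loop(f *frame, n, yieldAt int) (unwind bool) {
	resumed := f.pc == 1 // models dispatching on the saved resume point
	for f.i < n {
		if !resumed && callee(f.i, yieldAt) {
			f.pc = 1 // remember where to continue
			return true
		}
		resumed = false
		f.sum += f.i
		f.i++
	}
	f.pc = 0
	return false
}

// run drives loop until it completes and counts how often it unwound.
func run(n, yieldAt int) (sum, yields int) {
	var f frame
	for loop(&f, n, yieldAt) {
		yields++
	}
	return f.sum, yields
}

func main() {
	fmt.Println(run(5, 2))
}
```

Even when `callee` never yields, the test after the call stays in the loop body, which matches the overhead the quote attributes to tight loops.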

Comment From: hajimehoshi

Wasm 3.0 Completed https://webassembly.org/news/2025-09-17-wasm-3.0/

Comment From: Malix-Labs

https://github.com/golang/go/issues/63904#issuecomment-2670831681 from @stephanwesten :

[CloudFlare Workers makes it so] each v8 get its own cpu core assigned (speculation)

This is a bit off-topic, but CF Workers use isolated service workers all stemming from a single-threaded process that hosts thousands of V8 isolates. Technically, even the concept of threading is abstracted away, and due to how V8 works, I'd be surprised if multithreading were ever on the table.

https://github.com/golang/go/issues/63904#issuecomment-3305388473 from @hajimehoshi :

Wasm 3.0

Also, quoting https://webassembly.org/news/2025-09-17-wasm-3.0/ :

  • Garbage collection. In addition to expanding the capabilities of raw linear memories, Wasm also adds support for a new (and separate) form of storage that is automatically managed by the Wasm runtime via a garbage collector. Staying true to the spirit of Wasm as a low-level language, Wasm GC is low-level as well: a compiler targeting Wasm can declare the memory layout of its runtime data structures in terms of struct and array types, plus unboxed tagged integers, whose allocation and lifetime is then handled by Wasm. But that’s it. Everything else, such as engineering suitable representations for source-language values, including implementation details like method tables, remains the responsibility of compilers targeting Wasm. There are no built-in object systems, nor closures or other higher-level constructs — which would inevitably be heavily biased towards specific languages. Instead, Wasm only provides the basic building blocks for representing such constructs and focuses purely on the memory management aspect.

Still blocking for Go

Comment From: Jorropo

« low-level » is not the wording I would use for a GC without inner-pointer support. ✨

Comment From: Jorropo

cc @glycerine we track this in #65440 🙂