The issue

The following pattern is frequently used to avoid excess memory allocations by re-using a map:

func f() {
  m := make(map[string]int)

  for {
    addSomeItemsToMap(m)
    useMap(m)

    // clear the map for subsequent re-use
    clear(m)
  }
}

It turns out that the performance of clear(m) is proportional to the number of buckets in m. The number of buckets can grow significantly inside addSomeItemsToMap(), and it never shrinks afterwards, so clear(m) remains slow on every subsequent iteration, even if only a few items are added to the map from then on.

See https://philpearl.github.io/post/map_clearing_and_size/ for more details.

The solution

The Go runtime should be able to switch between two algorithms, depending on the ratio between the number of items in the map and the number of buckets: one that unconditionally clears every bucket in m, and one that clears only the buckets containing at least one item. This should improve the performance of clear(m) in the pattern above when successive iterations store widely different numbers of items in m.
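
As a rough illustration of the idea (hmap, numBuckets, occupiedBuckets and zeroBucket below are hypothetical placeholders, not the real runtime layout, and the exact ratio would need tuning):

// Hypothetical sketch only: hmap, numBuckets, occupiedBuckets and
// zeroBucket stand in for runtime internals and do not exist as such.
func runtimeClear(m *hmap) {
  if m.count < m.numBuckets/8 {
    // Sparse map: visit only the buckets known to hold items, e.g. by
    // walking the same metadata the iterator uses, so the cost tracks
    // the item count rather than the table size.
    for _, b := range occupiedBuckets(m) {
      zeroBucket(m, b)
    }
  } else {
    // Dense map: touching every bucket unconditionally is cheaper than
    // checking each one first.
    for b := 0; b < m.numBuckets; b++ {
      zeroBucket(m, b)
    }
  }
  m.count = 0
}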

Comment From: randall77

This is another case where I think map shrinking (#20135) is probably the right solution. Or at least, it would help a lot, and maybe enough. And it helps lots of other cases.

Comment From: mknyszek

Wouldn't shrinking the map partially defeat the original optimization though?

Not saying we shouldn't shrink maps; it's just that the particular example the OP gave seems like a trade-off between clear performance and map insert performance (because shrinking forces map growths on the next fill). It may also be that just shrinking maps is better overall.

Comment From: mknyszek

As an additional note, I wonder if this is any better or worse with the Swiss map implementation in Go 1.24.

Comment From: thepudds

FWIW, Keith commented on a related issue regarding clear performance for large maps for the new Swiss map implementation in https://github.com/golang/go/issues/52157#issuecomment-2508211721.

Comment From: thepudds

Wouldn't shrinking the map partially defeat the original optimization though?

One approach could be that the backing storage would not shrink immediately. For example, there is this comment from Keith in https://github.com/golang/go/issues/54454#issuecomment-1216112699, including:

We probably don't want to shrink if people do the map clear idiom + reinsert a new set of stuff. Perhaps we only start shrinking after 2N operations, so that N delete + N add involves no shrinking. There's definitely a tradeoff here between responsiveness when you use fewer entries and extra work that needs to be done to regrow.

(That is an older comment that predates the clear builtin I think, but presumably something similar could apply).
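
Very roughly, that kind of heuristic could be tracked with a simple operation counter (all of the fields below are made up for illustration; the real map header is different):

// Made-up bookkeeping to illustrate the "no shrinking before 2N
// operations" idea; none of these fields exist in the runtime.
type mapHeader struct {
  count int // live entries
  grown int // N: entry capacity at the last grow
  ops   int // inserts + deletes since the last grow or shrink
}

// maybeShrink would be called from every insert and delete.
func maybeShrink(m *mapHeader) {
  m.ops++
  // N deletes followed by N re-inserts total 2N operations, so waiting
  // for at least 2N operations means the clear-and-refill idiom never
  // triggers a shrink by itself.
  if m.ops >= 2*m.grown && m.count < m.grown/4 {
    // shrink the backing storage here (omitted)
    m.ops = 0
  }
}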

Comment From: prattmic

In https://go.dev/cl/627156, I changed swissmap iteration to use the map metadata to skip runs of empty slots, which significantly speeds up iteration over large but sparse maps (-38% vs old maps).

The same idea could be applied to clear to address this issue. clear will still take longer on large maps, but the increase should be less extreme.
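
Roughly, the same skip could look like this inside clear (groupCount, ctrlWord and zeroGroup are illustrative placeholders, not the actual swissmap code):

// Illustrative placeholders only; this is not the actual swissmap code.
func clearGroups(m *swissMap) {
  for g := 0; g < groupCount(m); g++ {
    // Each group's control word records which of its slots are in use;
    // a group with no used slots can be skipped without touching its
    // key/value storage, so the cost tracks the number of occupied
    // groups rather than the full table size.
    if ctrlWord(m, g).empty() {
      continue
    }
    zeroGroup(m, g)
  }
}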

Comment From: prattmic

We also don’t need to clear the slots at all unless they contain pointers (delete already skips that clearing).

Comment From: mvdan

I am experiencing huge slowness with clear as well: https://github.com/cue-lang/cue/issues/3981#issuecomment-3003865959

The numbers below are all with go version go1.25-devel_b5d555991a 2025-06-25 13:56:42 -0700 linux/amd64 and GOAMD64=v3.

The code in question keeps clearing and re-using a map to avoid allocating capacity for it over and over. This is typically a clear win for slices, and I assumed clearing maps was cheap enough. However, a user reported that entire minutes of CPU time were being spent in a clear call.

First, I noticed that the naive approach of creating new maps each time gives a significant speed-up, although of course it causes memory usage to spike too much:

           │ clear-swiss │            make-swiss            │
           │   sec/op    │   sec/op    vs base              │
VetCaascad    4.443 ± 1%   4.164 ± 2%  -6.27% (p=0.000 n=8)

           │ clear-swiss  │             make-swiss              │
           │     B/op     │     B/op      vs base               │
VetCaascad   3.318Gi ± 0%   4.170Gi ± 0%  +25.69% (p=0.000 n=8)

           │ clear-swiss │             make-swiss             │
           │  allocs/op  │  allocs/op   vs base               │
VetCaascad   12.85M ± 0%   14.50M ± 0%  +12.84% (p=0.000 n=8)

The more interesting data point is that building the Go program with the old maps via GOEXPERIMENT=noswissmap causes the clear approach to be really fast again, but this time, at almost no difference in allocations:

           │ clear-swiss │          clear-classic           │
           │   sec/op    │   sec/op    vs base              │
VetCaascad    4.443 ± 1%   4.169 ± 1%  -6.16% (p=0.000 n=8)

           │ clear-swiss  │           clear-classic            │
           │     B/op     │     B/op      vs base              │
VetCaascad   3.318Gi ± 0%   3.327Gi ± 0%  +0.29% (p=0.000 n=8)

           │ clear-swiss │           clear-classic           │
           │  allocs/op  │  allocs/op   vs base              │
VetCaascad   12.85M ± 0%   13.12M ± 0%  +2.12% (p=0.000 n=8)

So it seems to me that, with this program and map usage pattern, clear is orders of magnitude slower with the new swiss maps than with the classic maps. Below are steps to reproduce this, if that is helpful:

  • The repo is https://github.com/cue-lang/cue at bb2494bb0d116078583882c0f097c10d1d00f239; build with go install ./cmd/cue
  • The patch to replace clear with make is https://review.gerrithub.io/c/cue-lang/cue/+/1217465
  • To run the code, clone https://github.com/mvdan/caascad-unity-tests and run cue vet -c=false . in the cloned directory.
  • To obtain the benchmark numbers above, I used CUE_BENCH=VetCaascad cue vet -c=false . in a shell loop; it produces benchstat-compatible output.

Note that this repository only shows a 6% difference in cpu/wall time, or about 300ms, but another user in that thread, Joel, is seeing over seven minutes being spent in the same clear call. Unfortunately his CUE repository is private so I cannot share such a benchmark right away. However, I suspect it's a matter of his visited map growing much larger than mine.

Comment From: mvdan

I'd also be interested to hear if you have any particular suggestions we could try in the near term. Even if this bug is fixed for e.g. Go 1.26, I'd rather not leave users hanging with bad performance for months.

One approach is to make new maps, but as shown above, the memory usage increase is too much. Joel reports his peak memory usage jumps to 32GiB.

Another approach is to revert our release builds to GOEXPERIMENT=noswissmap, and recommend that users building from source do the same, but that's rather clunky and unfortunate.

I haven't tried a hybrid approach that keeps and reuses maps only as long as they haven't grown too large. I'm not even sure that kind of logic is possible, since cap does not work on maps; the closest proxy I can think of is checking len before clearing, roughly as sketched below.
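
Something along these lines is what I have in mind, reusing the placeholders from the issue description (maxReuse is an arbitrary threshold we would have to tune):

// Untested sketch; addSomeItemsToMap and useMap are the same placeholders
// as in the issue description, and maxReuse is an arbitrary threshold.
const maxReuse = 1 << 16

func f() {
  m := make(map[string]int)
  for {
    addSomeItemsToMap(m)
    useMap(m)
    if len(m) > maxReuse {
      // The table has grown large; allocate a fresh small map instead of
      // repeatedly paying for clear on a huge, mostly-empty table.
      m = make(map[string]int)
    } else {
      clear(m)
    }
  }
}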

Comment From: prattmic

I'd also be interested to hear if you have any particular suggestions we could try in the near term.

The most straightforward suggestion I have is to iterate over the map and delete all the entries (what you would have done before clear supported maps). Per https://github.com/golang/go/issues/70617#issuecomment-2515012682, iteration already has the large map optimization that we should apply to clear as well.
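
In code, that is roughly the pre-clear idiom (sketch only; note the compiler recognizes this exact range-and-delete shape and may lower it to the same internal map-clear call, so it is worth profiling to confirm which path you end up on):

// Walk the occupied entries and delete them one by one; the iterator can
// skip runs of empty slots on large, sparse maps. Profile to confirm the
// compiler does not lower this exact loop to the same clear routine.
for k := range m {
  delete(m, k)
}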

Comment From: randall77

https://go-review.googlesource.com/c/go/+/633076 already has the code that skips zeroing groups that have no set entries. That code did not make 1.24 but it should be in 1.25. Does that help any?

Comment From: mvdan

The numbers I shared above were with Go at tip.

Comment From: randall77

Right, I guess I'm asking whether tip is any better than 1.24. If not, then maybe there is another effect going on.