CLs 37011 and 37012 suggest that finishing map growth can be important. If an author knows that a particular map is done growing and will henceforth be read-only, and they can spend some cycles optimizing it, it would be useful for them to be able to ask the runtime to finish any ongoing map growth.
This is kinda sorta possible now:
for k, v := range m {
m[k] = v
}
However, this is really slow, particularly for a large map. It does way more work than is necessary, and it does it very inefficiently.
The compiler could recognize this idiom and convert it into a runtime call.
Another option is to add API surface: runtime.OptimizeMap(m interface{})
or some such, which would be documented to be a slow, expensive, blocking call that makes subsequent reads more efficient.
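For what it's worth, here is a usage sketch of that strawman (buildIndex and the maputil package are made-up names, and the OptimizeMap call is commented out because no such API exists):
package maputil

// Usage sketch only: runtime.OptimizeMap does not exist, so the call below is
// commented out; it just shows where the proposed, blocking call would go.

func buildIndex(words []string) map[string]int {
	m := make(map[string]int, len(words))
	for i, w := range words {
		m[w] = i
	}
	// The map is now read-only for the rest of the program; ask the runtime
	// to finish any in-progress growth so later lookups don't pay for it.
	// runtime.OptimizeMap(m) // proposed API, not in the standard library
	return m
}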
Another idiomatic option, if copy worked on maps :), would be copy(m, m).
Other suggestions welcomed. I don't particularly like any of these, but it'd be nice to find something.
Comment From: randall77
I'd rather implement do-grow-work-on-read. It's tricky, as you have to use atomic operations and whatnot, but I think it is possible and it doesn't expose any new API.
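To make the idea concrete, here is a toy, user-level sketch of grow-on-read. The Table type, StartGrow, and helpGrow are invented purely for illustration; the real runtime map works on buckets internally, not on a second Go map, and its synchronization is much more delicate than a mutex.
package toymap

import (
	"sync"
	"sync/atomic"
)

// Toy illustration of grow-on-read (not the runtime's map): when the table
// grows, old entries stay behind and each Get migrates a few of them, so
// reads gradually finish the grow instead of later writes paying for it.
type Table struct {
	mu      sync.Mutex
	cur     map[string]int // current backing map
	old     map[string]int // entries not yet evacuated; nil when growth is done
	growing atomic.Bool    // fast read-side check for "is there leftover work?"
}

func New() *Table {
	return &Table{cur: make(map[string]int)}
}

func (t *Table) Put(k string, v int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.cur[k] = v
}

// StartGrow simulates the start of a grow: a larger backing map is allocated
// and the old entries wait to be evacuated incrementally.
func (t *Table) StartGrow() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.old != nil {
		return // a grow is already in progress
	}
	t.old = t.cur
	t.cur = make(map[string]int, 2*len(t.old))
	t.growing.Store(true)
}

// Get's common case is a single atomic load; only while a grow is in
// progress does it take the lock and evacuate a bounded number of entries.
func (t *Table) Get(k string) (int, bool) {
	if t.growing.Load() {
		t.helpGrow(4) // best effort: migrate up to 4 old entries per read
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	if v, ok := t.cur[k]; ok {
		return v, true
	}
	if t.old != nil {
		v, ok := t.old[k]
		return v, ok
	}
	return 0, false
}

func (t *Table) helpGrow(n int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for k, v := range t.old {
		if n == 0 {
			return
		}
		if _, ok := t.cur[k]; !ok { // don't clobber a newer write
			t.cur[k] = v
		}
		delete(t.old, k)
		n--
	}
	t.old = nil
	t.growing.Store(false)
}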
Comment From: josharian
That sounds great.
Comment From: josharian
Changed from proposal to regular bug and retitled.
Comment From: cznic
I'd rather implement do-grow-work-on-read. It's tricky, as you have to use atomic operations and whatnot, ...
I falsely assumed years ago that this was the case. Since then, IIRC, people have been assured in several places that concurrent read operations on maps are safe. That may imply an expectation of no map mutations in such scenarios, which means no atomics and no locks, with the corresponding performance benefits. If that's actually the status quo, then I am slightly worried about losing this property, even though it has never been documented or guaranteed at that level of implementation detail.
Comment From: ianlancetaylor
Concurrent read operations on maps do have to work. But that doesn't mean that we can't modify the map on read. It just means that we have to do it in ways that programs cannot detect. Hence the reference to atomic operations and whatnot.
For example, we wouldn't do this because of the performance implications, but it would be perfectly fine to add a mutex to a map, acquire the mutex on every read, modify the map, and release the mutex when the read was complete. That would work for all existing programs except for the changes in performance.
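A minimal sketch of that hypothetical (not how the runtime actually works, and the names are made up): every read takes a lock, so the read path is free to rearrange the map's internals, and callers only notice the change in performance.
package sketch

import "sync"

// lockedMap: every read acquires a mutex, so the read path could safely
// mutate internal state; programs would only observe the performance cost.
type lockedMap struct {
	mu sync.Mutex
	m  map[string]int
}

func (lm *lockedMap) get(k string) (int, bool) {
	lm.mu.Lock()
	defer lm.mu.Unlock()
	// While holding the lock, it would be safe to do grow work here
	// (e.g. finish evacuating buckets) before answering the lookup.
	v, ok := lm.m[k]
	return v, ok
}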
Comment From: cznic
@ianlancetaylor I'm sorry that I was apparently not able to communicate clearly, but I'm now confused, because what you wrote is precisely what I was trying to say, with emphasis on "except for the changes in performance".
Comment From: ianlancetaylor
My apologies for misunderstanding. When you said "I am slightly worried about losing this property," I was not sure what property you are worried about losing.
Comment From: randall77
I'm pretty sure we can get the common case for grow-on-read down to a single additional atomic load (which on x86 at least is just a regular load). Because grow-on-read can be best-effort, even that atomic load may not be necessary. That might make the race detector unhappy, though.
Comment From: aclements
Issue #51410 has a good explanation and benchmark results showing how the current situation (not doing growth work on reads) can lead to confusing and significant performance cliffs.
Comment From: prattmic
With the new swissmap design (1.24), we no longer have a "growing" state whose work is spread across subsequent assignments. Instead, the map is split up into several tables, each of which grows independently, and only when a key targets that table and it hits the max load factor.
Thus, I don't think this is relevant anymore.