@jakebailey's GopherCon talk mentioned the fact that string += string
in TypeScript is a constant-time operation; this is not the case in Go, and it is a mistake to build a string by invoking this operation in a loop, as it allocates a quadratic amount of memory.
We should define a modernizer that replaces such loops by uses of strings.Builder, which is efficient. For example:
var s = "initial"
for ... {
s += expr
}
use(s)
could be replaced by:
var s strings.Builder
s.WriteString("initial")
for ... {
s.WriteString(expr)
}
use(s.String())
The general approach should be to search for a += statement within a loop, whose LHS s is a local string variable; find s's types.Var
object; then use typeindex.{Def,Uses}
to iterate over the other uses of s to see if they conform to the pattern.
The safety conditions are:
- s must be a local variable. It may not be a global variable or some other expression such as expr.field that may have aliases.
- s must not be a parameter, since this would change the function's type. (One could introduce a new variable for the strings.Builder rather than turning s into a Builder, but I don't think it's worth it.)
- All but one uses of s must be of the form s += expr
, and must be within a loop.
- Exactly one (rvalue) use of s must be after the loop.
- The final use of s must not itself be within a loop. This subtle condition is needed because the transformation moves the expensive allocation step from the += operation to the use(s) operation. We must not degrade performance if the use(s) operation is executed much more often, as in this example when m is much larger than n:
for range n { s += expr }
for range m { use(s) }
Comment From: mvdan
Would the same apply to a long sequence of s += somestring
lines without a loop? I imagine it should. Note that other code may be interleaved, like:
var s string
s += "foo"
x, err := bar()
if err != nil {...}
s += x
s += "baz"
Comment From: adonovan
Would the same apply to a long sequence of
s += somestring
lines without a loop?
I don't think that case matters because the n in the O(n^2) term is both fixed at compile time and is typically small (since it's the number of lines of code). The problem only manifests when n reaches say 10,000.
Comment From: adonovan
That said, it would be worth allowing s += expr
before the loop as well. So let's amend it to:
- All but one uses of s must be of the form s += expr.
- At least one of those uses must be within a loop.
- The textually last use of s must be as an r-value, and there must be no intervening loops in the syntax tree ancestors between this use of s and the block in which s was declared.
Having now stated it this way, the second constraint could be trivially dropped as @mvdan suggested.
Comment From: gopherbot
Change https://go.dev/cl/700016 mentions this issue: gopls/internal/analysis/modernize: string+=string => strings.Builder