After CL 633075, this program:
package main
type T struct{ a, b, c int }
type TT struct{ i, j T }
func f(x *TT) {
x.j = T{}
x.i = T{}
}
Compiles to:
MOVD ZR, 24(R0)
STP (ZR, ZR), 32(R0)
MOVD ZR, (R0)
STP (ZR, ZR), 8(R0)
It would be better to compile to 3 STP
s. Because the pairing is chosen greedily we get stuck in a local maximum that isn't optimal. Not sure what to do about it yet.
(Note that if we assign i
and j
in order, we do get 3 STP
s.)
Comment From: zigo101
Similar to duplicate arguments in fmt print function calls.
Comment From: Dinichthys
Hello
I want to fix this issue. So, I did some research and as I found, the problem is in order of Store instructions in ssa. The first assignment x.j
is translated to 3 Stores to memory with higher addresses than Stores from the second assignment x.i
. So, stores are paired not in the optimal way, because last two unpaired instructions work with addresses with long distance. It can be solved by reordering stores, but are we allowed to reorder stores in this case? If no, please, can you give an advice of a solution?
Situation:
SSA
There is ssa representation of the function:
| x |
|_____i_____|_____j_____|
|___|___|___|___|___|___|
^ ^ ^ ^ ^ ^
MOVD --> ___(1)ST ---|---|---|---| | |
| (2)ST ---|---|---|-------| |
STP -->|___(3)ST ---|---|---|-----------|
MOVD --> ___(4)ST ---| | |
| (5)ST -------| |
STP -->|___(6)ST -----------|
Graph
To merge the STs we can build the graph in the following way: - Nodes are stores (the name of every node means the index of the store instruction) - Edge means that two stores could be paired
graph TD;
1<-->2;
1<-->6;
2<-->3;
4<-->5;
5<-->6;
As we can see, store (6) is paired with store (5), but the right way is to pair it with store (1). Another pair has similar situation. That happened because the compiler tried to pair the first two stores which could be paired, but this way isn't the best according to situation.
Also, we can use this graph to merge stores, going from one of the nodes with 1 edge and merging STs in their order in the graph.
Another possible solution is reordering STs instructions according to addresses in memory which they work with. It will help the compiler to pair it optimal.
Comment From: randall77
It can be solved by reordering stores, but are we allowed to reorder stores in this case?
Reordering stores is ok. The current code is effectively reordering stores.
*p = 0
*(p+200) = 0
*(p+8) = 0
It will pair the first and third stores, by effectively pushing the first store past the second store.
We have to ensure that we maintain correct semantics, so the stores need to be to different addresses, there can't be loads in between, etc. See https://github.com/golang/go/blob/d826bf4d7440ee07f865e5852638b5a2468687ff/src/cmd/compile/internal/ssa/pair.go#L273 for details.