Golang cmd/compile: devirtualize singly-typed interface return values

A common convention among Go programmers is to avoid returning interfaces, because doing so always forces a pointer to escape, and thus triggers a heap allocation if the pointer didn't come from an argument. In fact, there's a linter to stop people from hitting themselves with this: https://github.com/butuzov/ireturn.

It also means that if the function is not inlined, Go will not devirtualize interface calls. For example:

package x

type i interface { f() }

type k int

func (*k) f() {}

//go:noinline
func x() i {
    return new(k)
}

func y() {
    x().f()
}

Go emits the following:

        TEXT    .y(SB), ABIInternal, $16-0
        CMPQ    SP, 16(R14)
        PCDATA  $0, $-2
        JLS     ...
        PUSHQ   BP
        MOVQ    SP, BP
        SUBQ    $8, SP
        CALL    .x(SB)
        MOVQ    24(AX), AX
        MOVQ    AX, CX
        MOVQ    BX, AX
        CALL    CX
        ADDQ    $8, SP
        POPQ    BP
        RET

However, by inspection, we can tell that CALL CX will always call (*k).f. Go simply does not pipe the information needed to devirtualize this call.

There is a relatively simple optimization opportunity here. There are two cases of interest:

1. A function that returns an interface, where, within that function, it is possible to devirtualize every return statement's argument for that return value into the same concrete type.
2. The above, but nil is also possible.

We could rewrite the above example as follows:

func x-devirt() *k {
    return new(k)
}

func x() i { // Kept only for converting to a function pointer.
    return x-devirt()
}

func y() {
    x-devirt().f()
}
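
A hand-written approximation of this rewrite might look like the sketch below. Since x-devirt is not a legal identifier in user code (it stands for a compiler-generated symbol), the sketch assumes the hypothetical name xDevirt, and it gives f a return value purely so that the result can be observed.

```go
package main

import "fmt"

type i interface{ f() int }

type k int

// f returns a value only so the sketch has something observable;
// the f in the original example has an empty body.
func (p *k) f() int { return int(*p) + 1 }

// xDevirt is a hypothetical stand-in for the compiler-generated x-devirt:
// same body as x, but with the concrete return type exposed.
//
//go:noinline
func xDevirt() *k {
	return new(k)
}

// x keeps the original signature for callers that need a func() i value.
func x() i {
	return xDevirt()
}

// y calls through the concrete type, so (*k).f can be called statically.
func y() int {
	return xDevirt().f()
}

func main() {
	var viaFuncValue func() i = x // taking x as a function value still works
	fmt.Println(y(), viaFuncValue().f())
}
```

Both call paths produce the same result; only y avoids the dynamic dispatch through the itab.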

This is essentially the optimization realized by Rust's -> impl Trait syntax, which requires that the function return precisely one concrete type, as if the return value was not a trait, but callers only see the trait. The version I suggest doesn't require changing language semantics, but probably opens up the usual optimization opportunities you get out of devirtualization.

Ideally the devirtualized return type would be advertised to callers directly to aid in devirtualization of their own return values, e.g. scribbled somewhere in ir.Func.

This can be easily extended to functions that return either a concrete value or nil by returning a bool for indicating whether nil was returned or not. Given that this is a relatively common case, it feels worthwhile to try to address it, too.
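
As a rough sketch of the nil case, assuming hypothetical names xDevirt and xWrapper and a bool parameter added only to exercise both paths: the extra bool result is what lets the wrapper distinguish a nil interface from an interface holding a nil *k, which are different values in Go.

```go
package main

import "fmt"

type i interface{ f() int }

type k int

func (p *k) f() int { return int(*p) }

// x returns either a *k or a nil interface, never any other type.
func x(ok bool) i {
	if !ok {
		return nil
	}
	v := k(7)
	return &v
}

// xDevirt is a hypothetical devirtualized form of x. The bool reports
// whether the interface result was non-nil.
func xDevirt(ok bool) (*k, bool) {
	if !ok {
		return nil, false
	}
	v := k(7)
	return &v, true
}

// xWrapper rebuilds the interface value from the devirtualized result.
func xWrapper(ok bool) i {
	p, nonNil := xDevirt(ok)
	if !nonNil {
		return nil
	}
	return p
}

func main() {
	fmt.Println(xWrapper(false) == nil, xWrapper(true).f())

	// A nil *k stored in an interface is not a nil interface,
	// which is why the pointer alone cannot encode "nil was returned".
	var p *k
	var typedNil i = p
	fmt.Println(typedNil == nil)
}
```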

Of course, once the optimization actually works it may be worthwhile to investigate places in the standard library where multiple unexported types are returned that could return just one, and take advantage of this optimization. For example, fmt.Errorf returns three different concrete types, but could really get away with one.

Note that this is necessary but not sufficient to eliminate the allocation penalty of x above. A separate optimization would also be necessary: one that changes the ABI for returning a value by pointer so that the caller passes in its own memory, possibly on the stack for a non-escaping return value. This probably also requires making escape analysis track more information. (The C++ ABI on x86_64-unknown-linux does something like this for returning large types by value; the caller allocates space for them and passes a hidden pointer to it, rather than the callee returning them in registers.)
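
Written by hand, the caller-provided-memory idea might look like the following sketch. The name xInto, its explicit buffer parameter, and the stored value 42 are all assumptions for illustration; the proposal is for the compiler to pass such a pointer implicitly.

```go
package main

import "fmt"

type k int

func (p *k) f() int { return int(*p) }

// xInto is a hypothetical form of x in which the caller supplies the storage.
// The pointer flows only to the result, so escape analysis has a chance to
// keep the caller's buffer on the stack.
//
//go:noinline
func xInto(buf *k) *k {
	*buf = 42
	return buf
}

func y() int {
	var buf k // caller-owned storage for the returned value
	return xInto(&buf).f()
}

func main() {
	fmt.Println(y())
}
```

Building with -gcflags=-m prints the compiler's escape analysis decisions, which is one way to check whether buf is moved to the heap.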

Comment From: thepudds

Hi @mcy, FWIW, I separately have a WIP CL that I think does most of this work. I think item 3 in https://github.com/golang/go/issues/72036#issuecomment-2691658303 has a high-level description related to your example.

(The main work in #72036 overall is not directly related to your suggestion, and it would be possible to solve a single type being returned without the rest of #72036, but I think the machinery there enables a slightly more powerful version of your suggestion, one that handles more than one type being returned. I concluded that resolving your example using the machinery in #72036 likely requires a modest update to the export format to do efficiently, which I haven't tackled yet, but there's also a chance that won't be required. In any event, I plan to attempt to get #72036 across the finish line for the Go 1.26 dev cycle.)

In any event, it's not identical, and this is probably not the most useful response to make here. 😅

Comment From: mcy

@thepudds hmm... your suggested optimization doesn't address the motivation I have, which is to avoid needing to hit the stack for certain kinds of interfaces. The reason the single interface case interests me is that it becomes possible to lift e.g. interface implementations backed by slices and other non-pointer-shaped values into registers on the way out of the function.
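
To make the slice case concrete, here is a minimal sketch with hypothetical names (sizer, words, boxed, unboxed). It only contrasts the two signatures; it does not demonstrate any existing compiler optimization.

```go
package main

import "fmt"

type sizer interface{ size() int }

// words is a slice-backed (non-pointer-shaped) implementation of sizer.
type words []string

func (w words) size() int { return len(w) }

// boxed returns the slice through the interface, so the three-word slice
// header has to be stored behind the interface's data pointer.
//
//go:noinline
func boxed(w words) sizer { return w }

// unboxed is a hypothetical devirtualized form: the slice header can be
// returned directly as three words.
//
//go:noinline
func unboxed(w words) words { return w }

func main() {
	w := words{"a", "b", "c"}
	fmt.Println(boxed(w).size(), unboxed(w).size())
}
```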

However the latter part of my post seems closely related: when a function wants to return something like &unexported{}, and it does not otherwise escape, the caller should be on the hook for providing the memory to return it in, passed via a hidden pointer argument. These optimizations together would eliminate a large class of unnecessary heap escapes.

I filed an issue a while ago (https://github.com/golang/go/issues/73589) which suggests taking this further by promoting values whose addresses do not escape to registers. The dream would be to lift a function that returns a slice wrapped in an interface all the way into returning all three slice words in registers.