Golang cmd/compile: closure func name changes when inlining

% cat x.go
package main

func f(x int) func() {
    return func() {
        panic(x)
    }
}

func main() {
    f(1)()
}
% go run -gcflags=-l x.go
panic: 1

goroutine 1 [running]:
main.f.func1()
    /private/tmp/x.go:5 +0x26
main.main()
    /private/tmp/x.go:10 +0x22
exit status 2
% go run x.go
panic: 1

goroutine 1 [running]:
main.main.func1(...)
    /private/tmp/x.go:5
main.main()
    /private/tmp/x.go:10 +0x28
exit status 2
%

Without inlining, the stack shows main.f.func1 panicking, which is approximately correct: it's the 1st closure inside main.f.

With inlining, the stack instead shows main.main.func1, because now it's the 1st closure inside main.main due to inlining of f. However, that's a much less useful name, since the closure is still lexically inside main.f. If there is some way to preserve the old name even when inlining, that would be a good idea.

This is the source of a Kubernetes test failure using Go 1.21, because a more complicated function was not being inlined in Go 1.20 but is being inlined in Go 1.21 after CL 492017. The test expected to see "newHandler" on the stack when a closure lexically inside newHandler panicked.

If there's something we can do here, we probably should, both to keep the closure names helpful and to keep tests like the Kubernetes one passing.

Comment From: thediveo

not only k8s, but for instance leaky go routine tests may fail b/c whitelisting doesn't work correctly anymore.

Comment From: thanm

Hi Austin, since you are the original author of CL 479095, assigning this bug to you. If you would prefer that I work on this particular cleanup, please assign back to me. Thanks, Than.

Comment From: aarzilli

This was noticed before: #55980

Comment From: mdempsky

If there is some way to preserve the old name even when inlining, that would be a good idea.

Is it okay if we just arrange that the runtime.Func for the inlined closure reports the original closure's link symbol name, instead of the duplicate's own name?

Comment From: rsc

I don't think I understand the suggestion, but we want the inlining-disabled name to appear in profiles, stack traces, and symbol tables. I don't think changing just runtime.Func will do that.

Comment From: rsc

For what it's worth, we have the name of the lexically enclosing function at the point of the closure creation, because if the program crashed at that PC we would show that frame in the stack trace marked [inlined].

Comment From: mdempsky

Yes, it's easy to track the original closure symbol name. That's not a problem.

What I'm trying to understand is what you want done with that.

We currently duplicate the closure and underlying function text, because they can be optimized differently due to different escape analysis and variable capture. So the duplicated closure needs its own linker symbol.

Comment From: rsc

Does it? I thought the linker addressed all the symbols by table index now, so if we made it a symbol internal to the package being compiled, it would not collide with any others with the same name.

Comment From: mdempsky

What I'm trying to understand is what you want done with that.

Comment From: rsc

I am interpreting the previous comment as my not having answered your questions, so I will elaborate a bit here.

The function names reported in profiles, stack traces, symbol tables, disassembly, and so on should not depend on how aggressively we inline. That is, the names that we get with no inlining at all are the "official" names, and inlining should preserve them.

I understand that, in terms of the example at the top, if we do:

func main() {
    f(1)()
    f(2)()
    f(3)()
}

that there will be three copies of the actual text symbol generated when f is inlined, and that the texts may actually differ across the 3 calls. But I think the new linker's ability to address symbols by table index instead of name should mean that it's completely fine to have those three text symbols all use the same name: the original name they'd have used without inlining.

Comment From: mdempsky

@cherrymui How should the compiler create multiple obj.LSyms that are kept separate but all end up with the same linkname in the symbol tables?

Comment From: rsc

As background, we originally named closures opaque names like 'func•1', but that was extremely unhelpful in all the places I mentioned where function names appear, not to mention in all tools that didn't support UTF-8. Issue #8291 was to give the closures more useful names, which Dmitri did in CL 3964. Inlining of functions containing closure creation is essentially a regression of that issue. It may seem like they just need a name, any name, but that's not true. The names carry useful information that has been lost since ~~unified IR~~ the compiler started inlining these kinds of functions. I'm not just being picky.

Comment From: mdempsky

The names carry useful information that has been lost since unified IR started inlining these kinds of functions.

Please retract that accusation. The stack trace for your test program is the same in Go 1.19 as in Go 1.20: https://go.dev/play/p/yjP9hrxZ_K0?v=goprev

Comment From: rsc

It was not an accusation, but I apologize nonetheless. My memory was that one of unified IR's important inlining advances was to enable inlining of closure-containing functions. I misremembered - we started inlining those in Go 1.17, so obviously that must have been with the original IR. In any event, as unified IR has made it possible to do more aggressive inlining, we lose more and more of this information. This is not a criticism of unified IR. My point is only to explain the context of how we got here. All the changes individually make sense, but this is a rough edge where they aren't quite working well enough together.

Comment From: mdempsky

I apologize nonetheless.

Thank you.

but this is a rough edge where they aren't quite working well enough together.

Ack. That's what I'm trying to understand: how should this work?

As I've said already, we need multiple linker symbols (i.e., obj.LSyms), because in general the inlined closure can get optimized differently than the original closure.

You suggest the multiple obj.LSyms should just have identical names in the symbol tables, etc (i.e., use the same linknames). That's not something we do anywhere else in the compiler to my knowledge, so I don't know how to invoke cmd/internal/obj to make that happen. Hence why I asked @cherrymui how to do that.

However, I remember she's on vacation, so if any other linker experts can advise on this, that would be great. @golang/compiler

Comment From: rsc

I think it would be good enough for Go 1.21 if we embed the inlining stack in the name (eliding package names), so for example if F inlines f inlines g inlines h creates a closure, the function would be named F.f.g.h.func1. As long as the same counter is used for all the funcs, it won't matter if that sequences appears again: it would get a different trailing number.

So if you had

func F() {
    f(1)()  // F inlines f inlines g inlines h creates a closure
    g(2)() // F inlines g inlines h creates a closure
    h(3)() // F inlines h creates a closure
    f(4)() // F inlines f inlines g inlines h creates a closure
}

The four text symbols created in main would be named F.f.g.h.func1, F.g.h.func2, F.h.func3, and F.f.g.h.func4. That's not ideal, since in general the package name might be important, but it should be enough breadcrumbs to help people and tools and tests. And then if we find out how to do the duplicate text symbol thing in the linker, we can adjust for Go 1.22.

Comment From: mdempsky

I think it would be good enough for Go 1.21 if we embed the inlining stack in the name (eliding package names)

Thanks, that seems doable for 1.21.

Comment From: prattmic

I agree that we should improve the names of inlined closures and the suggestion in https://github.com/golang/go/issues/60324#issuecomment-1557684028 seems like a reasonable short-term compromise.

If that is easy to do at the end of the cycle, great, but I'd push back a bit on this being a release-blocking issue that needs an immediate fix right before the freeze. We know that this is an existing issue that gets more annoying as we are more aggressive with inlining closures, but are we actually that much aggressive than before?

CL 479095 states that it increases inlinability by +0.6%. It doesn't say how much it increases inlinability of closures specifically, which is probably the more important metric. But at first glance, it looks like we got unlucky with one specific test rather than us getting sharply more aggressive and causing widespread issues.

Comment From: gopherbot

Change https://go.dev/cl/497137 mentions this issue: cmd/compile: incorporate inlined function names into closure naming

Comment From: mdempsky

Russ's suggested naming change ("main.main.func1" -> "main.main.f.func1") has been implemented and submitted for Go 1.21.

I'm happy to implement further changes for Go 1.22, pending advice from the linker devs on how to best approach that.

Comment From: thanm

Thanks Matthew.

Comment From: aclements

Do we want the symbols to have the non-unique names? There's a tension: we go to some effort to make runtime tracebacks reflect the language, not the implementation, which suggests those should show the "original" name of the closure, even if that's non-unique; but at the same time developers sometimes need to map that to an ELF/etc symbol, in which case you really want them to be unique (the PC offset traceback prints doesn't do you much good without a unique symbol name!).

We could change how we show closures and/or inlining in runtime tracebacks. Is there just a nicer way to present this information? I don't have any answers, but maybe thinking about that bigger picture will inspire someone. 🙂

Comment From: mdempsky

One possible solution would be to ensure the closure is never optimized differently when inlined, so we can just always reuse the same underlying function all the time.

With how things are currently implemented, it's certainly possible that the closure gets optimized differently. But I don't have an example off hand to demonstrate that we do.

Comment From: rsc

Thanks for the CL @mdempsky.

I've confirmed that the fix changes the stack frame in question in the failing Kubernetes test from k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.TestTimeout.func5 to being named k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.TestTimeout.newHandler.func5 which makes the test (which wanted to see 'newHandler') pass again.

I've changed this to the Go 1.22 milestone, and I agree with @austin that there is a question about whether we should stop at these not-unique names.

It was a release blocker for Go 1.21 because there was no way to test for the presence of a function on a call stack (in, for example, a test of panic behavior) that worked reliably across changes to the amount of inlining. That was breaking real tests with no obvious answer for how to rewrite them to be more robust. Now we have the guarantee that the lexically enclosing function name is preserved. Yes, it was not a new bug in Go 1.21 (as we established, it first started in Go 1.17), but Go 1.21 changes made the problem worse. I appreciate the fix.

Comment From: cherrymui

@cherrymui How should the compiler create multiple obj.LSyms that are kept separate but all end up with the same linkname in the symbol tables?

Do you want multiple symbols with the same name but could have potentially different content? Or the content is the same? (I guess this depends on whether inlined closures are always compiled the same -- my feeling is that it could be different.)

If the content is all the same, you can just mark the symbol DUPOK, and there will be just one copy in the final binary.

If you want different content, I think one possibility is to mark the symbol Static. Static symbols are package local, so the linker won't try to mix or dedup them. I'm not really sure, but I think it may work even if there are multiple such symbols in the same package (e.g. F has a closure, F is inlined into both G1 and G2, which are in the same package), because Static symbols are referenced by index, not name (at least, from the perspective of object file and the linker; compiler's pkg.Lookup may be unhappy so you may need to create such symbols manually).

Comment From: mdempsky

Do you want multiple symbols with the same name but could have potentially different content?

Potentially different content, yes.

If you want different content, I think one possibility is to mark the symbol Static.

Thanks, I'll experiment with that.

(Caveat to both of these that I think in practice the content is actually always the same, or at least very nearly always the same, and probably the better solution here is to just ensure that they are in fact always the same and then reuse a single symbol.)

Comment From: mdempsky

I'd like to suggest a revision to how we rename inlined closures. In particular, here are my thoughts:

Users care most about where the closure came from in the source, not as it was optimized. So we should prioritize the original (i.e., pre-inlined) closure name.
I don't think users particularly care about how a closure ended up cloned into a particular function (but I could be wrong).

In the implemented naming scheme's example, the original closure is name pkg.h.func1, but then it gets renamed to (e.g.) pkg.F.f.g.h.func4, which isn't intuitive for users to decode. It also makes it very difficult for tools to correlate these symbols. In particular, note that if pkg.h is inlined into pkg2.G, then it gets named pkg2.G.h.funcN without any mention of pkg.

I suggest instead that we name the inlined closures pkg.h.func1@F.1, pkg.h.func1@F.2, pkg.h.func1@F.3, and pkg.h.func1@F.4.

If we want to preserve the transitive inlining breadcrumbs, I suggest we name them like pkg.h.func1@F.f.g.1.

If the closures were inlined across package boundaries, then we'll name them pkg.h.func1@pkg2.G.1 instead, to ensure unique symbols.

Comment From: aarzilli

IIRC @ has a special meaning in symbol names.

Comment From: dr2chase

If we could find a less special special character, I think this would be a nice improvement to the status quo.

Comment From: mdempsky

@aarzilli Thanks, @cherrymui reminded me of that privately. I'm happy for an alternative character if anyone has suggestions. ("foo@bar" just seemed the instinctual suggestion for suggesting that foo was inlined at bar.)

Comment From: aarzilli

My personal opinion is that (aside from the @ issue) the inlining breadcrumbs don't carry enough value to justify the noise they introduce. I think this suggestion is the better one:

I suggest instead that we name the inlined closures pkg.h.func1@F.1, pkg.h.func1@F.2, pkg.h.func1@F.3, and pkg.h.func1@F.4.

Comment From: mdempsky

A few other suggestions I want to make, as long as we're discussing it:

I'd like to adopt the naming that x/tools/go/ssa uses instead. That is, instead of p.F.func1, p.F.func2.3, etc, the anonymous functions are p.F$1, p.F$2$3, etc. This is shorter, uniform, easier to recognize as anonymous functions at a glance (IMO), and also easier to reason about whether the resulting mangled names remain collision free. Note: A consequence of go.dev/cl/522318, is that function literals that appear in package-scope variable initialization expressions have had their name change from p.glob..func1 to p.init.func1. So these would now become p.init$1.
When a function literal is assigned to a variable (e.g., helper := func() { ... } or _, helper, _ = ..., func() { ... }, ...), then we instead name it p.F$helper. If it's necessary to disambiguate function literals given the same name this way, they'll be disambiguated p.F$helper#1, p.F$helper#2, etc. (This is inspired by x/tools/go/ssa naming explicit init functions p.init#1, p.init#2, etc, instead of p.init.1, p.init.2, etc.)
I'm leaning towards naming the inlined anonymous functions like p.F$1[X], p.F$2$3[X], p.F$helper[X], etc., where [X] exists to disambiguate the anonymous function across inlined call sites. ~Note: If anonymous function p.F$2 contains p.F$2$3 and is inlined at X, the functions will be named p.F$2[X] and p.F$2[X]$3, respectively. If p.F$2[X]$3 (or logically equivalently just p.F$2$3) is inlined at Y, it will be named simply p.F$2$3[Y]. There's no need for multiple disambiguators (e.g., p.F$2[X]$3[Y]).~ Edit: Actually, I think the inlining disambiguator can and should just always go at the end. There's no need for users to worry about both p.F$2[X]$3 and p.F$2$3[Y]. It's not much extra trouble for the compiler to ensure X and Y are disjoint sets of strings.

The exact numbers used (i.e., for p.F$1 or p.F$helper#1) or the inlining disambiguators (i.e., for p.F$1[X], including whether they're used at all) continue to be explicitly unspecified. I could be convinced though to guarantee the numbers correspond to source order (i.e., stable under optimizations like deadcode elimination).

Looking through Google's internal Go code base, there's some code that looks for the current \.func\d+ patterns, but it seems not a lot. I'd be happy to put the new naming behind a GOEXPERIMENT and shipping it disabled for a release to ease migration.

Note: gccgo already uses a completely different naming scheme for anonymous functions.

Comment From: thediveo

not exactly thrilled when closure func names change (again), because in the Gomega go routine leak checker (https://github.com/onsi/gomega/blob/55a33f3ff7b97c9f999d641bb6f92621d1c2edcf/gleak/have_leaked_matcher.go#L44) we need to maintain an ignore list of known background go routines from stdlib and testing. I think that we luckily have avoided hardcoding closure funcs using ... in the ignores. However, changes discussed here always keep me wary...

Comment From: aclements

I'm leaning towards naming the inlined anonymous functions like p.F$1[X], p.F$2$3[X], p.F$helper[X], etc., where [X] exists to disambiguate the anonymous function across inlined call sites.

@mdempsky , just to clarify, the p.F$1 part is from the non-inlined name of the closure, and the X part serves to disambiguate inlining site?

I always worry about what assumptions linkers and tools make about what characters are allowed in symbol names. :sweat_smile:

Gomega go routine leak checker

@thediveo , thanks for pointing that out. If we implement @mdempsky's ideas, it looks like the one way Gomega's matcher will break is that ... matches "a . followed by anything" and @mdempsky's proposed naming no longer includes a . after the base function/method name.

Comment From: rsc

I'm concerned about the user experience for non-expert Go programmers.

This will add churn to the overall user experience, as every change here does. Do we have evidence that the changes made for Go 1.21 aren't good enough? It seems like a change here is only worthwhile if it is obviously dramatically better or it addresses something in the status quo that is demonstrably not good enough. I don't see either one of those here.

The go/ssa names seem strictly worse than what we have today. From the point of view of someone not familiar with the ins and outs of the compiler, it seems guessable that F.func1 is the first func in F. In contrast, F$1 seems like cryptic line noise (or Perl syntax) to me.

If the func is assigned to a unique variable g, then F.g seems very clear too. I don't see any reason to introduce a new delimiter $ like in F$g. I also don't think I understand what [X] means but p.F$2$3[X] seems like a complete non-starter to me. It seems even more like cryptic line noise (or Perl syntax) than F$1.

The transitive inlining "breadcrumbs" within a package seem important to me to preserve as well. They're much more readable than an arbitrary numbering.

The choice of delimiter matters too. If p.F.g is inlined into H and from there inlined into J and we want a symbol of the form p.F.g(delim)H(delim)J, then we need a delimiter that suggests "in". @ is pretty good for that: p.F.g at the use in H at the use in J. Are we sure that C toolchains will break if we use @ here? The symbols would not be exported in any way, so I would hope a C linker would leave them alone.

If @ is off the table, then > might be good, as it suggests injection; p.F.g>H>K.

Comment From: thediveo

line noise ... then it most probably is a valid TECO command.
would exposing the names of local variables be a problem? what in case the func is returned immediately, so there's no name to associate with?

Comment From: aarzilli

One thing to keep in mind is that, I think, the [X] breaks any parser for symbol names that has been updated to work with generics, including debug/gosym.

This will add churn to the overall user experience, as every change here does. Do we have evidence that the changes made for Go 1.21 aren't good enough?

One thing is that the .funcN convention has always been ambiguous (is pkg.A.func1 a closure in function pkg.A or is it a method of type pkg.A?).

It would be nice if a delimiter other than . was used here but this is also at odds with backwards compatibility.

Another thing is that the inlining breadcrumbs aren't all that user friendly either, I think. If I didn't know about this thread pkg.A.B.func3 would make me look at pkg.A when I should really be looking at B (possibly in a different package?).

Comment From: thediveo

unless you name all your packages pkg, A, B, etc., it usually becomes quite obvious where the package in the name is, more so as it is an import path with github.com, or some other TLD.

Comment From: ianlancetaylor

Are we sure that C toolchains will break if we use @ here?

In an ELF object file, the @ character is used to separate a symbol name from the symbol version. So, yes, you're going to get odd behavior if you pass an object file with symbol names containing '@' characters to an ELF linker.

Comment From: thediveo

Do you think renaming a.F.func1 to b.G.F.func2 is "good enough" for Gomega? Do you think changing it to a.F.func1+suffix would be "obviously dramatically better"?

ad 1. should work as we would add the new pattern next to the existing one, both coexisting without interference.

ad 2. I lived halfway happily with the existing scheme but hadn't an urge for longer symbol names.

Comment From: rsc

@ianlancetaylor

In an ELF object file, the @ character is used to separate a symbol name from the symbol version. So, yes, you're going to get odd behavior if you pass an object file with symbol names containing '@' characters to an ELF linker.

I understand that, but I also noted "The symbols would not be exported in any way, so I would hope a C linker would leave them alone." That is, I think that the vast majority of our symbols are or can be generated as STB_LOCAL, and in particular any closure functions can definitely be set STB_LOCAL. If we use the @ syntax in STB_LOCAL names, will that still affect ELF linkers?

Comment From: ianlancetaylor

Technically you're right that an ELF linker shouldn't care about @ in a local symbol name. But it seems like a bad idea to count on such symbols being handled cleanly by the various ELF linkers and by the various ELF tools in general. I would only try that if it were fairly important to use @. None of the @ support is standardized or seriously documented (Sun introduced symbol versioning, and they documented it well, but the @ support is a GNU extension that lets people skip writing actual version definition files).

If you really want the @ then an option would be to only use the @ in the debug info, and use a mangled name in the ELF symbol table.

Comment From: rsc

Sounds good, thanks for the extra details.

Comment From: rsc

One thing is that the .funcN convention has always been ambiguous (is pkg.A.func1 a closure in function pkg.A or is it a method of type pkg.A?).

That's not ambiguity, because pkg.A can only be one of those two things. We don't say that pkg.A itself is ambiguous because it might be a function or it might be a type. We assume it's okay to need to know what kind of thing A is. And if we know what kind of thing A is, then we know what kind of thing A.func1 is.

Another thing is that the inlining breadcrumbs aren't all that user friendly either, I think. If I didn't know about this thread pkg.A.B.func3 would make me look at pkg.A when I should really be looking at B (possibly in a different package?).

I agree with Matthew's observation that it would be more helpful to reverse the breadcrumbs. My point was that if we have F inlined into G inlined into H as well as F inlined into J inlined into K, it is much more helpful to name them with some representation that includes (F,G,H) and (F,J,K) than it is to name them F$1 and F$2.

Comment From: aarzilli

That's not ambiguity, because pkg.A can only be one of those two things. We don't say that pkg.A itself is ambiguous because it might be a function or it might be a type. We assume it's okay to need to know what kind of thing A is. And if we know what kind of thing A is, then we know what kind of thing A.func1 is.

Ok, but for example debug/gosym doesn't know that and gets them wrong.

Comment From: rsc

@aarzilli Do you mean just debug/gosym.Sym.BaseName, or other code in debug/gosym too? What do you use BaseName for?

Comment From: aclements

If we were to make the inlining breadcrumbs unambiguously parseable from the symbol name, then in a traceback, we could print only the "original" name by default, and include the full name if and only if we're showing full PCs in the traceback (I'm thinking as additional information at the end of the line, not replacing the original name). That seems to be the exact case where the actual symbol name may matter. Perhaps we suppress the PC offset if we're only printing the original name to avoid confusion. This is a more refined version of the point I made in https://github.com/golang/go/issues/60324#issuecomment-1559843162.

Comment From: aarzilli

@aarzilli Do you mean just debug/gosym.Sym.BaseName, or other code in debug/gosym too? What do you use BaseName for?

BaseName and ReceiverName. Delve doesn't use debug/gosym, exactly, but it has its own equivalent reimplementation. It uses it to find functions with a partial name match (for example to set a breakpoint). It has never needed to work correctly with closures, so far, but it would be nice to know that it could.

Looking around with sourcegraph I can't find many users of debug/gosym, but for example GoRE could be affected by this problem as well. Hard to say if there are other half-reimplementations floating around that are affected.

Personally, as a user of go, I agree that $ is too noisy as a delimiter but , or - are also unambiguous and not too noisy. I don't see the value of the inlining breadcrumbs at all.

Comment From: aclements

It occurs to me that this case of inlining is just one example of function multicompilation/multiversioning. In this case, we're compiling multiple versions of the function because they may be optimized differently in different inlining contexts. But this is a technique we've considered for other things, like multiversioning for escape analysis and for access to CPU features that are detected at runtime.

In fact, there's another case where we do this already: GC shape stenciling. We had to disambiguate the symbols in that case, too. But in that case we decided to hide the disambiguator in tracebacks and only show the "user" name. I think that was the right choice (though we still show PCs, which might be a mistake 😅). In that case, the disambiguator was pretty meaningless to users, and I could see an argument that the inlining breadcrumbs are more meaningful, but I'm not sure I buy that these cases are so different.

Comment From: mitar

It occurs to me that this case of inlining is just one example of function multicompilation/multiversioning.

Aren't generics the same? So maybe it would be neat to include in stack traces also the concrete type for which a generic function got compiled?

Comment From: aclements

Aren't generics the same?

Sorry, that's exactly what I meant by "GC shape stenciling" (I can see how that wouldn't be obvious!)

The thing is, the symbol name doesn't include the concrete type for which a generic function got instantiated. What it includes is much more abstract and not particularly helpful to the user.

Comment From: bric3

Hi just to weigh in on this comment by @mdempsky https://github.com/golang/go/issues/60324#issuecomment-1686914585.

It also makes it very difficult for tools to correlate these symbols

I agree to the whole comment because I'm writing such tool within the IDE, and it is sometimes not possible to correctly resolve the symbol there. Stack traces in particular have the file name, but this might be transformed (trimmed, or else, e.g. Bazel does that with the default ruleset).

Having a marker to identify inlines so it's possible to handle them (maybe skip them) could help.

Comment From: aarzilli

As inlining gets better and better, closure names get worse and worse. One of the closure names generated for https://github.com/go-delve/delve/blob/master/_fixtures/rangeoverfunc.go looks like this:

main.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1.OfSliceIndex[go.shape.int,go.shape.[]int].func2.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1-range5.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1.OfSliceIndex[go.shape.int,go.shape.[]int].func2.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1-range5.OfSliceIndex[go.shape.int,go.shape.[]int].func3.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1.OfSliceIndex[go.shape.int,go.shape.[]int].func2.TestMultiCont0.TestMultiCont0.OfSliceIndex[go.shape.int,go.shape.[]int].func1.TestMultiCont0-range1-range5-range8.OfSliceIndex[go.shape.int,go.shape.[]int].func4

The full string does not show up in stacktraces, but only because the symbol name parser in Go doesn't understand it either, thinks everything between the first [ and the last ] are generic paremeters and elides it. This makes the inlining breadcrumb somewhat less hideous but also even more useless.

I've given a couple of tries to just giving them duplicate symbol names but without success.

Comment From: aclements

That symbol name is remarkably unpleasant!

For reference, it comes from this function, which has four levels of range-over-func loops. Each of these is an OfSliceIndex iterator, which is itself quite simple, other than having two type parameters.

Even ignoring all of the type parameters in the symbol name (several of which could be different, but aren't in this particular case), our current naming scheme clearly scales super-linearly here, which is rather concerning:

main.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func2.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1-range5.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func2.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1-range5.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func3.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func2.TestMultiCont0.TestMultiCont0.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func1.TestMultiCont0-range1-range5-range8.OfSliceIndex[
   go.shape.int,go.shape.[]int
].func4

Comment From: thediveo

I wonder if James Joyce would have been able to pack a novel into these symbols using carefully crafted closures......?

Comment From: gopherbot

Change https://go.dev/cl/639515 mentions this issue: cmd/compile,cmd/link: better names for closures

Comment From: cherrymui

Thanks @aarzilli for the CL! It originally appeared to be a good idea to me. But as I read and think more, I'm less sure about it, given the complexity of the CL. I think the fundamental issue is that we need a name that is user friendly, and we need a name for symbol resolution. At least for closures, I think it is still achievable to use one name for both.

Reading the comment above, it is clear that for the closure name, the source-level function where the closure is defined is important. This is the name that users want to see. The physical function that contains the closure after inlining seems not that important, and we could omit it (if we can make symbol resolution work). So we probably want to name the closures along this line.

We need some mechanism to make symbol resolution work, when the closure containing function is inlined into multiple places. Currently we rely on naming it uniquely so different inlined versions of the closure get different names. But this is where the clutter comes from. Given that we now use indices for symbol resolution in the linker, I think this doesn't really require unique names. Not all symbols are solved by indices. But for closures, I think it is 100% index-resolvable. After inlining, closures can only be directly referenced in the containing package (it can be passed to other packages as a value, but that is not a direct symbol reference), and indices are sufficient for this case. It might be an issue for external linking if the closure symbols have the same name, but I think we can solve that by marking the them static. This way, we can make the closure have the same name regardless of inlining, determined by the source-level containing function.

One place is that currently the compiler doesn't like two symbols with the same name in the same package, even if they are resolved solely by index. But that is an internal detail we can fix.

If for some reason using the same name doesn't work, or it confuses users or tools, we can try using a name that is mainly based on the source-level containing function, with a short suffix for deduplication. Say, for a closure defined in p.F in the source code, an inlined version of it may be named p.F.func1..inl.<hash>, where <hash> is a short hash of the inline stack. Or we could let the linker fill in a unique suffix at link time. We could choose to omit the suffix in DWARF and in traceback printing, to make it more user friendly.

For the next step I'm going to try making the closure symbols static and ensure they are always resolved by index. If that works, we can start changing the names, without worrying about uniqueness.

Thanks.

Comment From: gopherbot

Change https://go.dev/cl/661896 mentions this issue: cmd/compile: mark closure symbols static