Go version

go1.25rc2

Output of go env in your module/workspace:

irrelevant to this bug

What did you do?

go build the following program:

package main

import "math"

const lz4Hashlog = 12

type lz4Ctx struct {
        source    []byte
        hashTable [1 << lz4Hashlog]uint32
        freqs     [256]uint32
        nrBytes   uint
}

func (ctx *lz4Ctx) calcEntropy() (e float64) {
        if ctx.nrBytes == 0 {
                return 0
        }
        nrBytes := float64(ctx.nrBytes)
        for _, f := range ctx.freqs {
                p := float64(f) / nrBytes
                if p != 0 {
                        e -= p * math.Log2(p)
                }
        }
        return e
}

func main() {
        var c lz4Ctx
        println(c.calcEntropy())
}

What did you see happen?

On an amd64 host, objdump -d of the executable shows this:

000000000046fcc0 <main.(*lz4Ctx).calcEntropy>:
  46fcc0:       4c 8d a4 24 50 fc ff    lea    -0x3b0(%rsp),%r12
  46fcc7:       ff
  46fcc8:       4d 3b 66 10             cmp    0x10(%r14),%r12
  46fccc:       0f 86 08 01 00 00       jbe    46fdda <main.(*lz4Ctx).calcEntropy+0x11a>
  46fcd2:       55                      push   %rbp
  46fcd3:       48 89 e5                mov    %rsp,%rbp
  46fcd6:       48 81 ec 28 04 00 00    sub    $0x428,%rsp    <--- allocate 40 bytes + 1K for ctx.freqs on the stack
...
  46fd2e:       e8 0d cf ff ff          call   46cc40 <runtime.duffcopy>    <--- make a copy of ctx.freqs

When building for arm64, the copy is generated, too.

What did you expect to see?

No copies of ctx.freqs, like in this code:

func (ctx *lz4Ctx) calcEntropy() (e float64) {
        ...
        freqs := ctx.freqs[:]
        nrBytes := float64(ctx.nrBytes)
        for _, f := range freqs {
                ....

objdump -d of this version:

000000000046fcc0 <main.(*lz4Ctx).calcEntropy>:
  46fcc0:       49 3b 66 10             cmp    0x10(%r14),%rsp
  46fcc4:       0f 86 cc 00 00 00       jbe    46fd96 <main.(*lz4Ctx).calcEntropy+0xd6>
  46fcca:       55                      push   %rbp
  46fccb:       48 89 e5                mov    %rsp,%rbp
  46fcce:       48 83 ec 28             sub    $0x28,%rsp  <--- only allocate stack for calcEntropy's variables
  ...

Comment From: gabyhelp

Related Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

Comment From: artem-anisimov-0x7f

Well, ok, this is #33838.

Comment From: TapirLiu

Just do for _, f := range &ctx.freqs or for _, f := range ctx.freqs[:]. Try not to depend on compilers.

Comment From: TapirLiu

IMHO, the code should not be optimized at all. It is best to stick to the semantics to prevent newbies from writing bad code. ;)

Comment From: randall77

The semantics require a copy in this case.

        for _, f := range ctx.freqs {
                p := float64(f) / nrBytes
                if p != 0 {
                        e -= p * math.Log2(p)
                }
        }

What if math.Log2 modifies ctx.freqs? You and I know it doesn't, but the compiler is not so smart.