Go version

go1.23.4 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/ykelani/.cache/go-build'
GOENV='/home/ykelani/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/ykelani/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/ykelani/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/ykelani/sdk/go1.23.4'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/ykelani/sdk/go1.23.4/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.4'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/ykelani/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/ykelani/TestCGoScaling/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build4071249400=/tmp/go-build -gno-record-gcc-switches'

What did you do?

package main

/*
#cgo CFLAGS: -O0
#include <unistd.h>

void a(int depth) {
    if (depth > 0) a(depth - 1);
}
*/
import "C"
import (
    "fmt"
    "runtime/debug"
    "sync"
)

func main() {
    fmt.Println("Running multiple CGo calls")

    var wg sync.WaitGroup

    for i := 0; i < 1000000; i++ {
        // Add before starting the goroutine so wg.Wait cannot return early.
        wg.Add(1)
        go func() {
            defer wg.Done()
            C.a(100000)
        }()
    }

    wg.Wait()

    debug.FreeOSMemory()

    select {}
}

What did you see happen?

After the program blocks on the select statement, RSS is 497MB across 150 OS threads. The pmap output shows many 10240KB segments with 3132KB resident each. My uninformed guess is that these are thread stacks, but it's unclear why they are still in physical memory.

```
dev-dsk-ykelani-1a-f1f9d672 % ps -o nlwp 15514
NLWP
 150
```

```
dev-dsk-ykelani-1a-f1f9d672 % pmap -x 15514
15514:   ./build/bin/cgo-scaling
Address            Kbytes     RSS   Dirty Mode  Mapping
0000000000400000     1352    1352       0 r-x-- cgo-scaling
0000000000751000        4       4       4 r---- cgo-scaling
0000000000752000       44      44      20 rw--- cgo-scaling
000000000075d000      144      72      72 rw---   [ anon ]
00000000120cf000      132       8       8 rw---   [ anon ]
000000c000000000    65536   28900   28900 rw---   [ anon ]
00007fbeb51f5000        4       0       0 -----   [ anon ]
00007fbeb51f6000    10240    3132    3132 rw---   [ anon ]
00007fbeb5bf6000        4       0       0 -----   [ anon ]
00007fbeb5bf7000    10240    3132    3132 rw---   [ anon ]
00007fbeb65f7000        4       0       0 -----   [ anon ]
00007fbeb65f8000    10240    3132    3132 rw---   [ anon ]
00007fbeb6ff8000        4       0       0 -----   [ anon ]
00007fbeb6ff9000    10240    3132    3132 rw---   [ anon ]
00007fbeb79f9000        4       0       0 -----   [ anon ]
00007fbeb79fa000    10240    3132    3132 rw---   [ anon ]
00007fbeb83fa000        4       0       0 -----   [ anon ]
00007fbeb83fb000    10240    3132    3132 rw---   [ anon ]
00007fbeb8dfb000        4       0       0 -----   [ anon ]
00007fbeb8dfc000    10240    3132    3132 rw---   [ anon ]
00007fbeb97fc000        4       0       0 -----   [ anon ]
00007fbeb97fd000    10240    3132    3132 rw---   [ anon ]
00007fbeba1fd000        4       0       0 -----   [ anon ]
00007fbeba1fe000    10240    3132    3132 rw---   [ anon ]
00007fbebabfe000        4       0       0 -----   [ anon ]
00007fbebabff000    10240    3132    3132 rw---   [ anon ]
00007fbebb5ff000        4       0       0 -----   [ anon ]
00007fbebb600000    10240    3132    3132 rw---   [ anon ]
00007fbebc000000      132       4       4 rw---   [ anon ]
00007fbebc021000    65404       0       0 -----   [ anon ]
00007fbec0000000      132       4       4 rw---   [ anon ]
00007fbec0021000    65404       0       0 -----   [ anon ]
00007fbec4000000      132       4       4 rw---   [ anon ]
00007fbec4021000    65404       0       0 -----   [ anon ]
00007fbec83fa000        4       0       0 -----   [ anon ]
...
00007fc1117e5000    10496    3296    3296 rw---   [ anon ]
00007fc112225000        4       0       0 -----   [ anon ]
00007fc112226000    10240    3132    3132 rw---   [ anon ]
00007fc112c26000        4       0       0 -----   [ anon ]
00007fc112c27000    10240    3132    3132 rw---   [ anon ]
00007fc113627000        4       0       0 -----   [ anon ]
00007fc113628000    10240       8       8 rw---   [ anon ]
00007fc114028000    33792       8       8 rw---   [ anon ]
00007fc116128000   263680       0       0 -----   [ anon ]
00007fc1262a8000        4       4       4 rw---   [ anon ]
00007fc1262a9000   524284       0       0 -----   [ anon ]
00007fc1462a8000        4       4       4 rw---   [ anon ]
00007fc1462a9000   293564       0       0 -----   [ anon ]
00007fc158158000        4       4       4 rw---   [ anon ]
00007fc158159000    36692       0       0 -----   [ anon ]
00007fc15a52e000        4       4       4 rw---   [ anon ]
00007fc15a52f000     4068       0       0 -----   [ anon ]
00007fc15a928000     1680    1184       0 r-x-- libc-2.26.so
00007fc15aacc000     2044       0       0 ----- libc-2.26.so
00007fc15accb000       16      16      16 r---- libc-2.26.so
00007fc15accf000        8       8       8 rw--- libc-2.26.so
00007fc15acd1000       16      12      12 rw---   [ anon ]
00007fc15acd5000       96      96       0 r-x-- libpthread-2.26.so
00007fc15aced000     2048       0       0 ----- libpthread-2.26.so
00007fc15aeed000        4       4       4 r---- libpthread-2.26.so
00007fc15aeee000        4       4       4 rw--- libpthread-2.26.so
00007fc15aeef000       16       4       4 rw---   [ anon ]
00007fc15aef3000      144     144       0 r-x-- ld-2.26.so
00007fc15af22000      516     332     332 rw---   [ anon ]
00007fc15afa3000      512       0       0 -----   [ anon ]
00007fc15b023000        4       4       4 rw---   [ anon ]
00007fc15b024000      508       0       0 -----   [ anon ]
00007fc15b0a3000      404      76      76 rw---   [ anon ]
00007fc15b116000        4       4       4 r---- ld-2.26.so
00007fc15b117000        4       4       4 rw--- ld-2.26.so
00007fc15b118000        4       4       4 rw---   [ anon ]
00007ffc4edb4000     3136    3136    3136 rw---   [ stack ]
00007ffc4f17d000       16       0       0 r----   [ anon ]
00007ffc4f181000        8       4       0 r-x--   [ anon ]
ffffffffff600000        4       0       0 r-x--   [ anon ]

total kB         11088092  497548  494744
```
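
For anyone reproducing this, the same two figures (RSS and thread count) can also be sampled from inside the process instead of via ps and pmap. The following is a minimal sketch, assuming Linux and the usual `VmRSS:` and `Threads:` lines in `/proc/self/status`; the `procStatus` type and `parseStatus` helper are hypothetical names introduced here, not part of the original report.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// procStatus holds the two /proc/<pid>/status fields relevant to this issue.
// (Hypothetical helper type, introduced for illustration only.)
type procStatus struct {
	VmRSSKB int // resident set size in kB
	Threads int // number of OS threads in the process
}

// parseStatus extracts the VmRSS and Threads values from the text of a
// /proc/<pid>/status file. Lines it does not recognise are ignored.
func parseStatus(text string) (procStatus, error) {
	var st procStatus
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		var dst *int
		switch fields[0] {
		case "VmRSS:":
			dst = &st.VmRSSKB
		case "Threads:":
			dst = &st.Threads
		default:
			continue
		}
		n, err := strconv.Atoi(fields[1])
		if err != nil {
			return st, fmt.Errorf("parsing %q: %w", sc.Text(), err)
		}
		*dst = n
	}
	return st, sc.Err()
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status (Linux only):", err)
		return
	}
	st, err := parseStatus(string(data))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("RSS=%d kB threads=%d\n", st.VmRSSKB, st.Threads)
}
```

Calling this just before the final `select {}` in the reproducer would give numbers comparable to the ps and pmap output above.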

What did you expect to see?

RSS to fall much lower, to roughly what the Go runtime itself needs. This is a toy example, but I'm seeing the same behaviour slowly "leak" memory in a real long-lived Go process that uses CGo.

Comment From: seankhliao

What happens if you run it with GODEBUG=madvdontneed=0?

Comment From: kelanyll

@seankhliao It uses more memory (527MB) with GODEBUG=madvdontneed=0, which I guess makes sense, as the OS should be less aggressive about reclaiming memory.

Comment From: prattmic

While goroutine stacks can grow and shrink, cgo calls do not run on the goroutine stack [1]. Instead, cgo calls run on what we call the "system stack", i.e., the pthread-allocated stack for the thread that the goroutine is currently executing on.

This pthread stack does not ever shrink (or grow), which is why you see the resident memory stick around.
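
As a rough cross-check of this explanation against the numbers in the report, the arithmetic can be sketched as below. It assumes, as an upper bound, that all 150 threads keep 3132 kB of stack resident (the pmap output shows a few that do not), and the helper names are hypothetical. Under that assumption the stacks account for about 94% of the total RSS, and each recursion level of `a` costs about 32 bytes, which is plausible for a small function compiled with `-O0`.

```go
package main

import "fmt"

// estimateStackRSSKB returns the resident memory, in kB, attributable to
// thread stacks if every thread keeps perThreadKB of its stack resident.
func estimateStackRSSKB(threads, perThreadKB int) int {
	return threads * perThreadKB
}

// bytesPerFrame returns the average number of stack bytes used per
// recursion level, given the resident stack size and the recursion depth.
func bytesPerFrame(residentKB, depth int) float64 {
	return float64(residentKB) * 1024 / float64(depth)
}

func main() {
	const (
		threads     = 150    // NLWP reported by ps
		perThreadKB = 3132   // RSS of each 10240 kB segment in pmap
		totalRSSKB  = 497548 // RSS column of the "total kB" line in pmap
		depth       = 100000 // recursion depth passed to C.a
	)
	stacks := estimateStackRSSKB(threads, perThreadKB)
	fmt.Printf("stacks: %d kB (%.0f%% of total RSS)\n",
		stacks, 100*float64(stacks)/totalRSSKB)
	fmt.Printf("bytes per frame: %.1f\n", bytesPerFrame(perThreadKB, depth))
}
```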

I suppose that we could theoretically have the GC / debug.FreeOSMemory go around and MADV_DONTNEED currently-unused pages of the system stack. They do not attempt to do so today. It would be a bit tricky to determine which pages to keep and which to drop, as we can't measure the high watermark that cgo calls reach.
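
To illustrate the primitive being discussed (not anything the runtime does to system stacks today), here is a minimal Linux-only sketch. It dirties an anonymous private mapping and then calls `madvise` with `MADV_DONTNEED`, after which the kernel may drop the pages and later reads see zero-filled memory. The function name `touchAndRelease` is hypothetical, and a real implementation for thread stacks would additionally have to know which pages are safe to drop.

```go
package main

import (
	"fmt"
	"syscall"
)

// touchAndRelease maps an anonymous private region of the given size,
// writes to every page so that it becomes resident, then advises the
// kernel that the pages are no longer needed. It returns the first byte
// as read back after the madvise call.
func touchAndRelease(size int) (byte, error) {
	mem, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return 0, err
	}
	defer syscall.Munmap(mem)

	// Dirty one byte per page so the whole region counts towards RSS.
	pageSize := syscall.Getpagesize()
	for i := 0; i < len(mem); i += pageSize {
		mem[i] = 0xFF
	}

	// Tell the kernel the contents are disposable. For private anonymous
	// memory on Linux, the next access sees zero-filled pages.
	if err := syscall.Madvise(mem, syscall.MADV_DONTNEED); err != nil {
		return 0, err
	}
	return mem[0], nil
}

func main() {
	b, err := touchAndRelease(10 << 20) // 10 MiB, like the stack segments above
	if err != nil {
		fmt.Println("madvise demo failed:", err)
		return
	}
	fmt.Printf("first byte after MADV_DONTNEED: %#x\n", b)
}
```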

This is somewhat related to #14592, as exiting threads would free their stack.
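
One way to experiment with the "exiting threads" idea today is sketched below. It relies on the documented behaviour of `runtime.LockOSThread`: if a goroutine exits while still locked to its thread, the runtime terminates that thread. Whether this recovers the stack memory in a given program is an assumption to verify by measurement, and creating a thread per call has a cost; `runOnDisposableThread` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime"
)

// runOnDisposableThread runs fn on a goroutine that locks itself to its OS
// thread and then exits without unlocking, so the runtime terminates the
// thread instead of returning it to the pool of idle threads.
func runOnDisposableThread(fn func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		runtime.LockOSThread() // intentionally never unlocked
		fn()
	}()
	<-done
}

func main() {
	result := 0
	runOnDisposableThread(func() {
		// A deep cgo call such as C.a(100000) would go here.
		result = 42
	})
	fmt.Println("result:", result)
}
```

This trades thread-creation overhead on every call for not keeping a thread with a large resident stack around, so it probably only makes sense for infrequent, stack-hungry cgo calls.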

[1] Because it might be too small, and we can't grow the stack while running C code like we can while running Go code.

cc @golang/runtime

Comment From: kelanyll

Thanks @prattmic. It does seem related to #14592 - your suggestion would significantly reduce the overhead of keeping these threads around.

Comment From: liqimore

I hit the same issue. I upgraded Go from 1.17 to 1.23. After recompiling the binary, memory (RSS) usage keeps growing. Go itself uses about 200MB, but the process takes up to 10GB. It seems that there is a leak in cgo.

Comment From: mknyszek

@liqimore Please file a new issue with additional details. A lot has changed in those 6 releases, but I also don't think cgo has changed much. Since you're upgrading past Go 1.21, I suggest trying GODEBUG=disablethp=1 if you're on Linux (and if that works, take a look at https://go.dev/doc/gc-guide#Linux_transparent_huge_pages).