Based on a discussion in #51246 (in particular, https://github.com/golang/go/issues/51246#issuecomment-1050117436), this is a proposal to implement pidfd support for process methods in the os package -- StartProcess, FindProcess, Kill, Wait.

  1. Reuse the handle field of struct Process to keep pidfd. Since 0 is a valid value, use ^uintptr(0) as a default value (if pidfd is not known).

  2. Instrument ~~syscall.StartProcess~~ os.StartProcess to get pidfd from the kernel ~~and return it as a handle~~. The initial idea was to modify syscall.StartProcess as it returns a handle (which we're reusing for pidfd) but that would be backward-incompatible change and can result in e.g. leaking pidfd as callers are not expecting it.

  3. Use pidfdSendSignal from (*process).Signal, if handle is known and the syscall is available; fallback to old code using syscall.Kill if syscall is not available (e.g. disabled by seccomp).

  4. Use waitid(P_PIDFD, ...) in Wait, if pidfd is set; fall back to old code if P_PIDFD is not supported.

  5. an implementation of 1-4: ~~CL 528438~~ CL 570036

  6. Amend FindProcess to use pidfd_open(2), optionally returning ErrProcessGone, and modify the documentation accordingly (adding that it can return ErrProcessGone on Linux).

  7. an implementation of 5: ~~CL 528798~~ CL 570681

Attention doc team: items 1-5 above ^^^ are fully implemented as of Go 1.23rc1, and need to be part of go 1.23 release notes.

  1. (Optional) Add (*Process).Handle() uintptr method to return process handle on Windows and pidfd on Linux. This might be useful for low-level operations that require handle/pidfd (e.g. pidfd_getfd on Linux), or to ensure that pidfd (rather than pid) is being used for kill/wait.

Comment From: gopherbot

Change https://go.dev/cl/528436 mentions this issue: internal/syscall/unix: add PidFDSendSignal for Linux

Comment From: ianlancetaylor

I don't think there is an API change here (before optional item 7) so taking this out of the proposal process.

CC @golang/runtime

Comment From: kolyshkin

I don't think there is an API change here (before optional item 7) so taking this out of the proposal process.

@ianlancetaylor can you please elaborate? Is this about FindProcess returning ErrProcessGone, or something else?

If this is about FindProcess returning an error, as I pointed out in 4 above, this is entirely optional. We can make it succeed every time.

Comment From: ianlancetaylor

What I mean is that we only need a proposal for an API change or a significant functionality change. This issue doesn't seem to be either, so we should treat it as an ordinary feature request issue, rather than running it through the proposal process. I hope that makes sense. Thanks.

Comment From: kolyshkin

Thanks for the explanation, makes sense.I will keep working on implementation.

Comment From: gopherbot

Change https://go.dev/cl/528438 mentions this issue: syscall: make StartProcess return pidfd on Linux

Comment From: gopherbot

Change https://go.dev/cl/528798 mentions this issue: os: make use of pidfd on linux

Comment From: metux

I'm a bit unhappy w/ #4: API shouldn't behave different on individual OS. So I'd put it this way:

ErrProcessGone could happen on any OS. Just practically only happens on Linux for now. So, developers always need to be prepared for this.

Comment From: kolyshkin

@golang/runtime could you please take a look at this proposal as well as CLs implementing it? This is https://go.dev/cl/528436 and the two related ones, implementing items 1 to 4 described above.

I would appreciate any feedback.

Comment From: prattmic

Hi @kolyshkin,

Sorry for the delay. I took a look at your CLs. I'm leaving some feedback here since it is about potential API changes.

  1. https://go.dev/cl/528438 changes syscall.StartProcess to automatically set ProcAttr.Sys.PidFD in order to always return a pidfd via handle. I don't think this is a safe change. Existing callers will not realize that handle is a file descriptor which needs to be closed, and will thus leak the file descriptor on every call. I think it would be better for os.startProcess to set ProcAttr.Sys.PidFD prior to calling syscall.StartProcess. os.Process will retain ownership of the file descriptor and know when to close it.

  2. Should os.Process.Wait close the pidfd? Right now, only os.Process.Release does so.

  3. Regarding os.FindProcess returning ErrProcessGone, I think this is a nice improvement to the API, but I worry that this might break some callers. If we do this, I think we should add a new GODEBUG flag to switch back to the old behavior.

Comment From: kolyshkin

Hi @kolyshkin,

Sorry for the delay. I took a look at your CLs. I'm leaving some feedback here since it is about potential API changes.

  1. https://go.dev/cl/528438 changes syscall.StartProcess to automatically set ProcAttr.Sys.PidFD in order to always return a pidfd via handle. I don't think this is a safe change. Existing callers will not realize that handle is a file descriptor which needs to be closed, and will thus leak the file descriptor on every call. I think it would be better for os.startProcess to set ProcAttr.Sys.PidFD prior to calling syscall.StartProcess. os.Process will retain ownership of the file descriptor and know when to close it.
  2. Should os.Process.Wait close the pidfd? Right now, only os.Process.Release does so.
  3. Regarding os.FindProcess returning ErrProcessGone, I think this is a nice improvement to the API, but I worry that this might break some callers. If we do this, I think we should add a new GODEBUG flag to switch back to the old behavior.

@prattmic thanks for review comments, I updated the code based on your suggestions (except for the item 3, see below). In the process, I found a bug in my older PidFD implementation (see https://go.dev/cl/542698 for the fix). The series is now ready for (re-)review, I'd be grateful about any feedback.

As for using GODEBUG, I feel the overall change is major and might break some setups (in particular, new syscalls are used, which might be blocked on some older systems, and the runtime workarounds may be too complex (see e.g. canUsePidfdSendSignal in https://go.dev/cl/528438). Therefore I guess it might make sense to gate the whole "use pidfd from os" feature (including FindProcess change) by having something like GODEBUG=osusepidfd=0.

What do you think?

CC @golang/runtime

Comment From: prattmic

Therefore I guess it might make sense to gate the whole "use pidfd from os" feature (including FindProcess change) by having something like GODEBUG=osusepidfd=0.

I agree, this seems reasonable.

Comment From: kolyshkin

I agree, this seems reasonable.

@prattmic Implemented as discussed; PTAL https://go.dev/cl/528438

Comment From: gopherbot

Change https://go.dev/cl/566179 mentions this issue: os: FindProcess must not return nil

Comment From: gopherbot

Change https://go.dev/cl/566180 mentions this issue: os: add TestFindProcessDone for Linux

Comment From: gopherbot

Change https://go.dev/cl/566181 mentions this issue: os: fix FindError doc typo

Comment From: gopherbot

Change https://go.dev/cl/566182 mentions this issue: os: fix TestUNIXProcessAlive wrt FindProcess

Comment From: mvdan

Therefore I guess it might make sense to gate the whole "use pidfd from os" feature (including FindProcess change) by having something like GODEBUG=osusepidfd=0.

Seeing the main CL land in master, I was a bit surprised to not see any GODEBUG support included, given the discussion here. For those reading along like me, it seems like it was discarded in favor of a fallback to the old mechanism, per https://go-review.googlesource.com/c/go/+/528438/comment/12693995_7cc3bdc5/.

Comment From: gopherbot

Change https://go.dev/cl/566476 mentions this issue: Revert "os: make FindProcess use pidfd on Linux"

Comment From: gopherbot

Change https://go.dev/cl/566477 mentions this issue: Revert "os: make use of pidfd on linux"

Comment From: ianlancetaylor

@mvdan Note that CL 542699 (now rolled back) did use a GODEBUG setting for the behavior change in FindProcess.

Comment From: gopherbot

Change https://go.dev/cl/570036 mentions this issue: os: make use of pidfd on linux

Comment From: gopherbot

Change https://go.dev/cl/570681 mentions this issue: os: make FindProcess use pidfd on Linux

Comment From: kolyshkin

As of today, items 1-5 in the description are implemented (with a few bugfixes on top) and will be part of Go 1.23.

For item 6, I think I will create a separate proposal.

Comment From: mauri870

It is common to read a fd list from /proc/self/fd before/after an action and detect leaking fds by subtracting the values, eg:

func getFdCount() (count int) {
    count = -1
    if entries, err := os.ReadDir("/proc/self/fd"); err == nil {
        count = len(entries)
    }
    return
}

In 1.23 there will be an additional pidfd reported in this list so the returning count will be off by one.

Perhaps we can mention it briefly in the release notes of Go 1.23?

Another valid argument to support https://github.com/golang/go/issues/67639.

Comment From: kolyshkin

In 1.23 there will be an additional pidfd reported in this list

This will only happen if your code will start (but not wait for) a new process in between "before" and "after" (that is, on Linux, if pidfd-related syscalls work).

Apparently this is exactly what is happening with one of containerd test cases (https://github.com/containerd/containerd/issues/10345).

Comment From: gopherbot

Change https://go.dev/cl/603717 mentions this issue: _content/doc/go1.23: os: document pidfd

Comment From: kolyshkin

Attention doc team: items 1-5 above ^^^ are fully implemented as of Go 1.23rc1, and need to be part of go 1.23 release notes.

See https://go.dev/cl/603717

Comment From: kolyshkin

Not sure why this has NeedsInvestigation label and Backlog milestone. The functionality (items 1-5) is fully implemented as of Go 1.23, and the remaining item (6) deserves a separate proposal.

Comment From: kolyshkin

Not sure why this has NeedsInvestigation label and Backlog milestone. The functionality (items 1-5) is fully implemented as of Go 1.23, and the remaining item (6) deserves a separate proposal.

And this one can be closed, I guess.