Go version

go version go1.22.4 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/ben/.cache/go-build'
GOENV='/home/ben/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/ben/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/ben/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/ben/sdk/go1.22.4'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/ben/sdk/go1.22.4/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/ben/w/pebble/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2146132670=/tmp/go-build -gno-record-gcc-switches'

What did you do?

When I call user.Lookup on a non-existent user (e.g. "baduser"), I expect it to return user.UnknownUserError, but it instead returns a generic non-nil error. The following test program (Go playground link) reproduces this easily on my system:

package main

import (
    "fmt"
    "os"
    "os/user"
)

func main() {
    usr, err := user.Lookup(os.Args[1])
    if _, ok := err.(user.UnknownUserError); ok {
        fmt.Println("unknown user")
        return
    }
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Printf("user: %#v\n", usr)
}

What did you see happen?

When I run the test program with CGO_ENABLED=1 (the default on my system), it instead returns a generic errors.errorString (with the message "user: lookup username baduser: no such file or directory"):

# Default behaviour (not expected)
$ CGO_ENABLED=1 go run t.go baduser
error: user: lookup username baduser: no such file or directory

# With cgo disabled I get the expected behaviour
$ CGO_ENABLED=0 go run t.go baduser
unknown user

What did you expect to see?

The user.Lookup call should return an UnknownUserError error value, whether or not CGO_ENABLED is set.

When I follow it through in my debugger, it's using the lookupUser from cgo_lookup_unix.go, which is calling _C_getpwnam_r, and that's returning a syscall.Errno error with ENOENT. The getpwnam_r docs seem to indicate this function can return ENOENT under such circumstances, so maybe the code should check for that, rather than just the f != 0 test?

I'm not sure why this has changed on my system when it was working before. My colleague, also on Ubuntu 24.04 with the same kernel (Linux 6.8.0-35-generic x86_64), doesn't seem to have this problem. My first thought was a different libc version, but we are both running the same one:

$ dpkg -s libc6 | grep Version
Version: 2.39-0ubuntu8.2

Note that I recently upgraded my OS by doing a fresh install of Ubuntu 24.04. When I was on 22.04 I didn't have this problem (though as I noted, my colleague is also on 24.04 and he doesn't have this issue -- it works both with cgo enabled and disabled).

For reference, with an existing user both cgo and non-cgo work fine:

$ CGO_ENABLED=1 go run t.go ben
user: &user.User{Uid:"1000", Gid:"1000", Username:"ben", Name:"Ben", HomeDir:"/home/ben"}
$ CGO_ENABLED=0 go run t.go ben
user: &user.User{Uid:"1000", Gid:"1000", Username:"ben", Name:"Ben", HomeDir:"/home/ben"}

Comment From: gabyhelp

Similar Issues

  • https://github.com/golang/go/issues/24383
  • https://github.com/golang/go/issues/40334
  • https://github.com/golang/go/issues/25973
  • https://github.com/golang/go/issues/24083
  • https://github.com/golang/go/issues/43543

Comment From: benhoyt

Note that I looked at all of those "similar issues", but I think this is different from all of them. The closest is https://github.com/golang/go/issues/25973, but it's a bit hard to tell, and that was closed quickly by the creator with "I was doing something wrong" and no further info.

Comment From: dimaqq

Prior art outside golang:

https://github.com/crystal-lang/crystal/issues/8069

The link to Ruby implementation from the above:

https://github.com/ruby/ruby/blob/81562f943e4f33fbfd00fdfd115890ba0b76916c/process.c#L5934-L5950

Comment From: benhoyt

@kortschak helped me work through this on Gophers Slack. getpwnam_r definitely returns ENOENT (errno 2) under some circumstances. Via debugging, I can see that's what it's returning on my machine in the above example. It's documented this way:

On success, getpwnam_r() and getpwuid_r() return zero, and set *result to pwd. If no matching password record was found, these functions return 0 and store NULL in *result. In case of error, an error number is returned, and NULL is stored in *result.

So we should be checking the return value (error number) against ENOENT, probably in lookupUser.

I still don't know what's different between my machine and my colleague's. Maybe a different nsswitch.conf? I'm out of my depth here. Suffice it to say that mine is a very fresh install of Ubuntu 24.04, and I haven't mucked with those settings as far as I know.

For reference, here's a little C program that Dan wrote -- it reproduces this issue on my machine.

$ cat t.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <pwd.h>
#include <errno.h>

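int main(void) {
    struct passwd pwd;
    struct passwd *result;
    size_t buflen = 4096;
    char *buf = (char*)malloc(buflen);
    /* Returns 0 on success or "not found", otherwise an error number. */
    int ret = getpwnam_r("baduser", &pwd, buf, buflen, &result);
    /* pwd is not filled in when the lookup fails, so pw_name and pw_uid are undefined here. */
    printf("%d %d %s %d\n", ret, errno, pwd.pw_name, pwd.pw_uid);
    free(buf);
    return 0;
}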
$ gcc t.c
$ ./a.out
2 2 pulse 122
$ # note that "pulse" is undefined bytes (it's the last username in my /etc/passwd)

I'll try to put up a Go change to fix this.

Comment From: benhoyt

Thanks @dimaqq, that Crystal issue looks like exactly the same problem.

Comment From: gopherbot

Change https://go.dev/cl/591555 mentions this issue: os/user: make Lookup* functions properly handle ENOENT

Comment From: benhoyt

A colleague and I got to the bottom of why/when this is happening. On my machine, for whatever reason (on a relatively fresh Ubuntu 24.04 install), I have the following line in my nsswitch.conf:

$ grep passwd /etc/nsswitch.conf 
passwd:         files systemd sss

It's the sss that's doing it. I have libsss installed, though sssd is not running. To reproduce this: 1) add sss to the passwd: line as shown above, and 2) install libnss-sss with sudo apt install libnss-sss if it's missing. My colleague did that and could reproduce the ENOENT being returned; similarly, my system no longer returns ENOENT if I remove the sss (it then returns no error and not-found).