What version of Go are you using (go version)?
$ go version
go version go1.20.3 linux/arm64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/tmp/go"
GOENV="/home/ubuntu/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/usr/local/lib/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/usr/local/lib/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.20"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.20/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.20.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1714904807=/tmp/go-build -gno-record-gcc-switches"
What did you do?
The net.Resolver accepts an optional Dial function, which is documented as follows:
type Resolver struct {
	// Dial optionally specifies an alternate dialer for use by
	// Go's built-in DNS resolver to make TCP and UDP connections
	// to DNS services. The host in the address parameter will
	// always be a literal IP address and not a host name, and the
	// port in the address parameter will be a literal port number
	// and not a service name.
	// If the Conn returned is also a PacketConn, sent and received DNS
	// messages must adhere to RFC 1035 section 4.2.1, "UDP usage".
	// Otherwise, DNS messages transmitted over Conn must adhere
	// to RFC 7766 section 5, "Transport Protocol Selection".
	// If nil, the default dialer is used.
	Dial func(ctx context.Context, network, address string) (Conn, error)
}
I created a script that logs Dial calls when using the pure Go resolver: https://go.dev/play/p/0O_ARZyK2eG
If I run this script locally, I see something like this:
$ ./resolve
Dial(udp, 127.0.0.53:53)
Dial(udp, 127.0.0.53:53)
{172.217.24.46 }
{2404:6800:4006:804::200e }
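The playground script itself is not reproduced in this thread. As a rough sketch of the idea only, a wrapper that logs each call before delegating to another dialer could look like the following. The names dialFunc, loggingDial and demoLog are hypothetical, and the demo uses an in-memory net.Pipe connection instead of a real DNS server so that it runs without network access.

```go
package main

import (
	"context"
	"fmt"
	"net"
)

// dialFunc has the same signature as the net.Resolver.Dial field.
type dialFunc func(ctx context.Context, network, address string) (net.Conn, error)

// loggingDial (hypothetical helper) records every call in the
// "Dial(network, address)" form and then delegates to next.
func loggingDial(next dialFunc, record func(string)) dialFunc {
	return func(ctx context.Context, network, address string) (net.Conn, error) {
		record(fmt.Sprintf("Dial(%s, %s)", network, address))
		return next(ctx, network, address)
	}
}

// demoLog drives the wrapper with a fake dialer so no network is needed.
func demoLog() []string {
	var calls []string
	fake := func(ctx context.Context, network, address string) (net.Conn, error) {
		client, server := net.Pipe()
		server.Close()
		return client, nil
	}
	dial := loggingDial(fake, func(s string) { calls = append(calls, s) })
	if conn, err := dial(context.Background(), "udp", "127.0.0.53:53"); err == nil {
		conn.Close()
	}
	return calls
}

func main() {
	for _, c := range demoLog() {
		fmt.Println(c)
	}

	// Installing the wrapper on a real pure Go resolver (no lookup is made here).
	var d net.Dialer
	r := &net.Resolver{
		PreferGo: true,
		Dial:     loggingDial(d.DialContext, func(s string) { fmt.Println(s) }),
	}
	_ = r
}
```

Calling a lookup method such as r.LookupIPAddr on a resolver configured this way would then print one line per Dial call made through the hook.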
However, if I run the script with strace, I see that Go is making additional connections some other way:
$ strace ./resolve 2>&1 | grep '^connect'
connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(9), sin_addr=inet_addr("172.217.24.46")}, 16) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(9), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2404:6800:4006:804::200e", &sin6_addr), sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
There is one hardcoded call to net.DialUDP here, which appears to be the source of the additional connections.
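To illustrate what those extra connect calls are doing, here is a minimal sketch of a source-address probe: it connects a UDP socket to the candidate address on port 9 (the port seen in the strace output) and reads back the local address. Connecting a UDP socket sends no packet. This is not the code from the net package; sourceAddr, fakeConn and demoProbe are hypothetical names, and the dial function is injected so that the example runs without a network.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/netip"
)

// sourceAddr (hypothetical) connects a UDP socket to dst on port 9 and
// reports the local address chosen for that destination. A failed dial is
// treated as "unreachable".
func sourceAddr(dial func(network, address string) (net.Conn, error), dst netip.Addr) (netip.Addr, bool) {
	c, err := dial("udp", netip.AddrPortFrom(dst, 9).String())
	if err != nil {
		return netip.Addr{}, false
	}
	defer c.Close()
	ua, ok := c.LocalAddr().(*net.UDPAddr)
	if !ok {
		return netip.Addr{}, false
	}
	return ua.AddrPort().Addr().Unmap(), true
}

// fakeConn is a test double; only LocalAddr and Close are ever called on it.
type fakeConn struct {
	net.Conn
	local net.Addr
}

func (f fakeConn) LocalAddr() net.Addr { return f.local }
func (f fakeConn) Close() error        { return nil }

// demoProbe simulates a host that has an IPv4 route but no IPv6 route.
func demoProbe() []string {
	fake := func(network, address string) (net.Conn, error) {
		if address == "[2001:db8::1]:9" {
			return nil, errors.New("network is unreachable")
		}
		return fakeConn{local: &net.UDPAddr{IP: net.IPv4(192, 0, 2, 10), Port: 40000}}, nil
	}
	var out []string
	for _, s := range []string{"198.51.100.7", "2001:db8::1"} {
		src, ok := sourceAddr(fake, netip.MustParseAddr(s))
		out = append(out, fmt.Sprintf("%s src=%v reachable=%t", s, src, ok))
	}
	return out
}

func main() {
	for _, line := range demoProbe() {
		fmt.Println(line)
	}
}
```

Passing net.Dial as the dial argument would perform real UDP connects, similar in spirit to the hardcoded call, while a custom function lets the caller decide how the probe is carried out.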
What did you expect to see?
I expect to see the Dial function used for all connections made by the pure Go resolver.
What did you see instead?
I see that the Dial function is only used in some cases.
Additional context
CL 500576 fixes the issue by using net.Resolver.Dial in all cases.
For context, this change is important for targets with limited networking capabilities (e.g. GOOS=wasip1). It means that users can provide their own Dial function to make use of the pure Go resolver. At the moment, the hardcoded net.DialUDP call puts the pure Go resolver off limits for these targets.
There was some concern in the CL about whether making this change for all targets would break code in the wild. I'm submitting it as a bug report so we can discuss here instead.
cc GOOS=wasip1 maintainers: @achille-roussel @johanbrandhorst @Pryz
cc those that commented on CL 500576: @mateusz834 @ianlancetaylor
Comment From: chriso
If I replace the hardcoded DialUDP call with r.dial("udp"), then the provided Dial function is used in all cases.
-c, err = DialUDP("udp", nil, &dst)
+c, err = r.dial(ctx, "udp", dst.IP.String())
This has the additional benefit of threading the lookup context through to the underlying dialer.
If we're concerned about breaking code in the wild, we could instead opt in by target and take this path only for GOOS=wasip1 for now (since it has limited networking capabilities, and DialUDP always fails).
This approach was suggested by @mateusz834:
if runtime.GOOS == "wasip1" {
	c, err = r.dial(ctx, "udp", dst.IP.String())
} else {
	c, err = DialUDP("udp", nil, &dst)
}
@ianlancetaylor suggested that we might instead require an additional hook:
type Resolver struct {
	Dial func(ctx context.Context, network, address string) (Conn, error)
	// Extra hook:
	DialUDP func(ctx context.Context, network, address string) (Conn, error)
}
or something like this:
type Resolver struct {
	Dial func(ctx context.Context, network, address string) (Conn, error)
	// Extra hook:
	UDPConnect func(ctx context.Context, addr *UDPAddr) (*UDPAddr, bool)
}
Comment From: gopherbot
Change https://go.dev/cl/500576 mentions this issue: net: prefer Resolver.Dial over DialUDP on wasip1
Comment From: mateusz834
The runtime.GOOS == "wasip1" guard was just a simple fix idea, but I agree with @ianlancetaylor that having per-platform behaviour in this case is not ideal.
I think that this hook should be named something like IsAddrReachable, so that the intention is clear.
It should probably also use netip.Addr at this point.
type Resolver struct {
	// IsAddrReachable is used for address sorting by the Go resolver.
	// When this field is nil, the default dialer is used: addr is considered
	// reachable when the default dialer successfully establishes a UDP connection to addr.
	IsAddrReachable func(ctx context.Context, addr netip.Addr) (local netip.Addr, reachable bool)
}
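As a sketch of how such a hook might be consulted, the example below uses a hypothetical hooks struct standing in for net.Resolver (IsAddrReachable is only a proposal here and is not a field of net.Resolver). When the hook is nil it falls back to a UDP connect with the default dialer. The reachableFirst function is a deliberately simplified stand-in for the resolver's real address sorting: it only moves reachable addresses ahead of unreachable ones.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"sort"
)

// hooks is a hypothetical stand-in for net.Resolver with the proposed field.
type hooks struct {
	IsAddrReachable func(ctx context.Context, addr netip.Addr) (local netip.Addr, reachable bool)
}

// reachable uses the hook when set; otherwise it tries a UDP connect to
// port 9 with the default dialer (no packet is sent by the connect).
func (h hooks) reachable(ctx context.Context, addr netip.Addr) (netip.Addr, bool) {
	if h.IsAddrReachable != nil {
		return h.IsAddrReachable(ctx, addr)
	}
	var d net.Dialer
	c, err := d.DialContext(ctx, "udp", netip.AddrPortFrom(addr, 9).String())
	if err != nil {
		return netip.Addr{}, false
	}
	defer c.Close()
	ua, ok := c.LocalAddr().(*net.UDPAddr)
	if !ok {
		return netip.Addr{}, false
	}
	return ua.AddrPort().Addr().Unmap(), true
}

// reachableFirst stably moves reachable addresses to the front.
// Simplified illustration only, not the resolver's actual sort.
func reachableFirst(ctx context.Context, h hooks, addrs []netip.Addr) {
	ok := make(map[netip.Addr]bool, len(addrs))
	for _, a := range addrs {
		_, ok[a] = h.reachable(ctx, a)
	}
	sort.SliceStable(addrs, func(i, j int) bool {
		return ok[addrs[i]] && !ok[addrs[j]]
	})
}

// demoSort uses a hook that treats only IPv4 destinations as reachable,
// as a target without IPv6 connectivity might.
func demoSort() []string {
	h := hooks{
		IsAddrReachable: func(ctx context.Context, addr netip.Addr) (netip.Addr, bool) {
			if addr.Is4() {
				return netip.MustParseAddr("192.0.2.10"), true
			}
			return netip.Addr{}, false
		},
	}
	addrs := []netip.Addr{
		netip.MustParseAddr("2001:db8::1"),
		netip.MustParseAddr("198.51.100.7"),
	}
	reachableFirst(context.Background(), h, addrs)
	out := make([]string, len(addrs))
	for i, a := range addrs {
		out[i] = a.String()
	}
	return out
}

func main() {
	fmt.Println(demoSort())
}
```

A hook of this shape would let a target such as wasip1 answer the reachability question itself, without the resolver opening a socket on its behalf.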
Comment From: chriso
CL 502315 improved the situation for wasip1 by addressing the panic in net.DialUDP. Since it no longer panics, an error from the hardcoded call only affects the sort order.