What version of Go are you using (go version)?

$ go version
go version go1.19.5 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/patrik/.cache/go-build"
GOENV="/home/patrik/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/patrik/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/patrik/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.5"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/patrik/git/oathkeeper/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1543900704=/tmp/go-build -gno-record-gcc-switches"

What did you do?

func TestWriteTimeoutCancellation(t *testing.T) {
    backend := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        r.Body.Close() // to show that it is not https://github.com/golang/go/issues/23262

        time.Sleep(1 * time.Second)
        _, err := w.Write([]byte("ok"))
        ctxErr := r.Context().Err()

        _, _ = ctxErr, err // breakpoint here to see the errors, they are both nil
    }))
    backend.Config.WriteTimeout = 500 * time.Millisecond
    backend.EnableHTTP2 = true // you can also try false

    backend.Start()
    defer backend.Close()

    start := time.Now()
    res, err := backend.Client().Get(backend.URL)
    t.Logf("took %s", time.Since(start)) // for me this always takes 1s, although the write timeout is 500ms
    _, _ = res, err // breakpoint here, err is EOF, res is nil
}

What did you expect to see?

Client side: all requests are cancelled latest after the write timeout, why wait until the handler is done to not receive the result? Handler side: r.Context() is cancelled, so it can stop handling because the result is not returned anyway

I have seen https://github.com/golang/go/issues/21389 which is kinda related, but not the same issue. I do understand that w.Write() does not immediately return an error because of buffering. I would expect the request context to be canceled immediately though.

What did you see instead?

Client: request took 1s, although the write timeout is 500ms Handler: no way to tell whether the timeout was triggered, potentially expensive or destructive work continues

As is, I don't understand the use of WriteTimeout at all. It seems there is no effect, except for maybe big responses that might already be partially received by the client (see https://github.com/traefik/traefik/issues/9191#issuecomment-1197891601).

I don't think this is a duplicate of #23262 because I see the behavior also when closing the body immediately.

Workaround

It seems pretty simple to just write a middleware that wraps the request context with a timeout:

func timeoutMiddleware(timeout time.Duration, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ctx, cancel := context.WithTimeout(r.Context(), timeout)
        defer cancel()
        r = r.WithContext(ctx)
        next.ServeHTTP(w, r)
    })
}

cc @alnr

Comment From: seankhliao

note there's already TimeoutHandler https://pkg.go.dev/net/http#TimeoutHandler

Comment From: dr2chase

@neild

Comment From: zepatrik

@seankhliao thanks for pointing to the net.TimeoutHandler. The problem is the way it communicates that the timeout was reached:

After such a timeout, writes by h to its ResponseWriter will return ErrHandlerTimeout.

In my case I have a handler that basically does:

  1. validate & parse request
  2. do some database calls
  3. write result

In this case what I need is the request context to be cancelled on timeout to abort the expensive and potentially destructive database calls. During the database calls I typically don't write any response data, so I would not notice that the TimeoutHandler already "cancelled" the request.

Comment From: seankhliao

It does cancel the request context https://cs.opensource.google/go/go/+/refs/tags/go1.20.3:src/net/http/server.go;l=3379

Comment From: zepatrik

Ah thanks for pointing that out, I guess that should be added to the docs.

The general question of when and why to use WriteTimeout remains, as the request is not aborted after WriteTimeout. Somehow it is really weird that the client only experiences a connection drop after the handler returns, even if that is way beyond the WriteTimeout. I and others find that very strange:

A sleep longer than WriteTimeout before writing the response results in the client getting no data, but the client still takes the sleep-time to disconnect rather than the timeout-time, which seems very strange to me.

https://adam-p.ca/blog/2022/01/golang-http-server-timeouts/

So yes, in this particular case, the responsiveness goal of using a WriteTimeout is completely defeated, so it basically does not work.

https://github.com/traefik/traefik/issues/9191#issuecomment-1197891601

To me it looks like one has to use the net.TimeoutHandler instead of the server's WriteTimeout, because either there is a major misconception about the WriteTimeout out there or there is a bug. I really cannot see how setting WriteTimeout protects my server's resources.

Comment From: alnr

I think net.TimeoutHandler works correctly and is useful.

But really the server should also cancel the request context for other handlers (not registered through TimeoutHandler) when the WriteTimeout is hit, shouldn't it?

Comment From: neild

Perhaps it would make sense to cancel the request context when the write timeout is exceeded, since it is no longer possible for the handler to respond after that point. I'm not certain how practical this is to change at this point.

The Request.Context documentation is fairly clear on when the request context is canceled:

For incoming server requests, the context is canceled when the client's connection closes, the request is canceled (with HTTP/2), or when the ServeHTTP method returns.

This does not include when the read or write timeout on a request is exceeded.

Comment From: davidhsingyuchen

The general question of when and why to use WriteTimeout remains To me it looks like one has to use the net.TimeoutHandler instead of the server's WriteTimeout

Sharing some findings for people reading this issue trying to figure out when/how to configure WriteTimeout:

Most situations would probably require using TimeoutHandler (due to the existence of this issue), and since its current implementation buffers the response body and only writes it to the wire after the user-supplied handler returns, request handling is naturally divided into 2 stages (without writing custom buffering code):

  1. Figure out what needs to be written to the response body and buffer it.
  2. Write the response body to the wire (i.e., what WriteTimeout affects).

As a result, One may configure TimeoutHandler according to the estimate of 1. and WriteTimeout according to that of 1. + 2., which could be useful when more granular control of timeouts is wanted (i.e., compared to probably setting both to be 1. + 2.).

As a side note, the "native" http.ResponseWriter also has a buffer, but it seems to flush every 4096 bytes, which blurs the line between 1. and 2..

Last, the implementation of TimeoutHandler can change, so the analysis above may be invalided in the future.

More details and links to related code: https://stackoverflow.com/a/79269487/6355435

Comment From: mainde

+1 to this issue because I just burnt two hours trying to understand what was going wrong in my code 😢

I initially misunderstood what WriteTimeout meant, I assumed it was the maximum time allowed for any call to write to the ResponseWriter, in order to protect against malicious/slow/disconnected clients.

Even after understanding that it is fundamentally the total time for the request full lifecycle, I did not realise that my code was exceeding this timeout. The client was remaining connected long past WriteTimeout seconds, (twice as much in fact, I suspect Chrome was invisibly retrying), my call to WriteHeader on http.ResponseWriter was silently having no effect, and the request context was not being cancelled. Even tracing through WriteHeader gave me no evidence that anything was wrong unfortunately.

A pragmatic solution that I would suggest is a bigger warning in the documentation, clarifying that setting this value won't cause the request context to be cancelled, and an explicit nudge towards http.TimeoutHandler