In the docs about deploying to the cloud, specifically the Kubernetes container lifecycle section, an example of a preStop sleep of 10 seconds is provided, to avoid traffic being routed to a pod that has begun its shutdown processing.
It also mentions in a note:
When Kubernetes sends a SIGTERM signal to the pod, it waits for a specified time called the termination grace period (the default for which is 30 seconds).
I believe that this is incorrect given the context of this example, and that the suggested setup with a sleep, given the default values of Kubernetes and Spring Boot, can have adverse effects.
The Kubernetes docs on hooks state:
This grace period applies to the total time it takes for both the PreStop hook to execute and for the Container to stop normally. If, for example, terminationGracePeriodSeconds is 60, and the hook takes 55 seconds to complete, and the Container takes 10 seconds to stop normally after receiving the signal, then the Container will be killed before it can stop normally, since terminationGracePeriodSeconds is less than the total time (55+10) it takes for these two things to happen.
Given the default terminationGracePeriodSeconds of 30 seconds and the Spring Boot default timeout-per-shutdown-phase of 30 seconds, the suggested setup would produce the following timeline:
t0: terminationGracePeriodSeconds starts counting down, the preStop hook handler is invoked, and the sleep timer begins.
t0 + 10s: SIGTERM is sent, Spring graceful shutdown begins, and the timeout-per-shutdown-phase countdown starts.
t0 + 30s: SIGKILL is sent. If the application still has in-flight requests at this point, they are cut off when the container is killed.
t0 + 40s: This is where the timeout-per-shutdown-phase countdown would end and Spring would shut down even with in-flight requests still pending, but this point is never reached because the container was killed 10 seconds earlier.
My understanding is that if we were to add a sleep like in the example, we would also want to either:
a) Increase terminationGracePeriodSeconds to at least 40s
or
b) Reduce timeout-per-shutdown-phase to 20s.
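To make the arithmetic behind the timeline and the two options concrete, here is a minimal sketch in plain Java. The class and method names (ShutdownTimeline, budgetAfterPreStop, phaseTimeoutFits) are hypothetical helpers made up for illustration; they are not part of Spring Boot or Kubernetes. It assumes a single shutdown phase and treats "exactly equal" as fitting, matching the "at least 40s" wording above.

```java
import java.time.Duration;

/** Hypothetical helper, for illustration only: models the shutdown budget described above. */
final class ShutdownTimeline {

    /** Time left for the application between SIGTERM and SIGKILL (never negative). */
    static Duration budgetAfterPreStop(Duration gracePeriod, Duration preStopSleep) {
        Duration left = gracePeriod.minus(preStopSleep);
        return left.isNegative() ? Duration.ZERO : left;
    }

    /** True if one shutdown phase can run up to its timeout before SIGKILL arrives. */
    static boolean phaseTimeoutFits(Duration gracePeriod, Duration preStopSleep, Duration phaseTimeout) {
        // Equality counts as fitting here; in practice some safety margin is sensible.
        return phaseTimeout.compareTo(budgetAfterPreStop(gracePeriod, preStopSleep)) <= 0;
    }

    public static void main(String[] args) {
        Duration sleep = Duration.ofSeconds(10);

        // Defaults with the suggested sleep: 30s grace period, 30s phase timeout.
        System.out.println(phaseTimeoutFits(Duration.ofSeconds(30), sleep, Duration.ofSeconds(30))); // false

        // Option a): raise terminationGracePeriodSeconds to 40s.
        System.out.println(phaseTimeoutFits(Duration.ofSeconds(40), sleep, Duration.ofSeconds(30))); // true

        // Option b): lower timeout-per-shutdown-phase to 20s.
        System.out.println(phaseTimeoutFits(Duration.ofSeconds(30), sleep, Duration.ofSeconds(20))); // true
    }
}
```

With the defaults, only 20 seconds remain after the sleep, so a 30-second phase timeout cannot elapse before SIGKILL; either option closes that 10-second gap.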
Comment From: vahidmah
@philwebb, I tried to simplify and clarify the process here 👇. If this seems useful, I can follow up with a PR to update the documentation.
Let's look at the shutdown flow, which is a sequence of nested timers and events, starting from the outside (Kubernetes) and moving to the inside (the Spring Boot application).
Layer 1: Kubernetes Node (Kubelet)
terminationGracePeriodSeconds (e.g. 45s): This is the master clock. It is the total time budget the kubelet gives the pod to shut down completely. When this timer expires, a SIGKILL is sent and the container is forcefully terminated, no matter what it is doing.
Layer 2: Kubernetes Pod preStop Hook
- preStop sleep (preStop: sleep: seconds: 10): When the shutdown starts, the kubelet first executes this hook, which waits for 10 seconds.
- Time remaining in master clock: 45s - 10s = 35s.
- During this sleep, the Kubernetes Service and Ingress controllers remove the pod from the load balancer's endpoint list. Your app is still running and serving any final in-flight requests, but no new requests should arrive.
Layer 3: The Application (Spring Boot)
- After the preStop hook's sleep finishes, the kubelet sends the SIGTERM signal to your Spring Boot application.
- Spring Boot catches SIGTERM and begins its own graceful shutdown procedure. It starts shutting down its SmartLifecycle beans, phase by phase (from highest to lowest).
- spring.lifecycle.timeout-per-shutdown-phase (e.g., 30s): This timer now governs the internal shutdown.
- Let's say you have 3 shutdown phases. In the worst case, your application's internal shutdown could take up to 3 * 30s = 90s.
The Critical Calculation
The configuration is only safe if the total time the application may need does not exceed the time remaining on the master clock:
Total Application Shutdown Time <= Time Remaining in Master Clock
(Sum of all timeout-per-shutdown-phase durations) <= (terminationGracePeriodSeconds - preStop sleep duration)
Using our example values:
- terminationGracePeriodSeconds: 45s
- preStop sleep: 10s
- timeout-per-shutdown-phase: 30s
- Number of phases: 3 (hypothetically)
- Time remaining for app shutdown: 45s - 10s = 35s.
- Maximum time the app might take: 3 phases * 30s/phase = 90s.
Conclusion: This configuration is unsafe. The 90s potentially needed by Spring Boot is far greater than the 35s allowed by Kubernetes. If shutdown takes longer than 35s, the application will be killed forcefully by SIGKILL before it can finish shutting down gracefully.
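Turning the calculation around, the sketch below estimates the smallest terminationGracePeriodSeconds that would cover the worst case for the example values. GracePeriodEstimate and its methods are hypothetical names used only for illustration, and the model assumes every phase runs all the way to its timeout, which is the pessimistic bound rather than typical behaviour.

```java
import java.time.Duration;

/** Hypothetical helper, for illustration only: worst-case grace period estimate. */
final class GracePeriodEstimate {

    /** Worst case for the application: every shutdown phase runs until its timeout. */
    static Duration worstCaseShutdown(int phases, Duration timeoutPerPhase) {
        return timeoutPerPhase.multipliedBy(phases);
    }

    /** Smallest grace period that covers the preStop sleep plus the worst-case shutdown. */
    static Duration minimumGracePeriod(Duration preStopSleep, int phases, Duration timeoutPerPhase) {
        return preStopSleep.plus(worstCaseShutdown(phases, timeoutPerPhase));
    }

    public static void main(String[] args) {
        // Example values from above: 10s preStop sleep, 3 phases, 30s per phase.
        Duration min = minimumGracePeriod(Duration.ofSeconds(10), 3, Duration.ofSeconds(30));
        System.out.println(min.getSeconds() + "s"); // prints "100s"
    }
}
```

For the example values this gives 100s, compared with the 45s configured, so either the grace period has to grow or the per-phase timeout has to shrink. A little headroom on top of the computed minimum is probably wise, since this sketch ignores JVM exit time and signal delivery latency.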