After upgrading Spring Boot from 3.3.10 to 3.4.6 and Spring Cloud from 2023.0.5 to 2024.0.1, we see the following errors related to the Feign client in our production logs:

Caused by: feign.RetryableException: Connection refused executing POST http://SERVICE-NAME/Path/Path/last-session

It was difficult to find the root cause because the bug was flaky and occurred on only 1-2 instances out of 40.
After investigation, it turned out that spring-beans-6.2.7 contains a breaking change in how singleton beans are instantiated. In the new version, locks for singleton beans are omitted in some cases, which leads to inconsistent behaviour of the service. In our case, the trigger was that some Feign clients were declared with @Lazy and were not instantiated on startup.

So to fix this issue we now need to add spring.locking.strict=true to all our services (more than 400) to neutralize this breaking change. It's not clear to our company (Playtika) why it wasn't set by default!
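
For anyone applying the same workaround, here is a minimal sketch of setting the flag programmatically. It assumes the flag is picked up as a JVM system property (or from a `spring.properties` file on the classpath) and that it must be set before the application context is created; `StrictLockingLauncher` and `enableStrictLocking` are hypothetical names used only for illustration.

```java
public class StrictLockingLauncher {

    // Property name taken from the report above.
    static final String STRICT_LOCKING = "spring.locking.strict";

    /** Sets the flag unless it was already supplied, e.g. via -Dspring.locking.strict=... */
    static String enableStrictLocking() {
        if (System.getProperty(STRICT_LOCKING) == null) {
            System.setProperty(STRICT_LOCKING, "true");
        }
        return System.getProperty(STRICT_LOCKING);
    }

    public static void main(String[] args) {
        enableStrictLocking();
        // Assumption: the application context is started only after this point,
        // e.g. SpringApplication.run(MyApplication.class, args) in a Spring Boot service.
        System.out.println(STRICT_LOCKING + "=" + System.getProperty(STRICT_LOCKING));
    }
}
```

Passing `-Dspring.locking.strict=true` on the JVM command line should have the same effect without touching the code.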

Comment From: kptfh

https://github.com/OpenFeign/feign/issues/1868

Comment From: bclozel

I'm not aware of any problem in Spring Framework nor any change that could have caused this. I'm closing this now because I don't see anything actionable on our side.

Comment From: kptfh

This commit in the Spring Beans module introduces a breaking change.

In the getObjectFromFactoryBean method, there is now no strict lock while a singleton object is instantiated from a thread other than the main thread. All @Lazy objects fall into this case.

If a FactoryBean (like FeignClientFactoryBean) is not thread-safe, this may lead to incorrectly instantiated beans. FeignClientFactoryBean is not thread-safe.

I just want to stress that all @Lazy beans with factories that are not thread-safe may now be instantiated incorrectly. Given how widely @FeignClient is used, the impact on services upgrading to the new Spring Boot may be significant. It is good practice to keep code backward compatible, so spring.locking.strict=true should be the default.

Comment From: bclozel

@kptfh now that we know that this is not strictly related to feign, can you comment on #34902 then and explain the factorybean setup?

Comment From: kptfh

You are right. It's not related to Feign alone. It's related to all factory beans that are not thread-safe. And since there is no requirement for factory beans to be thread-safe, it may potentially affect a lot of beans (not only Feign).

The problem is that object = doGetObjectFromFactoryBean(factory, requiredType, beanName) is no longer protected by any lock, so concurrent calls may cause side effects in factory beans (a factory bean may change its state in an unpredictable way).
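
To illustrate the kind of race I mean, here is a minimal sketch in plain Java with no Spring involved. `UnsafeFactory` is a hypothetical stand-in for a factory bean that keeps unsynchronized state, not the actual FeignClientFactoryBean code, and the `strictLock` flag only imitates the difference between calling the factory under one shared lock and calling it without a lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FactoryLockingSketch {

    /** Hypothetical factory that caches its product without any synchronization. */
    static class UnsafeFactory {
        private Object cached;                          // plain field: no volatile, no lock
        final AtomicInteger creations = new AtomicInteger();

        Object getObject() {
            if (cached == null) {                       // check-then-act, racy when unlocked
                creations.incrementAndGet();
                cached = new Object();
            }
            return cached;
        }
    }

    /** Calls getObject() from several threads at once and returns how many objects were created. */
    static int concurrentCreations(int threads, boolean strictLock) {
        UnsafeFactory factory = new UnsafeFactory();
        Object lock = new Object();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();                      // release all threads together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (strictLock) {
                    synchronized (lock) {               // callers are serialized
                        factory.getObject();
                    }
                } else {
                    factory.getObject();                // callers may interleave
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return factory.creations.get();
    }

    public static void main(String[] args) {
        System.out.println("with lock:    " + concurrentCreations(16, true));
        System.out.println("without lock: " + concurrentCreations(16, false));
    }
}
```

With the shared lock the factory creates exactly one object. Without it the count is at least one and can be higher on some runs, which matches the flaky behaviour we observed on only a few instances.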

Here is the config that leads to concurrent calls to FeignClientFactoryBean:

    @Bean
    public EvaluatorService evaluatorService(EvaluationClient client) {
        return new EvaluatorService(client);
    }

    @Bean
    public HealthIndicator clientWarmupHealthIndicator(@Lazy EvaluationClient client) {
        return new ClientWarmupHealthIndicator(client);
    }

I'm not sure whether #34902 is related to my issue.