Netty connection in pool reset by peer in Spring Cloud Gateway Application

We use Spring Cloud Gateway with Spring Security OAuth2.

Spring Boot version: 2.2.1; Spring Cloud version: Hoxton.RELEASE; Netty: reactor-netty-0.9.2.RELEASE

The Spring Security OAuth2 filter posts a request to the UAA server, with a load balancer in between. Initially the connection works fine, but after about 10-15 minutes the Netty client used by the Spring OAuth2 filter gets "connection reset by peer"; the other end is the load balancer. If we try again, it works. But after another 10-15 minutes, the problem happens again.

The log shows that reactor.netty.resources.PooledConnectionProvider hands out a channel it considers active although it is actually no longer usable. We are looking for a way to work around this. So far we have tried customizing the Netty ClientHttpConnector (code below) to disable the pool, without success. The bean seems to have no effect on the Netty client used by Spring Security.

I uploaded a demo project; see the GitHub link below: https://github.com/spring-cloud/spring-cloud-gateway/issues/1493

That project cannot reproduce the problem locally because there is no load balancer in between, but it shows that customizing the ClientHttpConnector to disable the Netty pool has no effect on the client used by Spring Security.

@Bean
public ClientHttpConnector clientHttpConnector() {
    // ConnectionProvider.newConnection() opens a new connection per request (no pooling)
    return new ReactorClientHttpConnector(HttpClient.create(ConnectionProvider.newConnection()));
}

From the Netty side, is there any solution or workaround for this issue? Our project is due in production soon, so we are open to any workaround. If you could point out where in the Netty code we can disable the pool as a temporary solution, we could move forward and revisit this issue later.

Here are the logs:

2019-12-19 18:07:53.460 DEBUG 7 --- [reactor-http-epoll-1] o.s.w.r.f.client.ExchangeFunctions       : [1b45ced7] HTTP POST https://server-name/oauth/token
2019-12-19 18:07:53.462 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 - R:xxx.xxx.xxx.xxx:443] Channel acquired, now 1 active connections and 1 inactive connections
2019-12-19 18:07:53.463 DEBUG 7 --- [reactor-http-epoll-4] r.netty.http.client.HttpClientConnect    : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 - R:xxx.xxx.xxx.xxx:443] Handler is being applied: {uri=https://server-name/oauth/token, method=POST}
2019-12-19 18:07:53.466 DEBUG 7 --- [reactor-http-epoll-4] o.s.http.codec.FormHttpMessageWriter     : [1b45ced7] Writing form fields [grant_type, code, redirect_uri] (content masked)
2019-12-19 18:07:53.471 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 - R:xxx.xxx.xxx.xxx:443] onStateChange(POST{uri=/oauth/token, connection=PooledConnection{channel=[id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 - R:xxx.xxx.xxx.xxx:443]}}, [request_sent])
2019-12-19 18:07:53.474 DEBUG 7 --- [reactor-http-epoll-4] r.netty.http.client.HttpClientConnect    : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 - R:xxx.xxx.xxx.xxx:443] The connection observed an error, the request will be retried

io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer

2019-12-19 18:07:53.475 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x9bf92ed9, L:/xxx.xxx.xxx.xxx:38600 - R:xxx.xxx.xxx.xxx:443] Channel acquired, now 2 active connections and 0 inactive connections
2019-12-19 18:07:53.476 DEBUG 7 --- [reactor-http-epoll-4] r.netty.http.client.HttpClientConnect    : [id: 0x9bf92ed9, L:/xxx.xxx.xxx.xxx:38600 - R:xxx.xxx.xxx.xxx:443] Handler is being applied: {uri=https://server-name/oauth/token, method=POST}
2019-12-19 18:07:53.479 DEBUG 7 --- [reactor-http-epoll-4] o.s.http.codec.FormHttpMessageWriter     : [1b45ced7] Writing form fields [grant_type, code, redirect_uri] (content masked)
2019-12-19 18:07:53.492 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x9bf92ed9, L:/xxx.xxx.xxx.xxx:38600 - R:xxx.xxx.xxx.xxx:443] onStateChange(POST{uri=/oauth/token, connection=PooledConnection{channel=[id: 0x9bf92ed9, L:/xxx.xxx.xxx.xxx:38600 - R:xxx.xxx.xxx.xxx:443]}}, [request_sent])
2019-12-19 18:07:53.507 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 ! R:xxx.xxx.xxx.xxx:443] Channel closed, now 2 active connections and 0 inactive connections
2019-12-19 18:07:53.517 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 ! R:xxx.xxx.xxx.xxx:443] onStateChange(POST{uri=/oauth/token, connection=PooledConnection{channel=[id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 ! R:xxx.xxx.xxx.xxx:443]}}, [disconnecting])
2019-12-19 18:07:53.518 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 ! R:xxx.xxx.xxx.xxx:443] Releasing channel
2019-12-19 18:07:53.523 DEBUG 7 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider   : [id: 0x2a4d88b0, L:/xxx.xxx.xxx.xxx:38628 ! R:xxx.xxx.xxx.xxx:443] Channel cleaned, now 1 active connections and 0 inactive connections
2019-12-19 18:07:53.532 DEBUG 7 --- [reactor-http-epoll-4] r.netty.http.client.HttpClientConnect    : [id: 0x9bf92ed9, L:/xxx.xxx.xxx.xxx:38600 - R:xxx.xxx.xxx.xxx:443] The connection observed an error, the request will be retried

io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer

2019-12-19 18:07:53.543 ERROR 7 --- [reactor-http-epoll-4] o.s.w.s.adapter.HttpWebHandlerAdapter    : [177851e0] 500 Server Error for HTTP GET "/login/oauth2/code/myapp?code=Jo87qf1XIC&state=yMYngnlcVyuy8BgpgFGxZDwo0nmwYg7pa4mcvROTtrU%3D"

io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
        Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
        |_ checkpoint  Request to POST https://server-name/oauth/token [DefaultWebClient]
        |_ checkpoint  org.springframework.security.oauth2.client.web.server.authentication.OAuth2LoginAuthenticationWebFilter [DefaultWebFilterChain]

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 32 (16 by maintainers)

Top GitHub Comments

4 reactions
violetagg commented, Jan 20, 2020

We are going to expose a configuration property for switching the pool's lease strategy from FIFO to LIFO (FIFO is the default); see #962.

Using the LIFO leasing strategy together with a max idle timeout will give you the behaviour that you need:

  • When a connection is acquired, it will be the most recently used one.
    • If the max idle timeout has been reached, this connection will be closed. Because it was the most recently used, all the other (non-active) connections in the pool have been idle at least as long and will also be closed by the max idle timeout. A new connection will be created and used for the request.
    • If the connection is closed by the remote peer between acquire and actual use, Connection reset by peer will be received and we will retry the request. Because this connection was the most recently used and was closed by the remote peer, all the other (non-active) connections in the pool will also have been closed, so again a new connection will be used for the second attempt.
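
As a rough illustration of the LIFO + max idle timeout setup described above (not taken from the thread), the configuration might look something like the sketch below. It assumes a Reactor Netty version that offers ConnectionProvider.builder(...) with lifo() and maxIdleTime(...), which is newer than the 0.9.2 release mentioned in the question. The pool name, the 5-minute value, and the class name are hypothetical. Whether Spring Security's OAuth2 login picks up a ReactiveOAuth2AccessTokenResponseClient bean automatically depends on the Spring Security version, so treat that part as an assumption to verify.

```java
import java.time.Duration;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.security.oauth2.client.endpoint.OAuth2AuthorizationCodeGrantRequest;
import org.springframework.security.oauth2.client.endpoint.ReactiveOAuth2AccessTokenResponseClient;
import org.springframework.security.oauth2.client.endpoint.WebClientReactiveAuthorizationCodeTokenResponseClient;
import org.springframework.web.reactive.function.client.WebClient;

import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

// Hypothetical configuration class; names and values are illustrative only.
@Configuration
public class OAuth2TokenClientConfig {

    // Assumption: 5 minutes is shorter than the load balancer's idle timeout.
    private static final Duration MAX_IDLE = Duration.ofMinutes(5);

    // Builds a WebClient backed by a LIFO pool that drops connections idle for too long.
    private WebClient tokenWebClient() {
        ConnectionProvider provider = ConnectionProvider.builder("oauth2-token-pool") // hypothetical pool name
                .maxIdleTime(MAX_IDLE) // close connections that sat idle longer than this
                .lifo()                // lease the most recently used connection first
                .build();
        return WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(HttpClient.create(provider)))
                .build();
    }

    // Assumption: the OAuth2 login filter uses this bean for the token request if it is present.
    @Bean
    public ReactiveOAuth2AccessTokenResponseClient<OAuth2AuthorizationCodeGrantRequest> tokenResponseClient() {
        WebClientReactiveAuthorizationCodeTokenResponseClient client =
                new WebClientReactiveAuthorizationCodeTokenResponseClient();
        client.setWebClient(tokenWebClient());
        return client;
    }
}
```

The max idle time is only useful if it is set below whatever idle timeout the load balancer enforces; that value is not stated in the issue.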
1 reaction
violetagg commented, Oct 24, 2022

@sgc109 Connection doesn’t go out of the connection pool if max idle time is reached or this is what happens

or it keeps trying to acquire connections until it finds one for which less than 'maxIdleTime' has passed since it was last used, or until there is no connection left in the pool (in which case a new connection is created)
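
To make the acquire behaviour described in these comments concrete, here is a small self-contained simulation. It is not Reactor Netty code: the class LifoIdlePoolSketch, its Conn type, and the explicit timestamps are hypothetical stand-ins. It only models the idea of popping the most recently used idle connection and discarding any that have been idle for maxIdleTime or longer, until a usable one is found or the pool is empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a LIFO pool with a max idle timeout; not the Reactor Netty implementation.
class LifoIdlePoolSketch {

    /** Stand-in for a pooled connection; it only records when it was last released. */
    static final class Conn {
        final int id;
        final long lastUsedMillis;

        Conn(int id, long lastUsedMillis) {
            this.id = id;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final Deque<Conn> idle = new ArrayDeque<>();
    private final long maxIdleMillis;
    private int nextId = 1;
    int evicted = 0; // how many idle connections were discarded during acquire

    LifoIdlePoolSketch(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Returns a connection to the pool; the most recently released one sits on top. */
    void release(Conn c, long nowMillis) {
        idle.push(new Conn(c.id, nowMillis));
    }

    /** Pops idle connections, newest first, until one is fresh enough; otherwise creates a new one. */
    Conn acquire(long nowMillis) {
        while (!idle.isEmpty()) {
            Conn c = idle.pop();
            if (nowMillis - c.lastUsedMillis < maxIdleMillis) {
                return c;
            }
            evicted++; // idle for too long: discard instead of handing it out
        }
        return new Conn(nextId++, nowMillis);
    }

    public static void main(String[] args) {
        LifoIdlePoolSketch pool = new LifoIdlePoolSketch(1000);
        Conn first = pool.acquire(0);
        Conn second = pool.acquire(0);
        pool.release(first, 100);
        pool.release(second, 200);

        Conn reused = pool.acquire(500);   // the most recently used connection is still fresh
        System.out.println("reused id=" + reused.id);
        pool.release(reused, 600);

        Conn fresh = pool.acquire(5000);   // every idle connection has exceeded the max idle time
        System.out.println("new id=" + fresh.id + ", evicted=" + pool.evicted);
    }
}
```

With LIFO ordering, once the newest idle connection has exceeded the timeout, every connection beneath it has too, which is why a single acquire ends up clearing the idle part of the pool in this model.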
