I have an issue in my AWS system. Every few requests, one takes almost exactly 1 minute and 30 seconds to answer. By "a few" I mean roughly one in every 5 to 25 requests. Normally, if you cancel the slow request and send it again, it answers fast. I also noticed this happens with ANY request, not only specific ones. The servers and back-end do not look overloaded. The system is as follows:
ALB with sticky sessions | 2 Web servers | DB on RDS
With curl, the system responds fine most of the time, but when a request is slow, this is the timing output:
time_namelookup: 0.004136
time_connect: 130.117558
time_appconnect: 130.125254
time_pretransfer: 130.125340
time_redirect: 0.000000
time_starttransfer: 130.172553
----------
time_total: 130.172615
Aside from time_connect, the request is fine in the sense that the page loads normally once the connection is established. The normal response time of the system is under 0.5 seconds.
I was reading about this, and the docs describe time_connect as follows:
"time_connect is the TCP three-way handshake from the client’s perspective. It ends just after the client sends the ACK - it doesn't include the time taken for that ACK to reach the server. It should be close to the round-trip time (RTT) to the server. In this example, RTT looks to be about 200 ms."
This was taken from here.
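Since the delay sits entirely in time_connect, one thing I could do is time the bare TCP handshake against each IP address the ALB name resolves to, to see whether the slow connects cluster on one address. Below is a minimal sketch of that idea, not a definitive diagnostic. The function names, the 1-second "slow" threshold, and the 150-second timeout (chosen only so a 130-second handshake is not cut off) are my own assumptions.

```python
import socket
import time


def resolve_ipv4(host, port):
    """Return the distinct IPv4 addresses currently resolved for host."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def time_connect(ip, port, timeout=150.0):
    """Time one TCP handshake to ip:port; return (seconds, error or None)."""
    start = time.monotonic()
    try:
        # create_connection returns once the handshake has completed.
        with socket.create_connection((ip, port), timeout=timeout):
            return time.monotonic() - start, None
    except OSError as exc:
        return time.monotonic() - start, exc


def probe(host, port=443, rounds=30, slow_after=1.0):
    """Repeatedly time handshakes against every address behind host."""
    results = {}
    for _ in range(rounds):
        # Resolve on every round, because the address set can change.
        for ip in resolve_ipv4(host, port):
            seconds, error = time_connect(ip, port)
            results.setdefault(ip, []).append(seconds)
            flag = "SLOW" if seconds > slow_after else "ok"
            print(f"{ip:15s} {seconds:9.3f}s  {flag}  {error or ''}")
    return results
```

Calling `probe("my-alb.example.com")` (a placeholder name standing in for the ALB's DNS name) prints one line per connection attempt. If the slow handshakes always land on the same address, that would point at one load balancer node or its subnet rather than at the web servers or the database.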
I cannot find anything meaningful in AWS CloudWatch, the app logs, or the DB monitoring. Any ideas about what I should look into or how to troubleshoot this issue?