Observation
In the error log of one of the services I manage, the following error occurred occasionally under heavy workload:
redis.exceptions.ConnectionError: Connection closed by server.
Analysis
1. Reproduce the problem
To reproduce the problem, I ran the service locally on my laptop and connected it to a Redis instance that was also running locally. I also simplified the code and successfully reproduced the problem with a small demo script. The full error is attached below:
Traceback (most recent call last):
  File "close_connection_error.py", line 103, in comb_work
    await io_work()
  File "close_connection_error.py", line 90, in io_work
    data = await node.get('foo')
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/client.py", line 514, in execute_command
    return await conn.retry.call_with_retry(
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/retry.py", line 62, in call_with_retry
    await fail(error)
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/client.py", line 501, in _disconnect_raise
    raise error
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/client.py", line 488, in _send_command_parse_response
    return await self.parse_response(conn, command_name, **options)
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/client.py", line 535, in parse_response
    response = await connection.read_response()
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/connection.py", line 832, in read_response
    response = await self._parser.read_response(
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/connection.py", line 256, in read_response
    response = await self._read_response(disable_decoding=disable_decoding)
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/connection.py", line 298, in _read_response
    response = await self._read(length)
  File "/Users/temp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/redis/asyncio/connection.py", line 325, in _read
    raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR) from error
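The demo script itself is not reproduced in this post. As a rough illustration only, a reproduction along these lines might look like the sketch below. The names io_work, comb_work and node are taken from the traceback, but the function bodies, the readers parameter and the choice to collect exceptions instead of raising them are my assumptions, not the original code.

```python
import asyncio


async def io_work(node, key="foo"):
    """Read one key, mirroring the `node.get('foo')` call in the traceback."""
    return await node.get(key)


async def comb_work(node, readers=50, key="foo"):
    """Issue many reads concurrently and return the exceptions they raised.

    `readers` is a hypothetical knob for simulating heavy workload.
    """
    results = await asyncio.gather(
        *(io_work(node, key) for _ in range(readers)),
        return_exceptions=True,  # keep going when a single read fails
    )
    return [r for r in results if isinstance(r, Exception)]


# Possible usage against a local Redis (assumes redis-py with asyncio support):
#   import redis.asyncio as redis
#   node = redis.Redis()
#   await node.set("foo", b"x" * 5 * 1024 * 1024)  # roughly a 5MB value
#   errors = await comb_work(node)
```

Whether this actually triggers a disconnect depends on the server configuration and the load, so treat it as a starting point rather than a guaranteed reproduction.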
2. Packet capture
Although I could reproduce the error, there are still many possible reasons for losing a connection. So I decided to capture the network packets with tcpdump and analyze them with Wireshark. The capture shows many FIN packets in the middle of the traffic, usually right after requests for certain keys. On closer inspection, these keys are usually big keys.

3. To make things worse
The big keys that caused the problem were originally about 5MB each. To amplify the effect, I intentionally made them 20 times larger, and the error then occurred much more frequently. This indicates that the big keys are what lead the Redis server to close the connection.
4. Why does the Redis server close the connection?
For the common reasons why a Redis server proactively closes a connection, refer to the official documentation: Output Buffer Limits, Query Buffer Hard Limit, and Client Eviction.
In my case, the connection hit the client eviction rule and, in the extreme case (sending 500MB keys), even the query buffer hard limit.
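To see which of these limits a given server is configured with, the relevant settings can be read with CONFIG GET. The sketch below is a minimal example using redis-py's config_get. The helper names fetch_limits and parse_output_buffer_limits are hypothetical, and it assumes the server reports client-output-buffer-limit as repeated "class hard soft seconds" groups with sizes in bytes. maxmemory-clients only exists on Redis 7.0 and later, so it may be missing from the result.

```python
from typing import Dict, Tuple

# Settings behind the three mechanisms named in the official documentation.
LIMIT_SETTINGS = (
    "client-output-buffer-limit",
    "client-query-buffer-limit",
    "maxmemory-clients",
)


def fetch_limits(client) -> Dict[str, str]:
    """Return the raw CONFIG GET values for the limit settings the server knows."""
    found: Dict[str, str] = {}
    for name in LIMIT_SETTINGS:
        # config_get returns a dict; it is empty if the setting does not exist.
        found.update(client.config_get(name))
    return found


def parse_output_buffer_limits(raw: str) -> Dict[str, Tuple[int, int, int]]:
    """Split 'class hard soft seconds ...' into {class: (hard, soft, seconds)}."""
    tokens = raw.split()
    limits: Dict[str, Tuple[int, int, int]] = {}
    for i in range(0, len(tokens) - 3, 4):
        hard, soft, seconds = (int(t) for t in tokens[i + 1:i + 4])
        limits[tokens[i]] = (hard, soft, seconds)
    return limits


# Possible usage (assumes a local server and the synchronous redis-py client):
#   import redis
#   limits = fetch_limits(redis.Redis(decode_responses=True))
#   print(parse_output_buffer_limits(limits["client-output-buffer-limit"]))
```

Comparing these values with the size of the largest keys gives a quick hint about which limit a client is most likely to run into.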
Learned
- Avoid big keys. There is no reason to store that much data under a single key in a cache that is expected to be fetched frequently.
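Avoiding big keys starts with knowing where they are. The following is a minimal sketch of one way to list them, using redis-py's scan_iter and memory_usage (the SCAN and MEMORY USAGE commands). The function name find_big_keys and the 1MB default threshold are arbitrary choices for illustration, not values from the investigation above.

```python
from typing import List, Tuple


def find_big_keys(client, threshold_bytes: int = 1024 * 1024,
                  scan_count: int = 1000) -> List[Tuple[object, int]]:
    """Return (key, size_in_bytes) pairs at or above the threshold, largest first."""
    big = []
    # SCAN iterates incrementally, so it does not block the server like KEYS would.
    for key in client.scan_iter(count=scan_count):
        size = client.memory_usage(key)  # None if the key vanished in the meantime
        if size is not None and size >= threshold_bytes:
            big.append((key, size))
    big.sort(key=lambda item: item[1], reverse=True)
    return big


# Possible usage (assumes a local server and the synchronous redis-py client):
#   import redis
#   for key, size in find_big_keys(redis.Redis()):
#       print(key, size)
```

On a large production instance, a scan like this is better run against a replica or during quiet hours, since it issues one MEMORY USAGE call per key.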