## Problem

- Recently, while reviewing legacy project code, I found some operations on `Redis` that did not set expiration times. I suspect this is because the `Hash` structure cannot set an expiration time on individual fields while in use, so setting one later was simply forgotten.
- The commonly used structure in the code is: the `key` is related to the advertisement identifier, the `field` is the user identifier, and the `value` is custom data. Two usage patterns caused excessive memory usage (the whole-key expiry behavior behind this is illustrated at the end of this section):
  - The first pattern appends a date to the `key` of a `Hash` structure: `HSET xxx:ad_id:20220315 userid value`
  - The second pattern is an advertisement that has been taken offline, but whose corresponding `key` still remains in the database: `HSET xxx:ad_id userid value`
- Before the optimization, the database information was as follows: the instance is configured as `8*8G`, but memory usage had already exceeded `51G+`.
- Theoretically, this service should not use this much memory. Because the legacy code never set expiration times, operations could only keep adding memory to keep the service running normally.
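For context, expiration in `Redis` is attached to an entire key, not to a single hash field; a minimal `redis-cli` sketch of this behavior, reusing the hypothetical key names above:

```sh
# A TTL can only be attached to the whole hash, not to individual fields
redis-cli HSET xxx:ad_id:20220315 userid value
redis-cli EXPIRE xxx:ad_id:20220315 86400   # expires the entire hash after 1 day
redis-cli TTL xxx:ad_id:20220315            # remaining lifetime of the key
```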
## Optimization

### Querying the Distribution of Keys

- You can check for large keys through [Alibaba Cloud's offline key analysis platform](https://help.aliyun.com/document_detail/102093.htm?spm=a2c4g.11186623.0.0.58e12542DRK90l#concept-ufz-byl-jgb).
- If you are not using Alibaba Cloud services, you can scan for large keys with the command below (during a low-traffic period, you can omit the `-i` parameter, which throttles the scan):

```sh
redis-cli --bigkeys -i 0.1
```
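Once `--bigkeys` points at a suspicious pattern, you can spot-check individual keys with the built-in `HLEN` and `MEMORY USAGE` commands; a small sketch, using the hypothetical key name from above:

```sh
# How many fields does the hash hold, and roughly how much memory does it use?
redis-cli HLEN xxx:ad_id:20220315
redis-cli MEMORY USAGE xxx:ad_id:20220315 SAMPLES 0   # SAMPLES 0 = scan all fields
```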
### Refactoring Code

- Using the distribution of keys, I then fuzzy-searched the codebase to locate the relevant code operating on `Redis`.
  - The relevant write paths now set expiration times (a sketch follows this list): `EXPIRE xxx:ad_id:20220315 86400`
  - The advertisement offline event deletes the now-useless key: `UNLINK xxx:ad_id`
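A minimal sketch of what the fixed write path looks like, assuming the date-suffixed key pattern from the Problem section; the point is that every write is paired with an expiry, so a key can no longer be created without a TTL:

```sh
# Hypothetical fixed write path: HSET is always followed by EXPIRE
key="xxx:ad_id:$(date +%Y%m%d)"
redis-cli HSET "$key" userid value
redis-cli EXPIRE "$key" 86400   # one day, matching the date-suffix granularity
```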
### Deleting Legacy Data

- In the past, deleting a large key directly with `DEL` would block the main thread, so a script had to be written to `SCAN` and then delete incrementally, which does not block.
- If the version is greater than `4.0`, you can directly use `UNLINK xxx:ad_id:20220315`, which frees the memory asynchronously.
- To delete in bulk, execute the following (a batched variant appears after this list):

```sh
redis-cli --scan --pattern 'xxx:ad_id:*' | xargs redis-cli unlink
```
- Note: since `4.0`, `Redis` can also apply this asynchronous deletion when automatically removing expired keys (the `lazyfree-lazy-expire` option).
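One refinement to the bulk one-liner above: with many matches, `xargs` may pack a very large number of keys into a single `UNLINK` call; capping the batch size keeps each command short:

```sh
# Delete at most 100 keys per UNLINK invocation
redis-cli --scan --pattern 'xxx:ad_id:*' | xargs -n 100 redis-cli unlink
```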
### After Deleting Data, `used_memory_rss` Does Not Release

- `used_memory_rss` is the memory the operating system has actually allocated to the `Redis` process (its resident set size), while `used_memory` is the memory allocated by `Redis`'s own allocator.
- After deleting a large amount of data, you are likely to see that `used_memory_rss` is much greater than `used_memory`, which indicates a lot of memory fragmentation:

  `mem_fragmentation_ratio = used_memory_rss / used_memory`
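You can read both values and the resulting ratio in one go from `INFO memory`:

```sh
# Pull the three relevant fields out of INFO memory
redis-cli INFO memory | grep -E '^(used_memory|used_memory_rss|mem_fragmentation_ratio):'
```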
- If `mem_fragmentation_ratio` is greater than `1.5`, the memory fragmentation needs to be addressed.
  - Synchronous processing method: `MEMORY PURGE` (not recommended).
  - Asynchronous processing method:
    - Enable automatic memory fragmentation cleanup: `config set activedefrag yes`
    - Check whether automatic memory fragmentation cleanup is enabled:

      ```
      config get activedefrag
      1) "activedefrag"
      2) "no"
      ```

    - Set the thresholds (you must set this option before `Redis` will start processing): `config set active-defrag-threshold-lower 10`
The corresponding `redis.conf` settings:

```
# Enable automatic memory fragmentation cleanup (master switch)
activedefrag yes
# Start cleanup once at least 100mb of memory is fragmented
active-defrag-ignore-bytes 100mb
# Start cleanup once fragmentation exceeds 10%
active-defrag-threshold-lower 10
# If fragmentation exceeds 100%, clean up with maximum effort
active-defrag-threshold-upper 100
# Minimum percentage of CPU cycles spent on automatic cleanup
active-defrag-cycle-min 25
# Maximum percentage of CPU cycles spent on automatic cleanup
active-defrag-cycle-max 75
```
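If you changed these values at runtime with `CONFIG SET`, they can be written back to the configuration file, assuming the server was started from one:

```sh
# Persist runtime CONFIG SET changes into redis.conf
redis-cli CONFIG REWRITE
```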
- After a morning of processing, `used_memory_rss` has started to decline, but at this pace it will still take a long time to release the memory.
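A simple way to keep an eye on progress, assuming a standard shell with `watch` available:

```sh
# Re-check the fragmentation ratio once a minute while defrag runs
watch -n 60 "redis-cli INFO memory | grep mem_fragmentation_ratio"
```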
## Notes

### You Can Manually Back Up Data Before Deleting

- Check the timestamp of the last backup: `LASTSAVE`
- Start a backup: `BGSAVE`
- Then keep running `LASTSAVE` to check the timestamp. Once it is greater than the previous backup timestamp, the backup has succeeded and deletion can proceed (a scripted version of this loop follows).
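A small sketch of that backup-then-verify loop:

```sh
# Record the last save time, trigger a background save, and wait for it to finish
before=$(redis-cli LASTSAVE)
redis-cli BGSAVE
while [ "$(redis-cli LASTSAVE)" = "$before" ]; do
  sleep 1   # LASTSAVE advances only after BGSAVE completes successfully
done
echo "backup finished, safe to delete"
```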
### Expiration Handling

Redis keys expire in two ways: passively and actively.

A key is expired passively only when some client attempts to access it and finds that it has timed out.

Of course, this is not enough, as there are expired keys that will never be accessed again. These keys should be expired anyway, so Redis periodically tests a few random keys among those with an expiration set, and all keys found to be expired are removed from the keyspace.

Specifically, this is an operation that Redis performs 10 times per second:

1. Test 20 random keys from the set of keys with an associated expiration.
2. Delete all keys found to be expired.
3. If more than 25% of the keys were expired, start again from step 1.

This is a simple probabilistic algorithm that basically assumes the sample is representative of the whole keyspace, and Redis keeps expiring keys until the percentage of likely-expired keys falls below 25%.

This means that at any given moment, the maximum number of already-expired keys still using memory is at most equal to the maximum number of write operations per second divided by 4.
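A tiny illustration of passive expiry from the client's point of view:

```sh
# The key carries a 1-second TTL; after it elapses, the next access sees nothing
redis-cli SET demo_key value EX 1
sleep 2
redis-cli GET demo_key      # (nil)
redis-cli EXISTS demo_key   # 0
```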
- For example, the default cache driver used by `Laravel` is `File`, which relies on passive deletion only, and the `ThrottleRequests/throttle` middleware is enabled by default. If you are not careful, this combination can become a significant issue: a large number of cache files are written but never deleted.