If the master uses diskless synchronization and the replica uses diskless load, no RDB file is persisted on disk. Compressing the RDB costs a lot of CPU and time on both master and replica, while the network is fast and not a bottleneck. So maybe we can disable RDB compression to reduce full sync time.
I tested on AWS with two EC2 instances: the master had diskless sync enabled and the replica had diskless load enabled.
I used memtier-benchmark to generate 21GB of data in which every value is a repeated x, so the data is very easy to compress; the RDB file is about 635MB (with compression enabled, the default).
With RDB compression disabled the full sync took 36s, versus 45s with compression enabled.
Then I used redis-benchmark to generate 24GB of data with random string values, so the data is very hard to compress; the RDB is about 20GB even with compression enabled.
With RDB compression disabled the full sync took 44s, versus 75s with compression enabled.
Therefore, regardless of whether the data compresses well, disabling compression reduced the full sync time in both tests.
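As a quick check of the numbers above, the relative time saved in each test can be worked out directly from the four reported durations. The helper name `reduction` is only for illustration.

```python
def reduction(with_compression_s, without_compression_s):
    """Fraction of full sync time saved by disabling RDB compression."""
    return (with_compression_s - without_compression_s) / with_compression_s

# Durations reported above, in seconds.
compressible = reduction(45, 36)      # repeated-x values (memtier-benchmark)
incompressible = reduction(75, 44)    # random string values (redis-benchmark)

print(f"easy-to-compress data: {compressible:.1%} less time")    # 20.0% less time
print(f"hard-to-compress data: {incompressible:.1%} less time")  # 41.3% less time
```

So the saving is roughly 20% for highly compressible data and roughly 41% for incompressible data in these two runs.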
Since the rdbcompression configuration is designed for RDB files, disabling it directly does not seem like a good idea. Maybe the replica can instead ask the master to deliver an uncompressed RDB with a replconf rdb-no-compress 1 command, when the replica can use diskless load.
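To make the proposal concrete, here is a minimal sketch of what such a command would look like on the wire, assuming it is sent like any other command as a RESP array of bulk strings. Note that `rdb-no-compress` is only the option name proposed in this issue, not something this sketch confirms exists, and `encode_command` is a hypothetical helper.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]                 # array header: number of arguments
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: length, payload
    return b"".join(out)

# "rdb-no-compress" is the option proposed in this issue (an assumption here).
payload = encode_command("REPLCONF", "rdb-no-compress", "1")
print(payload)
```

The replica would send this during the replication handshake, before requesting the full sync, so the master knows the preference when it forks to produce the RDB.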
Comment From: tezc
If we place the master and replica on the same machine, the no-compression method becomes 3-4 times faster. It suggests that bandwidth is the main bottleneck now. The above tests show 20-40% gains, but I suspect we might see even better numbers on networks with higher bandwidth between nodes.
The downside is that disabling compression increases network traffic, but as it gives a 20-40% boost (and potentially more under better conditions), this seems like a good trade-off.
@oranagra wdyt?
Comment From: oranagra
i agree. i wonder if it should depend on diskless-load (which isn't really active normally). or maybe it's beneficial even with disk-based. if it isn't beneficial in disk-based, the replica will have to send that tip depending on whether or not it can use diskless load (complicated). another thing to consider is master-slave replication over slower networks, some users may be using replication between different data centers, and i suspect in these cases compression is handy. in that case maybe we need to let the user control it, or we can argue that in these cases it's better to use disk based master anyway (since slow replication can extend the time of the fork)
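The fast-network versus slow-network trade-off can be illustrated with a toy model. This is only a sketch under strong assumptions: compression and transfer are treated as overlapping pipeline stages so the slower one dominates, replica-side load cost is ignored, and every number below is hypothetical rather than measured in this thread.

```python
def sync_seconds(data_gb, net_gb_per_s, compress_gb_per_s=None, ratio=1.0):
    """Toy estimate of full sync time; the slowest pipeline stage dominates."""
    if compress_gb_per_s is None:
        return data_gb / net_gb_per_s                # uncompressed: network only
    cpu_time = data_gb / compress_gb_per_s           # time to compress all data
    net_time = (data_gb / ratio) / net_gb_per_s      # time to ship compressed bytes
    return max(cpu_time, net_time)

# Hypothetical inputs: 20 GB dataset, 2x compression ratio, 0.5 GB/s compressor.
fast_plain = sync_seconds(20, net_gb_per_s=1.0)
fast_comp = sync_seconds(20, net_gb_per_s=1.0, compress_gb_per_s=0.5, ratio=2.0)
slow_plain = sync_seconds(20, net_gb_per_s=0.1)
slow_comp = sync_seconds(20, net_gb_per_s=0.1, compress_gb_per_s=0.5, ratio=2.0)

print("fast network:", fast_plain, "vs", fast_comp)
print("slow network:", slow_plain, "vs", slow_comp)
```

Under these made-up inputs, skipping compression wins on the fast link because the compressor is the bottleneck, while compression wins on the slow link because the network is.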
Comment From: ShooterIT
> maybe it's beneficial even with disk-based.
It is not easy to reach a conclusion, since this may depend on the data compression ratio. We can benefit with disk-based replication if the data is hard to compress. But what I am concerned about is breaking the rdbcompression config: if users enable this config but we generate an RDB without compression, that may be confusing.
For slower networks, I am not sure. Maybe a hidden config can be introduced? Users can disable this feature if they insist.
Comment From: ShooterIT
> diskless-load (which isn't really active normally)
yep, diskless-load is a great feature to me, it dramatically reduces full sync time, but it is not widely used. on-empty-db generally only works for the first full sync, and swapdb needs double the maxmemory. I was thinking we could introduce a new option, flush-db: the replica drops the current db data asynchronously, then diskless-loads the DB from the socket. It may bring a risk of losing data, but it provides another choice to reduce full sync time. WDYT? @oranagra @tezc
Comment From: oranagra
we can change rdbcompression from yes/no to an enum with yes/no/disk-repl. maybe that's better than creating a new config that overrides it on diskless. on the other hand, it would probably mean we wanna change the default value of that config, and that's complicated.
another issue, is that it could be someone is creating a backup using something like redis-cli --rdb, which uses diskless and saves a file, so in that case maybe they'd like it compressed, though on the other hand, maybe they'd better use a separate compression tool on the file post creation.
Comment From: ShooterIT
But even if rdbcompression is disk-repl on the master, we still need the replica not to persist this RDB file, since that may break the config on the replica side. So I think we should generate a snapshot without compression only when it exists on the network and not in files. And we don't break the behavior of redis-cli --rdb.
Comment From: oranagra
i don't think i understand your last message. but anyway, what do you suggest to do?
Comment From: ShooterIT
I mean even if the master allows an uncompressed RDB for replication, maybe we still should not generate an uncompressed RDB directly: the replica may persist the RDB in the file system while rdbcompression is yes on the replica.
So I think this feature should be enabled only when the RDB snapshot is transient (never in the file system).
On second thought, I think we don't need to introduce a new configuration or option; rdbcompression disk-repl does not seem clear. We enable this feature only when diskless-repl and diskless-load are both enabled. And for slow networks, we should not enable diskless replication, right?
Comment From: oranagra
yes, that all seems right. so we'll add a new replconf for the slave to tip the master, but we won't expose any of that to users.
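The rule the thread converges on can be sketched as two small decisions, one on each side. This is only an illustration of the agreed logic, not the actual Redis implementation; the function and parameter names are hypothetical.

```python
def replica_should_send_tip(can_use_diskless_load: bool) -> bool:
    """Replica side: tip the master only if the RDB will never touch its disk."""
    return can_use_diskless_load

def master_should_compress(rdbcompression: bool,
                           diskless_sync: bool,
                           replica_sent_tip: bool) -> bool:
    """Master side: skip compression only for a purely transient RDB stream."""
    if diskless_sync and replica_sent_tip:
        return False          # RDB lives only on the socket, so skip compression
    return rdbcompression     # otherwise honor the user's rdbcompression setting

# Both sides diskless: the stream is sent uncompressed even if rdbcompression is yes.
print(master_should_compress(True, diskless_sync=True, replica_sent_tip=True))   # False
# Disk-based master: the rdbcompression config is respected.
print(master_should_compress(True, diskless_sync=False, replica_sent_tip=True))  # True
```

Because the user-facing rdbcompression setting is still honored whenever a file may be written, no new configuration needs to be exposed.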