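Background
This is a problem I ran into right after joining Shopee as a new graduate. The WAF on Shopee's private cloud lets internal users configure IP allowlist and blocklist rules, and all rules are stored in MySQL. I took over the system from a colleague who had already left the company. I soon found that for 5 minutes after every rule change, reads were unstable: the new rule would sometimes show up and sometimes not, and users frequently reported it. The root cause turned out to be an in-memory cache in the service code. The service was deployed as two instances, and the instances did not synchronize write requests with each other. If a write and the read that followed it were routed to different instances, the read could not see the latest data. The in-memory cache had an expiry time of 5 minutes, which explains the 5-minute window.
The service's operations log showed that, before I joined, it had been scaled out from a single instance to two. The previous developer had always run WAF as a single instance, so the problem never appeared. After he left, the colleague who did the scale-out probably did not realize it would cause inconsistency. And so the problem landed on me.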
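The stale read described above can be reproduced with a small simulation. This is only a sketch under assumptions: the `instance` type, the map standing in for MySQL, and the integer clock (seconds) are all hypothetical and are not taken from the original service.

```go
package main

import "fmt"

// entry is a cached value together with the time (in seconds) at which it expires.
type entry struct {
	data      string
	expiresAt int64
}

// instance models one replica: a private TTL cache in front of a shared DB.
type instance struct {
	cache map[string]entry
	db    map[string]string // hypothetical stand-in for the shared MySQL table
	ttl   int64             // cache lifetime in seconds
}

func newInstance(db map[string]string, ttl int64) *instance {
	return &instance{cache: map[string]entry{}, db: db, ttl: ttl}
}

// Write updates the shared DB and this replica's cache only.
func (i *instance) Write(key, data string, now int64) {
	i.db[key] = data
	i.cache[key] = entry{data: data, expiresAt: now + i.ttl}
}

// Read serves from the private cache until the entry expires.
func (i *instance) Read(key string, now int64) string {
	if e, ok := i.cache[key]; ok && now < e.expiresAt {
		return e.data
	}
	data := i.db[key]
	i.cache[key] = entry{data: data, expiresAt: now + i.ttl}
	return data
}

func main() {
	db := map[string]string{"rule": "v1"}
	a, b := newInstance(db, 300), newInstance(db, 300)

	b.Read("rule", 0)                // B caches the old rule at t=0
	a.Write("rule", "v2", 1)         // the write is routed to A at t=1
	fmt.Println(a.Read("rule", 2))   // A serves the new rule
	fmt.Println(b.Read("rule", 2))   // B still serves the old rule
	fmt.Println(b.Read("rule", 301)) // B catches up only after its entry expires
}
```

With a single instance every read goes through the cache that the write just updated, which is why the problem only appeared after the scale-out.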
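Introducing Redis
My first idea was to replace the in-memory cache with Redis. During the canary rollout, however, Redis bandwidth was saturated. The cause was that some rules carry very long lists of blocked IPs, so every read transferred a large amount of data.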
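A back-of-the-envelope estimate shows how quickly this adds up. Every number below is a hypothetical placeholder chosen for illustration; the original incident's traffic and rule sizes are not known.

```go
package main

import "fmt"

// bytesPerSecond estimates Redis egress when every read request fetches
// payloadBytes from Redis.
func bytesPerSecond(readsPerSecond, payloadBytes int64) int64 {
	return readsPerSecond * payloadBytes
}

func main() {
	const (
		readsPerSecond int64 = 2000  // hypothetical read rate
		ipsPerRule     int64 = 50000 // hypothetical length of one blocked-IP list
		bytesPerIP     int64 = 16    // hypothetical: roughly one IPv4 string plus separator
		smallValue     int64 = 16    // a short fixed-size value, for contrast
	)

	full := bytesPerSecond(readsPerSecond, ipsPerRule*bytesPerIP)
	small := bytesPerSecond(readsPerSecond, smallValue)

	fmt.Printf("full rule per read:   %d bytes/s\n", full)
	fmt.Printf("small value per read: %d bytes/s\n", small)
	fmt.Printf("ratio:                %dx\n", full/small)
}
```

Under these assumed numbers the full-payload reads cost tens of thousands of times more bandwidth than reading a short value, so the size of what travels over the wire on each read matters far more than the read rate.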
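Final solution
WAF rules are read far more often than they are written, so most of the time the data read from Redis has not changed. An experienced colleague suggested keeping only a version number in Redis and leaving the rule data in the in-memory cache. After going over it several times, the final design is as follows.
Read and write logic: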
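- Writes are simple. Use the current microsecond timestamp as the new version number, then do four things: write the DB, update the version number in Redis, update the data in the local in-memory cache, and update the version number in the local in-memory cache. The order of the four steps can be swapped.
- Reads are slightly more involved. First read the version number from Redis. If the local version number matches it (the overwhelmingly common case), return the data straight from the local in-memory cache. Two cases need separate handling: the Redis version differs from the local one, and the Redis key is missing (expired). The pseudocode below shows how each is handled.
- If several writes land within the same microsecond, inconsistency is still possible. Shopee WAF is not updated that frequently in practice, so I did not handle this case. That said, the timestamp is only ever compared for equality, never for ordering, so it can be replaced by any distributed unique ID scheme.
- The version number is not exposed to users. The same version number may in fact be observed with different rule data, but this does not break eventual consistency.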
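Because the version is only compared for equality, one possible replacement for the timestamp is a random token. The sketch below is an assumption, not what the original system used: `newVersion` is a hypothetical helper that draws 128 random bits from Go's `crypto/rand` and hex-encodes them.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newVersion returns a random 128-bit token as a 32-character hex string.
// Two calls collide only with negligible probability, and no ordering
// between tokens is implied or needed.
func newVersion() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	return hex.EncodeToString(b[:]), nil
}

func main() {
	v1, err := newVersion()
	if err != nil {
		panic(err)
	}
	v2, err := newVersion()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(v1), v1 != v2)
}
```

A token like this removes the same-microsecond collision between two writers, at the cost of a slightly longer value stored in Redis.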
    func Set(key, data) {
        newVer := time()                      // current microsecond timestamp
        localCacheVer.Store(newVer)
        localCacheData.Store(data)
        WriteMySQL(key, data)
        redis.Set(key, newVer, expire=5min)
    }

    func Read(key) Data {
        ver := redis.Get(key)
        if ver != nil {
            if localCacheVer.Load() == ver {
                // Local cache is up-to-date, just use it
                return localCacheData.Load()
            }
            // Version mismatch: fall through and reload from MySQL
        } else {
            // This version has expired
            ver = time()                      // assign, do not shadow the outer ver
            res := redis.SetNX(key, ver, expire=5min)
            if res == false {
                // Another instance has proceeded, use that version
                ver = redis.Get(key)
            }
        }
        data := ReadFromMySQL(key)
        localCacheVer.Store(ver)
        localCacheData.Store(data)
        return data
    }
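The pseudocode can be turned into a runnable, single-goroutine sketch by swapping Redis and MySQL for in-memory fakes. Everything named here is an assumption made for illustration: the `versionStore` interface, the `memStore` fake (where expiry is simulated by deleting the key), the `counterVersions` generator used in place of timestamps, and the per-key local maps. It is not the production code, and it does no locking.

```go
package main

import "fmt"

// versionStore is the small subset of Redis this sketch needs (hypothetical interface).
type versionStore interface {
	Get(key string) (ver string, ok bool)
	Set(key, ver string)
	SetNX(key, ver string) bool // reports whether the key was newly set
}

// memStore is an in-memory stand-in for Redis; delete a key to simulate expiry.
type memStore map[string]string

func (s memStore) Get(key string) (string, bool) { v, ok := s[key]; return v, ok }
func (s memStore) Set(key, ver string)           { s[key] = ver }
func (s memStore) SetNX(key, ver string) bool {
	if _, exists := s[key]; exists {
		return false
	}
	s[key] = ver
	return true
}

// counterVersions returns a generator of distinct version strings, used here
// instead of timestamps so the demo is deterministic.
func counterVersions() func() string {
	n := 0
	return func() string { n++; return fmt.Sprint(n) }
}

// instance is one replica: private cache, shared version store, shared DB.
type instance struct {
	localVer, localData map[string]string
	redis               versionStore
	db                  map[string]string // stand-in for MySQL
	newVer              func() string
}

func newInstance(r versionStore, db map[string]string, gen func() string) *instance {
	return &instance{map[string]string{}, map[string]string{}, r, db, gen}
}

// Write stores the data, publishes a new version, and refreshes the local cache.
func (i *instance) Write(key, data string) {
	ver := i.newVer()
	i.db[key] = data
	i.redis.Set(key, ver)
	i.localVer[key], i.localData[key] = ver, data
}

// Read returns cached data when the local version matches the shared one,
// and reloads from the DB otherwise.
func (i *instance) Read(key string) string {
	ver, ok := i.redis.Get(key)
	if ok {
		if lv, cached := i.localVer[key]; cached && lv == ver {
			return i.localData[key] // local cache is up to date
		}
	} else {
		ver = i.newVer() // version key expired: try to publish a fresh one
		if !i.redis.SetNX(key, ver) {
			ver, _ = i.redis.Get(key) // another instance won; adopt its version
		}
	}
	data := i.db[key]
	i.localVer[key], i.localData[key] = ver, data
	return data
}

func main() {
	redis, db, gen := memStore{}, map[string]string{"rule": "v1"}, counterVersions()
	a, b := newInstance(redis, db, gen), newInstance(redis, db, gen)

	fmt.Println(b.Read("rule")) // version key missing: publish one, load from DB
	a.Write("rule", "v2")       // the write lands on the other instance
	fmt.Println(b.Read("rule")) // version mismatch: reload from DB
	fmt.Println(a.Read("rule")) // versions match: served from local cache
}
```

Only the short version string crosses the network on the common path, which is what removes the bandwidth problem while still letting every instance notice a write made elsewhere.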
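Formal verification with TLA+
I happened to be teaching myself TLA+ at the time, so I wrote a TLA+ spec for this design. As expected, it passed the eventual consistency check. While writing this summary I came to suspect the design is actually linearizable, but I have not verified that.
The original problem, the API returning inconsistent data for 5 minutes after a change, was resolved.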
    // ================ tla file ================
    ---- MODULE waf ----
    EXTENDS Integers, TLC
    VARIABLE redisVer, localVer, pc, threadVer, DBData, localData, threadData
    CONSTANTS DataDomain, ProcSet, r1, r2, r3, t1, t2, t3
    vars == << redisVer, localVer, pc, threadVer, localData, threadData, DBData>>

    Init ==
        /\ redisVer = -1
        /\ localVer = -1
        /\ localData = ""
        /\ DBData = ""
        /\ threadVer = [self \in ProcSet |-> -1]
        /\ pc = [self \in ProcSet |-> "A"]
        /\ threadData = [self \in ProcSet |-> ""]

    RedisExpire ==
        /\ threadData = [self \in ProcSet |-> DBData]
        /\ redisVer' = -1
        /\ DBData' \in DataDomain
        /\ UNCHANGED <<localVer, threadVer, localData, threadData, pc>>

    ReadRedis(self) ==
        /\ pc[self] = "A"
        /\ threadVer' = [threadVer EXCEPT ![self] = redisVer]
        /\ \/ /\ redisVer = -1
              /\ pc' = [pc EXCEPT ![self] = "C"]
           \/ /\ redisVer # -1
              /\ pc' = [pc EXCEPT ![self] = "F"]
        /\ UNCHANGED <<localVer, redisVer, localData, threadData, DBData>>

    SetRedis(self) ==
        /\ pc[self] = "C"
        /\ \/ /\ redisVer # -1    \* SetNX failed => use existing redis
              /\ redisVer' = redisVer
              /\ threadVer' = [threadVer EXCEPT ![self] = redisVer]    \* Not strictly the same!
           \/ /\ redisVer = -1    \* SetNX ok => change redis
              /\ redisVer' \in 1600012345..1600012350
              /\ threadVer' = [threadVer EXCEPT ![self] = redisVer']
        /\ pc' = [pc EXCEPT ![self] = "I"]
        /\ UNCHANGED <<localVer, localData, threadData, DBData>>

    CheckLocal(self) ==
        /\ pc[self] = "F"
        /\ \/ /\ localVer = threadVer[self]    \* Normal case
              /\ threadData' = [threadData EXCEPT ![self] = localData]
              /\ pc' = [pc EXCEPT ![self] = "H"]
           \/ /\ localVer # threadVer[self]
              /\ pc' = [pc EXCEPT ![self] = "I"]
              /\ threadData' = threadData
        /\ UNCHANGED <<redisVer, localVer, localData, threadVer, DBData>>

    SetLocal(self) ==
        /\ pc[self] = "I"
        /\ localVer' = threadVer[self]
        /\ localData' = DBData
        /\ threadData' = [threadData EXCEPT ![self] = DBData]
        /\ pc' = [pc EXCEPT ![self] = "H"]
        /\ UNCHANGED <<redisVer, threadVer, DBData>>

    ReturnResult(self) ==
        /\ pc[self] = "H"
        /\ pc' = [pc EXCEPT ![self] = "Done"]
        /\ UNCHANGED <<redisVer, localVer, threadVer, localData, threadData, DBData>>

    Again(self) ==
        /\ pc[self] = "Done"
        /\ pc' = [pc EXCEPT ![self] = "A"]
        /\ UNCHANGED <<redisVer, localVer, threadVer, localData, threadData, DBData>>

    Terminating ==
        /\ \A self \in ProcSet: pc[self] = "Done"
        /\ UNCHANGED vars

    Proceed(t) ==
        \/ ReadRedis(t)
        \/ SetRedis(t)
        \/ CheckLocal(t)
        \/ SetLocal(t)
        \/ ReturnResult(t)
        \/ Again(t)

    Next ==
        \/ RedisExpire
        \/ \E t \in ProcSet: Proceed(t)

    FairForEveryone == \A t \in ProcSet: SF_vars(Proceed(t))

    Spec ==
        /\ Init
        /\ [][Next]_vars
        /\ FairForEveryone

    symm == Permutations({r1, r2, r3}) \union Permutations({t1, t2, t3})

    EventualCons == \A v \in DataDomain: DBData = v ~> threadData = [t \in ProcSet |-> v]

    ECSpec == Spec /\ EventualCons

    // ======= cfg file ========
    SPECIFICATION Spec
    CONSTANTS
        DataDomain = {r1, r2}
        r1 = r1
        r2 = r2
        r3 = r3
        ProcSet = {t1, t2, t3}
        t1 = t1
        t2 = t2
        t3 = t3
    SYMMETRY symm
    PROPERTIES EventualCons