Cache

Posted on May 23, 2026 by [bLueriVerLHR]

General

All cache replacement policies can be represented by following operations:

  1. Insertion: Where to insert a new cache line when it is decided to be kept. (How to decide also metters)
  2. Promotion: How to update the status of cache line after a cache hit.
  3. Aging: How to update the status of cache line after a cache miss.
  4. Eviction: Which cache line should be removed to make space for a new one.

While implementing caches, I noticed that some state-of-the-art (SOTA) policies introduce additional operations related to global history or custom metrics. Basically, the four basic operations are used for better understanding the policies. When actual implementing a cache, the only two interfaces are required: insert and fetch.

Depending on the status recorded for each cache line — such as recency, frequency, historical state, and so on — many variants are possible.

NameInsertionPromotionAgingEvictionStatus
Least Recent Used (LRU)MRUMRUMove 1 position towards LRULRURecency
Most Recent Used (MRU)MRUMRUMove 1 position towards LRUMRU1Recency
Early Eviction LRU (EELRU)MRUMRUMove 1 position towards LRUMRU or e-th recently used line 2Recency
Segmented LRU (Seg-LRU/SLRU)3MRU in probationary segmentMRU in protected segmentMove 1 position towards LRULRU from probationary segmentRecency
Bimodal Insertion Policy (BIP)MRU with probability $\epsilon$4MRUMove 1 position towards LRULRURecency
Static Re-Reference Interval Prediction (SRRIP)RRPV=2RRPV=0Increment all RRPVs (if no line with RRPV=3)RRPV=3RRPV
Bimodal re-reference interval prediction policy (BRRIP)Default RRPV=3, RRPV=2 with probability $\epsilon$RRPV=0Increment all RRPVs (if no line with RRPV=3)RRPV=3RRPV
Protecting Distance-Based Policy (PDP)5RPD=PDRPD=PDDecrease RPDRPD=0RPD
Genetic Insertion and Promotion for PseudoLRU Replacement (GIPPR)6IPV[k]IPV[i] for i-th statei+1 for i-th statek-1IPV
Shepherd Cache (SC)7Recency
Least Frequent Used (LFU)Frequency=0Increase Frequencyn/aLFUFrequency
Frequency-Based Replacement (FBR)MRU, Frequency = 0MRU, Frequency++ if not in new sectionMove 1 position towards LRULFU line in old sectionRecency, Frequency
Least Recently/Frequently Used (LRFU)8$CRF(b)=F(0)$, $LAST(b)=t_c$$CRF(b)=F(0)+F(t_c–LAST(b))*CRF_{last}(b)$, $LAST(b)=t_c$$t_c=t_c+1$Line with min CRF valueCRF

Least Recent Used

The policy evicts least recent used cache lines. It face two main challenges: thrashing and scanning.

  1. The thrashing workloads refer to the size of working set exceeds the cache size. For example, loop on arr[N] again and again with cache[M] where N > M.
  2. The scanning workloads refer to a burst of references of data that is not reused. For example, loop on arr[N] only once.

The variants of Recency-Based policies can be realized by modifying the insertion policy, while keeping the eviction policy unchanged (evict the line in the LRU position).

Static Re-Reference Interval Prediction

The policy models the cache replacement problem into a Re-Reference Interval Prediction (RRIP) problem. It uses Re-Reference Prediction Value (RRPV) to predict the re-reference interval of each cache line. Usually, the RRPV is implemented with a 2-bit value. The update of RRPV can be written into a FSM with 4 states.

The initial state of each cache line at insertion can change the feature of the policy:

  • 0 indicates Recency Friendly
  • 2 indicates Scan Resistant
  • 3 indicates Thrash Resistant

  1. The purpose of MRU is to solve the thrashing. So it keeps the LRU lines for further fetching. ↩︎

  2. The EELRU will make the decision via monitoring the distribution of hit. If the distribution of hits is monotonically decreasing, evict LRU lines. If the distribution of hits is higher in the near LRU region than the early region (e-th region) in a roughly cyclic pattern that is larger than main memory, evict e-th LRU lines. ↩︎

  3. The SLRU use two set of cache lines: probationary and protected. If a cache line in protected segment is evicted, it will be moved to MRU in probationary segment. If a cache line in probationary segment is evicted, it will be evicted from the cache. ↩︎

  4. Otherwise, insert new cache line in LRU. ↩︎

  5. A generalization of RRIP. The policy assign protecting distance (PD) to new cache lines to protect them not being evicted within the remaining PD (RPD). The PD is predicted via reuse distance distribution (RDD) which can be computed offline for different workloads. At runtime, the PD is recomputed infrequently using a small special-purpose processor. ↩︎

  6. The policy use the concept of an Insertion/Promotion Vector (IPV) for insertion and promotion. In particular, the IPV will be a vector of k+1 length for a k-way set-associative cache. All its values are within [0, k-1]. The IPV specifies a block’s new position in the recency stack both when it is inserted into the cache and when it is promoted from a different position in the recency stack. ↩︎

  7. The policy try to emulate future lookahead in Belady’s policy with the help of an auxiliary cache, called the Shepherd Cache. In particular, the cache is logically divided into two components: (1) the Main Cache (MC) which emulates optimal replacement, and (2) the SC which uses a simple FIFO replacement policy. New lines are initially buffered in the SC, and the decision to replace a candidate from the MC is deferred until the new line leaves the SC. While the new line is in the SC, information is gathered about the reuse order of replacement candidates in the MC. For example, candidates that are reused earlier become less likely candidates for eviction since Belady’s policy evicts the lines that is reused furthest in the future. When the new line is evicted from the SC (due to other insertions in the SC), a replacement candidate is chosen by either picking a candidate from the MC that hasn’t been reused within the lookahead window, or the candidate that was reused last; if all lines in the MC were reused before the SC line was reused, then the SC line replaces itself. ↩︎

  8. The policy use a new metric called Combined Recency and Frequency (CRF). It weighs the relative contribution of each reference by a weighing function. In particular, LRFU computes for each block a CRF value, which is the sum of the weighing function $F(x)$ for each past reference, where $x$ is the distance of the past reference from the current time. The weight function is $F(x)=(\frac{1}{p})^{\lambda x}$, where $\lambda$ is an empirically chosen parameter. ↩︎