LRU-K Buffer Eviction Algorithm

We know that when it comes to performance in any relational database, memory is king: you want to fit as much of your working set into memory as possible, because no matter how fast your disks are, they will never come close to the access speed of memory.

Ok then, given the above, how we evict pages from memory becomes extremely important from a performance viewpoint.

Today we will discuss the LRU-K algorithm, which MS SQL Server uses to decide which pages it evicts from the buffer pool; to be more precise, SQL Server uses LRU-2.

Let’s first look at the LRU (Least Recently Used) algorithm and why this is not used.

  1. It does not take into account how frequently a page is accessed, caring only about when it was last accessed. As a result, pages that are frequently accessed could end up getting kicked out of memory.
  2. It also leaves memory susceptible to what is called ‘Table Scan Flooding’: when a table scan is carried out on a large, infrequently accessed table, the LRU algorithm floods memory with those infrequently accessed pages and kicks out the frequently accessed ones (the sketch after this list shows the effect). Considering that more and more database environments are taking a hybrid approach, where the database is used for both OLTP and reporting, this is a big problem; one report run by a user during the day could have a very detrimental impact on the overall system.
  3. No thought is given to the type of page being evicted either. For example, an index page might be much more useful to keep in memory, as it is much more likely to be hit again than a data page.
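
To see the table scan flooding problem in action, here is a minimal sketch of a plain LRU buffer in Python. The page names, capacity and the use of an OrderedDict are purely illustrative; this is not how any real buffer pool is implemented.

```python
from collections import OrderedDict

class LRUBuffer:
    """A toy LRU buffer pool: the least recently used page is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()               # page_id -> page, oldest first

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)      # most recently used goes to the back
        else:
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)   # evict the least recently used page
            self.pages[page_id] = object()       # stand-in for real page data


buf = LRUBuffer(capacity=4)

# A hot working set that is touched over and over.
for _ in range(100):
    for hot_page in ("orders_1", "orders_2", "customers_1"):
        buf.access(hot_page)

# A single scan of a large, rarely used table...
for i in range(10):
    buf.access(f"audit_log_{i}")

# ...and the hot pages are gone; only the scan's pages remain.
print(list(buf.pages))
```

Because plain LRU only looks at the most recent access, the scan's pages all look "newer" than the hot set, so the hot set is the first thing to go.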

Right then, we have seen why the LRU algorithm is not a good choice for managing page eviction in a relational database, so let’s look at what SQL Server actually uses: the LRU-K algorithm, where K >= 2.

Essentially what this means is that for each page we keep track of the last two times it was accessed; this information is stored in the page header. From these an ‘interarrival time’ is calculated, which is simply the time between the two references, and pages with the shortest interarrival times are the ones kept in memory, while the eviction victim is the page whose second-to-last reference is furthest in the past.
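
To make that concrete, here is a rough sketch of the LRU-2 idea in Python: a per-page history of the last two access times, with the victim being the page whose second-to-last access is oldest. It is only an illustration of the idea described above, not how SQL Server actually implements its buffer pool, and the counter standing in for a clock is an assumption for the sake of the example.

```python
import itertools

clock = itertools.count()              # monotonically increasing stand-in for time

class LRU2Buffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.history = {}              # page_id -> (second_last_access, last_access)

    def access(self, page_id):
        now = next(clock)
        if page_id in self.history:
            _, last = self.history[page_id]
            self.history[page_id] = (last, now)    # shift the two-access history window
        else:
            if len(self.history) >= self.capacity:
                self.evict()
            self.history[page_id] = (None, now)    # only one reference so far

    def evict(self):
        def second_last(page_id):
            t, _ = self.history[page_id]
            # Pages referenced only once have no second-to-last access and
            # are the first candidates for eviction.
            return t if t is not None else float("-inf")

        victim = min(self.history, key=second_last)    # oldest second-to-last access
        del self.history[victim]
```

A page that is hit over and over has two recent timestamps (a short interarrival time), so it survives, while a page touched once by a scan never gets a second timestamp and is evicted first.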

Another point to note is what is meant by a reference, because there are different kinds. For example, take a page that is read in from disk and referenced within a transaction: should that page be evicted before the transaction completes, the chances are high that it will be referenced again within that transaction, and we would have to read it in from disk all over again. And if it is referenced twice within the same transaction, does that make it a very popular page, since it has two references very close together? Possibly not; once the transaction completes, that page may never be referenced again.

My point here is that the LRU-K algorithm must take all of this into account and look at references inside a transaction, transaction retries, etc.
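
The LRU-K paper handles exactly this with what it calls a ‘Correlated Reference Period’: repeat references to a page that arrive close together (for instance, within the same transaction) are collapsed and counted as a single reference. Here is a hedged sketch of that idea, reusing the `clock` and `LRU2Buffer` from the sketch above; the period length is an arbitrary value chosen purely for illustration.

```python
CORRELATED_PERIOD = 5    # arbitrary window, in the same "time" units as the clock

def access_uncorrelated(buf, page_id):
    """Like LRU2Buffer.access, but repeat references inside the correlated
    reference period do not count as new, independent references."""
    now = next(clock)
    if page_id in buf.history:
        second_last, last = buf.history[page_id]
        if now - last <= CORRELATED_PERIOD:
            # Same burst of activity (e.g. the same transaction): just extend
            # the last timestamp instead of recording a new reference.
            buf.history[page_id] = (second_last, now)
        else:
            buf.history[page_id] = (last, now)
    else:
        if len(buf.history) >= buf.capacity:
            buf.evict()
        buf.history[page_id] = (None, now)
```

With this in place, a page hammered ten times inside one transaction still only counts as one reference, so it does not look artificially popular once the transaction finishes.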

If you are interested in reading about the LRU-K algorithm in more detail, check out the paper that was written on it: LRU-K Algorithm

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post; test this out for yourself, we all can make mistakes! Happy coding.

