Review V: Principles of virtual memory

V1: [1] [2] [3] [4] // V2: [1] [2] [3] [4] [5] [6] // V3: [1] [2] [3] // V4: [1] [2]

Problem V1.1

a.	Explain how a CPU supporting virtual memory would translate a virtual memory address into the actual memory address, assuming that the requested page is in memory.
b.	Explain what the CPU does when the requested page is not in memory.

a.	It splits the virtual memory address into two parts, a page index and an offset. The CPU looks into the page table, stored in RAM, to get the page table entry for the specified page index. This page table entry tells which page frame contains the page, and the CPU looks into this page frame at the offset to access the requested data.
b.	It raises a page fault, which transfers control to the operating system's exception handler. The OS will then presumably load the requested page into some page frame and return from the exception handler. On return, the CPU will try the memory access again.

Problem V1.2

When a program accesses a virtual memory address, each of the following could possibly occur. For each, write “CPU” if it would be performed by the CPU while executing the instruction, and write “OS” if it would be performed by the operating system's interrupt handler for page faults.

a. Translating the virtual memory address to a DRAM address.

b. Marking a page as recently referenced.

c. For a page that isn't in DRAM, determining which existing page in DRAM should be removed to make room for it.

d. Loading a page off of disk into DRAM.

e. Updating the page table to reflect that the loaded page is now in DRAM.

f. If the program's instruction is a store, marking its page table entry to reflect that the page is dirty.

The CPU handles (a.), (b.), and (f.). The OS handles (c.), (d.), and (e.).

Problem V1.3

Describe at least three elements of a typical page table entry.

Here are four:

1.	The valid bit, saying whether the page is currently in memory.
2.	The location of the page in memory.
3.	The dirty bit, saying whether the page in memory has been changed.
4.	The referenced bit, saying whether the page in memory has been accessed recently.

Problem V1.4

Describe the purpose of the dirty bit found in each page table entry in most virtual memory systems. How does the use of the dirty bit help the system's performance?

The OS resets the dirty bit to zero each time it loads a page into RAM, and the CPU sets the dirty bit to one each time an instruction modifies the information in the page. As a result, the dirty bit keeps track of whether the page has changed since it was loaded into RAM.

If there were no dirty bit, then when the OS decided to eject the page from RAM, it would have to assume that the page were changed, and so it would have to save the page onto disk before the RAM becomes available for other purposes. The time to save a page onto disk increases the expense of ejecting a page significantly. With the dirty bit, the OS can avoid this expense when possible.

Problem V2.1

Recall that most computing systems use some variant of the Clock page replacement algorithm for virtual memory, and that the Clock algorithm is meant to approximate the LRU (Least Recently Used) algorithm.

a. Describe the memory usage patterns of real-world processes that lead LRU to be a good page replacement algorithm.

b. Explain why computing systems rarely use LRU for virtual memory.

a. Processes tend to use a small set of pages at any time, though the set of frequently-used pages migrates over time. This is served well by LRU, since the set of frequently-used pages tend always to be recently used, whereas pages that were formerly used heavily but are no longer eventually fall out of favor by LRU.

b. LRU requires that the page table be updated with every memory access to reflect that the page should now be remembered as the most recently used. This update is an additional memory access, and we would like to avoid memory accesses for efficiency's sake.

Problem V2.2

Suppose we were using a virtual memory system with three page frames, using the Clock algorithm as our page replacement policy. For the sequence of page accesses 1-2-3-4-2-5-6, identify the contents of the three page frames after each memory access.

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3

[The quote marks and period just illustrate the algorithm's process; they are not requested in the question.]

page	frame
ref	1	2	3
1	1'.
2	1'	2'.
3	1'	2'	3'.
4	4'.	2	3
2	4'	2'	3
5	4'	2	5'.
6	4	6'.	5'

Problem V2.3

Suppose we were using a virtual memory system with three page frames, and our program accesses the pages in the order 1–2–3–1–4–2–5–3. For each of the algorithms FIFO, LRU, and CLOCK, identify the contents of the three page frames after each memory access.

FIFO

LRU

CLOCK

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3

FIFO

LRU

CLOCK

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3
1	1	2	3
4	4	2	3
2	4	2	3
5	4	5	3
3	4	5	3

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3
1	1	2	3
4	1	4	3
2	1	4	2
5	5	4	2
3	5	3	2

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3
1	1	2	3
4	4	2	3
2	4	2	3
5	4	2	5
3	4	3	5

Problem V2.4

Suppose we were using a virtual memory system with three page frames, using the Clock algorithm as our page replacement policy. For the sequence of page accesses 1-2-3-4-2-5-2-6-3, identify the contents of the three page frames after each memory access.

page	frame
ref	1	2	3
1	1
2	1	2
3	1	2	3

[The quote marks and period just illustrate the algorithm's process; they are not requested in the question.]

page	frame
ref	1	2	3
1	1'.
2	1'	2'.
3	1'	2'	3'.
4	4'.	2	3
2	4'.	2'	3
5	4'	2	5'.
2	4'	2'	5'.
6	6'.	2	5
3	6'	3'.	5

Problem V2.5

Suppose our virtual memory system has four page frames and uses the Clock algorithm as its page replacement policy. For the sequence of page accesses 1–2–3–4–1–5–3–6–7–6–8, identify the contents of the four page frames after each memory access.

page	frame
ref	1	2	3	4
1	1
2	1	2
3	1	2	3
4	1	2	3	4

[The quote marks and period just illustrate the algorithm's process; they are not requested in the question.]

page	frame
ref	1	2	3	4
1	1'.
2	1'	2'.
3	1'	2'	3'.
4	1'	2'	3'	4'.
1	1'	2'	3'	4'.
5	5'.	2	3	4
3	5'.	2	3'	4
6	5'	6'.	3'	4
7	5'	6'	3	7'.
6	5'	6'	3	7'.
8	5	6	8'.	7'

Problem V2.6

Suppose our virtual memory system has four page frames and uses the Clock algorithm as its page replacement policy. For the sequence of page accesses 1–2–3–4–2–5–3–1–6–7 identify the contents of the four page frames after each memory access.

page	frame
ref	1	2	3	4
1	1
2	1	2
3	1	2	3
4	1	2	3	4

page	frame
ref	1	2	3	4
1	1'.
2	1'	2'.
3	1'	2'	3'.
4	1'	2'	3'	4'.
2	1'	2'	3'	4'.
5	5'.	2	3	4
3	5'.	2	3'	4
1	5'	1'.	3'	4
6	5'	1'	3	6'.
7	5	1	7'.	6'

Problem V3.1

Explain what a translation lookaside buffer (TLB) is, and explain why it is important for reducing the amount of time for accessing memory.

A straightforward implementation of virtual memory places the page table in memory. Unfortunately, this means that each attempt to access a memory location actually requires two memory accesses — one to look up the page table entry in virtual memory, and one to access the requested memory within the page frame given by that entry. This effectively halves the time to access each page, when compared to a system that does not use virtual memory.

Designers reduce this problem dramatically by caching a small number of frequently-used page table entries on the CPU chip, in a portion of the chip called the translation lookaside buffer, or TLB. Since accessing data stored on the CPU chip is much faster than accessing data in memory, the TLB can dramatically reduce the time for looking up page table entries, restoring a access speed more comparable to a system without virtual memory.

Problem V3.2

Suppose we have a system with a TLB and a single level of cache for regular data. The TLB miss rate is 1%, while the cache miss rate is 5%. (Assume that these probabilities are independent of one another.) Accessing the TLB or cache takes one clock cycle, whereas accessing DRAM takes 30 clock cycles.

a. What will be the average time to access a memory value addressed using virtual memory?

b. If there were no TLB at all, what would be the average time to access memory?

a. (1 + 0.01 * 30) + (1 + 0.05 * 30) = 3.8 cycles.

b. 30 + (1 + 0.05 * 30) = 32.5 cycles.

Problem V3.3

Explain the concept and purpose of page directories.

A page directory is a table of pointers to page tables for regions of virtual memory. A CPU uses the page directory instead of a single page table, because a single page table can become very large and it must be in memory in order to allow for address translation. With a page directory, the partial page tables can be paged in virtual memory. (The page directory must occupy RAM for address translation, but the page directory would be much smaller than a single exhaustive page table.) Moreover, page tables for unused regions of memory need not occupy even virtual memory.

Problem V4.1

Describe at least two advantages of a virtual memory system over using the actual DRAM addresses.

It provides a larger range of memory address beyond what actual memory the computer holds.
It allows each process to have its own address space.
It enables two processes to share the same memory easily.
It enables memory to be moved between addresses rapidly, since altering the paging table moves the page without any actual memory copying going on.

Problem V4.2

Recall that Linux's fork system call involves creating a new process whose memory is a complete copy of the process initiating the system call. However, copying many megabytes of memory from one location in RAM to another can take quite a bit of time; this cost is particularly wasteful when a program (like a shell) promptly uses execvp after fork to replace all of the duplicated memory with a program loaded from disk. Explain how this cost can be alleviated (or at least deferred) using virtual memory.

The OS can just duplicate the process's page table, marking all pages in both processes as being read-only. When either process attempts to alter memory, the CPU will trigger a write-protection fault, and at that time the OS can duplicate the page and mark each copy as writable. Thus, the copying will only occur as pages are modified; and if the child process's memory is quickly replaced (as with execvp), it will never need to be copied.