4

6

## Memory Management Part II

- · Virtual memory
- Paging and page replacement
- · Modeling page replacement algorithms
- · Design issues for paging systems
- Implementation issues
- Segmentation

Operating Systems

## Virtual Memory (VM)

- Basic idea
- Paging
- Demand Paging and its performance
- Page Replacement and Page-Replacement Algorithms
- Allocation of Frames
- Thrashing problem
- Segmentation: pure segmentation, Segmentation with paging

Operating Systems

## Basic Idea of Virtual Memory

- Basic idea behind the virtual memory is that the combined size of programs, data, and stacks may easily exceed the amount of physical memory available.
- · This is more true in multiprogramming environment.
- The operating system uses physical memory together with secondary storage to solve this problem.
- Virtual Memory seems to be a state of art universal method. Processes are allocated physical memory at a point they need based on the availability.

Operating Systems

3

5

## Page Faulting

- During the execution, a reference to virtual address is first checked if it has already been mapped to a physical address.
- If not it is said ta cause a "page fault".

#### **Paging Concept in Virtual Memory**

- Logical memory is divided into blocks of same size called **pages**.
- Physical memory is divided into fixed-sized blocks called **frames** (size is power of 2, between 512 bytes and 8192 bytes).
- Pages are needed to be mapped to frames during the execution.
- To run a program of size n pages, need to use *n* frames.
- OS need to manage the page-to-frame mapping dynamically.

Operating Systems

## Page Fault Algorithm-1

- If there is a reference to a page (every address reference is a reference to a page),
  - It will cause a trap to OS, which will in turn cause "page fault" processing.
- Page fault processing:
  - Find an empty frame
  - Swap page into that frame
  - Reset page's validation bit.
  - Give control back to the process causing the trap (restart instruction)

10

## Page Fault-2

- If there is no free page, OS starts page replacement procedure:
  - Page replacement: find some page in the memory, the one which is not really in use, to replace.
  - Conduct the swap operations.
  - For performance reasons the algorithm used should result in minimum number of page faults.
  - It is possible that the same page is brought into memory several times, during execution of a program...

Operating Systems

7

9

11

## Address Translation-1

- The virtual address generated during the execution is composed of two parts:
  - page number (p)
    - used as an index into a *page table* which contains **base address** of the page in physical memory (**frame**).
  - Page offset (d)
    - combined with base address to define the physical memory address to be referenced.

Operating Systems

## Address Translation-2

- The base address is the address of the frame to which the page is mapped.
  - if the virtual page is already in the memory, the address mapping is straight forward and very efficient as it is done in firmware.
- If the page is not in the memory, first a **page fault** occurs, after which the address mapping is as before.

Operating Systems

The CPU sends virtual addresses to the MMU package CPU Memory Memory Disk controller The MMU sends physical addresses to the memory

The position and function of the MMU

## Page-frame mapping: Example



Operating Systems



## MMU address mapping



Internal operation of MEMU with 164 KB pages 12

#### Virtual address and MMU-Memory Management Unit

## Implementation of Page Table

- Page table is kept in main memory.
- Page-table base register (PTBR) points to the page table.
- *Page-table length register* (PRLR) indicates size of the page table.
- In this scheme every data/instruction access requires two memory accesses. One for the page table and one for the data/instruction.
- The two memory access problem can be solved by the use of a special fast-lookup hardware cache called *associative registers* or *translation look-aside buffers* (*TLBs*)

Operating Systems

## Associative Register

• Associative registers - parallel search

- Address translation (A', A'')
  - If A' is in associative register, get frame #.
  - Otherwise get frame # from page table in the main memory

| Operating Systems |    |
|-------------------|----|
|                   | 14 |

## Associative Register

- *Valid-invalid* bit attached to each entry in the page table:
  - "valid" indicates that the associated page is in the process' logical address space, and is thus a legal page.
  - "invalid" indicates that the page is not in the process' logical address space.

## Effective Access Time

- Associative Lookup =  $\varepsilon$  time unit
- Assume memory cycle time is 1 microsecond
- Hit ratio percentage of times that a page number is found in the associative memory =  $\alpha$
- Effective Access Time (EAT) EAT = f(PageTableAccessTime,MemoryAccessTime)

 $EAT = (1 + \varepsilon) \alpha + (2 + \varepsilon)(1 - \alpha)$  $= 2 + \varepsilon - \alpha$ 

Operating Systems

13

15

17

### Multilevel Paging and Performance

Operating Systems

- Since each level is stored as a separate table in memory, mapping a logical address to a physical one may take four memory accesses.
- Even though time needed for one memory access is theoretically four times as much, caching permits performance to remain reasonable.

Two-level page tables



18

#### Multilevel Paging and Performance: Example

- Cache hit rate of 98 percent yields: mem.access=100 nsec, cache.access=20 nsec, effective access time = 0.98 x (100+20) + 0.02 x (400+20)
- = 126 nanoseconds.

which is only a 26 percent slowdown in memory access time.

Operating Systems

## Page Tables Entry Format



#### Typical page table entry

| C | Operating Systems | 20 |
|---|-------------------|----|
|   |                   |    |

#### TLBs - Translation Lookaside Buffers

| Valid | Virtual page | Modified | Protection | Page frame |
|-------|--------------|----------|------------|------------|
| 1     | 140          | 1        | RW         | 31         |
| 1     | 20           | 0        | RX         | 38         |
| 1     | 130          | 1        | RW         | 29         |
| 1     | 129          | 1        | RW         | 62         |
| 1     | 19           | 0        | RX         | 50         |
| 1     | 21           | 0        | RX         | 45         |
| 1     | 860          | 1        | RW         | 14         |
| 1     | 861          | 1        | RW         | 75         |

A TLB to speed up paging

Operating Systems

Inverted Page Table Architecture

## Inverted Page Table

- One entry for each real page of memory.
- Entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page.
- Decreases memory needed to store each page table, but increases time needed to search the table when a page reference occurs.
- Use hash table to limit the search to one or at most a few page-table entries.



# 

Operating Systems

23

19

21



Comparison of a traditional page table with an inverted page table Operating Systems 24

28

## Shared Pages

- · Shared code
  - One copy of read-only (reentrant) code shared among processes (i.e., text editors, compilers, window systems).
  - Shared code must appear in same location in the logical address space of all processes.
- · Private code and data
  - Each process keeps a separate copy of the code and data.
  - The pages for the private code and data can appear anywhere in the logical address space.

Operating Systems

#### Shared Pages Example: Editor is shared



#### **Demand Paging** · It means bringing a page into memory only when it is needed. - Less I/O needed - Less memory needed - Faster response More users **DEMAND PAGING** • Page is needed ⇒ reference to it - invalid reference $\Rightarrow$ abort not-in-memory ⇒ bring to memory · Page replacement - find some page in memory, but not really in use, swap it out: algorithm performance - want an algorithm which will result in minimum number of page faults. · Same page may be brought into memory several times. Operating Systems Operating Systems 27

25

## Performance of Demand Paging

#### • Page Fault Rate $0 \le p \le 1.0$

- if p = 0 no page faults
- if p = 1, every reference is a fault
- Effective Access Time (EAT):
- EAT = (1 p) \*memory access time
- + p\*(page fault overhead+ swap page out
- + swap page in + restart overhead time)

#### Demand Paging: Example

#### Assumption: No associative memory

- Memory access time = 1 microsecond
- Page fault rate p=50%
- q (=50) % of the time the page that is being replaced has been modified and therefore needs to be swapped out.
   Swap Page Time = 10 msec = 10,000 microsec
- Computation of Effective Access Time:
  - EAT = (1 p) \* 2 + p (2+q\*10000+(1-q)\*20000))
  - =(1-0.5)\*2+0.5(2+0.5\*(10000)+(1-0.5)\*20000)
  - =1+1+2500+5000
  - =75002 micro sec

Operating Systems

29

Operating Systems

34

## Page Replacement Algorithms

- Page fault process:
  - Find the page to be removed or
  - make room for incoming page
- Modified page must first be saved before being overwritten
  - If unmodified can be overwritten
- Better not to choose an often or recently used page.
  - It will probably need to be brought back in soon!

Operating Systems

## Page Replacement Algorithms

- The real algorithms in use are generally combination of several approaches. The discussion of few of the ones listted below is more an academic one:
  - Optimal
  - Not recently used
  - FIFO

31

33

- Second Chance
- The Clock
- Least Recently Used

- Simulating LRU

#### **Optimal Page Replacement Algorithms**

- Replace the page needed at the farthest point in the future
- Normally, given a proram, this is theoretically impossible to know. However it can be predicted by some means.
  - For example, logging page use on the previous runs of a process can be replayed for the future executions of the same program with same data...

Operating Systems

## FIFO Page Replacement Algorithm: Implementation

• Maintain a linked list of all pages

- in order of their coming into the memory

- When required, the page at beginning of list is replaced.
- Disadvantage
  - page in the memory the longest time, may be the most often used one! Yet it will be replaced, causing extensive page fault...

Operating Systems

35

#### Not Recently Used Page Replacement Algorithm: Implementation

Operating Systems

- Each page has Reference bit, Modified bit
  - R ad M bits are set when page is referenced and or modified
- They are classified according to the combination.
- 1. not referenced, not modified
- 2. not referenced, modified
- referenced, not modified
- 4 referenced, modified
- NRU removes page at random from lowest numbered non empty class

#### Second Chance Page Replacement Algorithm: number indicate the last reference times



- The candidate page to be replaced is the one at the tail of the list. The replacement depends on the reference bit.
- The page referenced, has its R bit set to 1. If a tail page has it is R bit set, it is reset and moved to the head.
   If the tail page has its R bit 0, it is replaced, R bit is set to 1 and moved to the head.
- If the fail page has its R bit 0, it is replaced, R bit is set to 1 and moved to the head.
   At page fault at time 20, <u>A</u> has R bit is reset: it is treated as if it has just been brought in
- At page fault at time 20, <u>A</u> has R bit is reset: it is treated as if it has just been brought in (numbers above pages are reference times), with R bit cleared. Operating Systems

40

## The Clock Page Replacement Algorithm



## Least Recently Used (LRU)

- It is based on the assumption that the pages used recently will be used again soon
- Write out the page that has been unused for the longest time
- Must keep a linked list of pages – most recently used at front, least at rear
  - update this list at every memory reference !!
- Alternatively keep counter in each page table entry - choose page with lowest value counter

Operating System:

- periodically zero the counter

## Least Recently Used (LRU):bit matrix

- LRU maintains an nxn bit matrix in hardware
- On page reference, the corresponding row is set to 1, column to 0
- At any instant the lowest value row is the LRU

Operating Systems

## n x n LRU in hardware



39

41

## Simulating LRU in Software (1)

- Hardware may not exist for the architecture
- · So a software solution is more practical
- Implementation of a counter that includes the effect of the aging
- Shift the counters right before the R bits are added on a reference
- This is repeated at every clock tick: each time reference bits are cleared, reference counters are shifted...

#### Operating Systems

## Simulating LRU in Software(2)



Operating Systems

#### The Working Set Page Replacement Algorithm (1)

- WS is the set of the pages that are currently in use
- WS page replacing algorithms make use of the locality of the reference
- If the entire WS of a process is in the memory, no page fault occurs until the process moves to a new locality.
- · The pages will remain in a WS as long as their reference ® bit is 1.
- Otherwise, the difference between timestamp and the current time is used to decide which one to push out of WS.

Operating Systems

43

45

w(k,t) • The working set is the set of pages used by the k most recent memory references • w(k,t) is the size of the working set at time, t Operating Systems

The Working Set Page Replacement Algorithm (2)

## WS based PRA

- Which page to be excluded from a WS, in case of a page • fault:
- · Remember that R and M bits are automatically set
- R bits are cleared periodically
- · Each entry in the page table has a time stamp field and an R bit together with other inf about the page...
- · On page fault, if the R bit is set, the clock time is written to that entry, otherwise it is candidate for eviction, based on the difference between the time stamp and the current time...

Operating Systems



#### The WSClock Page Replacement Algorithm(1)

- · Similar to clock replacement algorithm, the pages are arrange in a circular data structure with a pointer.
- The test is conducted from the pointer on ward, which may remove the page from the WS or keeps it as is, with R bit is set to 0,
- If it is referenced its time is updated.
- If R is 0 and it is not dirty and it is not within the time  $\tau$  evict it.  $\tau$  is the age of the page, relative to the current time (=currentTime-timestamp)...
- If it is dirty and outside the window it is put on hold for eviction... Operating Systems

47

#### **Review of Page Replacement Algorithms**

| Algorithm                  | Comment                                        |
|----------------------------|------------------------------------------------|
| Optimal                    | Not implementable, but useful as a benchmark   |
| NRU (Not Recently Used)    | Very crude                                     |
| FIFO (First-In, First-Out) | Might throw out important pages                |
| Second chance              | Big improvement over FIFO                      |
| Clock                      | Realistic                                      |
| LRU (Least Recently Used)  | Excellent, but difficult to implement exactly  |
| NFU (Not Frequently Used)  | Fairly crude approximation to LRU              |
| Aging                      | Efficient algorithm that approximates LRU well |
| Working set                | Somewhat expensive to implement                |
| WSClock                    | Good efficient algorithm                       |

Operating Systems

52

## Stack Algorithms: 4 frames memory



#### The Distance String in a stack algorithm

- Distance is the number of pages that the referenced page is far from the top of the column, including itself.
- The pages that are not in the memory yet are at a distance of infinity from the top of the table

Operating System:

## Predicting the page fault rates

Operating Systems

49

51

53

- Distance string can be used to predict the page fault rates.
- Given memory size in number of frames (m) and the distance string, one can compute the number of page faults (F).
- $F_m = \sum C_k + C_{\infty} \cdot k = m + 1,...n$
- Where m is memory size, n is the number of virtual pages, and F<sub>m</sub> is the number of page faults, C<sub>i</sub> is the frequency of number i in a distance string...



## Predicting the page fault rates



- Computation of page fault rate from distance string - the *C* vector for the memory size of 4.
  - the *F* vector for different memory sizes:1,2,3,4,5,6,7,8 Operating Systems

#### Belady's Anomaly in FIFO PRAs



#### Modeling Page Replacement Algorithms

- Belady's Anomaly in FIFO algorithms:
   More frames more faults!
- Stack algorithms: all other algorithms

   More frames less faults

58

60

## Design Issues for Paging Systems

Local versus Global Allocation Policies

- Local page replacement: the replacement is local to the process, ie., one of the pages of the process causing page fault is replaced.
- Global page replacement: the oldest page in the system is replaced

## **Design Issues for Paging Systems**

- Local vs Global allocation policies
- Thrashing is eliminated using load control, externally
- · Page size can be optimized
- · Separate instruction and data spaces
- · Cleaning policy
- Sharing memory pages

#### Local versus Global Allocation Policies (1)

Operating System:



c) Global page replacement

Operating Systems

#### Local versus Global Allocation Policies (2)

Operating Systems



- Page fault rate as a function of the number of page frames assigned
- A: high page fault rate, B: Low page fault rate
   Operating Systems

## Load Control

- Despite good designs, system may still thrash
- Solution :

Reduce number of processes competing for memory

- swap one or more to disk, divide up pages they held
- reconsider degree of multiprogramming

## Advantages of small page size vs disadvantages

#### • Advantages

- less internal fragmentation
- better fit for various data structures, code sections
- less unused program in memory
- Disadvantages
  - programs need many pages, larger page tables

Operating Systems

59

55

57

## Minimization of overhead: Optimum Page Size





• Two processes may share the same program related page table with separate data and related page tables
Operating Systems

## Further Reading

## Cleaning Policy: maintenance of an acceptable level of free frames

Separate Instruction and Data Spaces may be

preferred

Single address

- Need for a background process, paging daemon

   periodically inspects state of memory to maintain an acceptable level of available pages
- When too few frames are free
- selects pages to evict using a replacement algorithmImplementation example:
  - use same clock type page handling: two handed clock, front hand does cleaning, back hand does replacement..

Operating Systems

## Segmentation

65

63

Operating Systems

66

## Segmentation vs. Linear mempory

- So far virtual memory was linear, but frames were in ad-hoc positions
- For many problems two or more memory address spaces are more convenient.
- Compilers are good examples for this with many unrelated components such as text, symbol table, parse tree, etc.

Operating Systems

# Segmentation (1)



• With one-dimensional address space with growing tables, one table may bump into another

| Operating Systems |    |
|-------------------|----|
|                   | 68 |

Segmentation (2)



Allows each table to grow or shrink, independently

## Segmentation (3)

| Consideration                                                            | Paging                                                                                        | Segmentation                                                                                                                             |
|--------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------|
| Need the programmer be aware<br>that this technique is being used?       | No                                                                                            | Yes                                                                                                                                      |
| How many linear address<br>spaces are there?                             | 1                                                                                             | Many                                                                                                                                     |
| Can the total address space<br>exceed the size of physical<br>memory?    | Yes                                                                                           | Yes                                                                                                                                      |
| Can procedures and data be<br>distinguished and separately<br>protected? | No                                                                                            | Yes                                                                                                                                      |
| Can tables whose size fluctuates<br>be accommodated easily?              | No                                                                                            | Yes                                                                                                                                      |
| Is sharing of procedures<br>between users facilitated?                   | No                                                                                            | Yes                                                                                                                                      |
| Why was this technique<br>invented?                                      | To get a large<br>linear address<br>space without<br>having to buy<br>more physical<br>memory | To allow programs<br>and data to be broken<br>up into logically<br>independent address<br>spaces and to aid<br>sharing and<br>protection |

Comparison of paging and segmentation

## Implementation of Pure Segmentation



<sup>(</sup>a)-(d) Development of checkerboarding(e) Removal of the checkerboarding by compaction

## Segmentation with Paging: MULTICS (1)



- Descriptor segment points to page tables
- Segment descriptor numbers are field lengths
   Operating Systems
   72

67

## Segmentation with Paging: MULTICS (2)

- Segment page table may not be in the memory
- Page referenced may not be in the memory
- TLB is used to keep the address of the most recently used pages



#### A 34-bit MULTICS virtual address



75

77

## Segmentation with Paging: MULTICS (3)



## Segmentation with Paging: MULTICS (4)

| Is th<br>entr |     |              |               | Comparison<br>field |                   |
|---------------|-----|--------------|---------------|---------------------|-------------------|
| ļ             | Age | Protection   | Page<br>frame | Virtual<br>page     | Segment<br>number |
| 1             | 13  | Read/write   | 7             | 1                   | 4                 |
| 1             | 10  | Read only    | 2             | 0                   | 6                 |
| 1             | 2   | Read/write   | 1             | 3                   | 12                |
| 0             |     |              |               |                     |                   |
| 1             | 7   | Execute only | 0             | 1                   | 2                 |
| 1             | 9   | Execute only | 12            | 2                   | 2                 |

• Simplified version of the MULTICS TLB

Existence of 2 page sizes makes actual TLB more complicated
 Operating Systems

## Segmentation with Paging: Pentium (1)



#### A Pentium selector



## Segmentation with Paging: Pentium (2)



· Pentium code segment descriptor



78

# Selector Offset Descriptor Base address Limit Other fields 32-Bit linear address

Segmentation with Paging: Pentium (3)

Conversion of a (selector, offset) pair to a linear address Operating Systems

79

81

## Segmentation with Paging: Pentium (4)



Mapping of a linear address onto a physical address Operating Systems

## Segmentation with Paging: Pentium (5)

