Now that we've seen how indirection can
help us with virtual memory, let's see
how virtual memory is used as a tool for
caching. It's a different kind of cache
than we've seen before, but a cache
nonetheless.
So, let's think about where virtual
memory is placed in the memory hierarchy.
Virtual memory is a contiguous array in
memory of two to the n bytes, where
little n is the number of address bits
we have in our architecture. So two to
the 64 in our case. That's a very large
amount of space: 16 exabytes is the
value of big N here.
The physical main memory, the dynamic
RAM in our computer system, you can now
think of as a cache for that giant
virtual memory array. And that's going
to be a much smaller amount of memory;
as we mentioned before, it might be on
the order of a few gigabytes.
So we're going to arrange this virtual
memory into blocks, just like we did in
the cache between main memory and the
CPU. In this case the blocks are called
pages. It's really the same concept,
just slightly different terminology;
these were different communities that
came up with these terms. A page is
going to be of a certain size: capital P,
or two to the little p bytes, where
little p is the log of big P. Typically
the size of these blocks, or pages, is
on the order of kilobytes, rather than
the few dozen bytes we've seen before.
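The page-size arithmetic above can be sketched in a few lines. The values here are illustrative (a 64-bit address space with 4 KB pages):

```python
# A sketch of the page-size arithmetic described above, with example values.
n = 64          # number of virtual address bits in the architecture
p = 12          # little p: log2 of the page size
P = 2 ** p      # big P: page size in bytes (4 KB here)

# The 2**n-byte virtual address space divides into 2**(n - p) virtual pages.
num_virtual_pages = 2 ** (n - p)

print(P)                # 4096
```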
Okay, so looking at virtual memory as
this big array of space, 16 exabytes,
we're not going to use all of it. Many
of the pieces of virtual memory, these
virtual pages, may be unallocated. In
other words, just space we could address,
but there's nothing there and we don't
really care about it. Other parts are
going to be cached in physical memory at
a particular location, and that's the
mapping we need to keep track of in our
memory management unit: the virtual-page
to physical-page mapping.
And you notice, of course, that these
pages could go anywhere in physical
memory where there's an empty page. Some
of the blocks, like these shown in gray,
are uncached. In other words, they are
space we're using in virtual memory, but
we haven't gotten it into physical
memory yet: we haven't found a place to
put it, or we haven't needed it yet. So
it's still sitting there on disk; we
haven't yet moved it to the dynamic RAM,
the physical memory of the computer
system.
Alright, now why do we bother with this
kind of caching? Well, for the same
exact reason we did before. Remember,
before we had the problem of going from
main memory to the registers, the fast
memory inside the CPU. The problem was
that we wanted to keep the stuff we need
most here at the CPU, in as fast a
memory as possible. So we created these
layers of caches: we talked about a
level-one and a level-two cache, a data
cache and an instruction cache, that
make it much faster to access parts of
memory and get them into our registers,
if we can keep the right things in the
cache.
Now in this case, what we're talking
about with virtual memory is this much
larger disk out here, which potentially
holds our entire virtual memory, and how
it maps into main memory. So we're
caching parts of the disk in main
memory. That's the page we're copying
from the disk into main memory, making
it faster to access, because the disk is
much, much slower. It's exactly the same
situation as when we had a smaller block
of main memory that we were copying into
the L2 cache, or even into the L1 cache;
that was another kind of mapping we were
keeping track of before. So we have two
levels of caching here, and virtual
memory deals with the main-memory-to-disk
level.
Now, in the past, when we saw our memory
cache, we had a miss penalty of about 33
times a regular access. That was a big
penalty when we had to go to main memory
rather than finding the result in our
cache. The reason was that these caches
are often on the same chip as the CPU,
so they're much quicker to get to than
going off-chip to the larger main
memory. The situation between main
memory and disk is actually even much
worse: the difference is 10,000x. Disks
are way, way slower; of course they can
hold much, much more data, so they're
very useful, but they are much slower
than even main memory. So in this case
we definitely want to be using some kind
of caching. And in fact we're going to
do things a little differently than we
did for the L1 and L2 caches, because of
that huge difference in miss latency.
Okay, so DRAM, as we said, is about 10x
slower than the static RAM inside the
CPU. Now we're talking about a disk that
is 10,000 times slower than the dynamic
RAM that makes up main memory. But it's
only slow for that very first byte; it's
much, much faster for the next bytes,
because we aren't going to move one byte
at a time. We're going to move an entire
block, an entire page, at a time. But
what effect does this difference in
speed, 10,000x instead of 10x, have on
the size of the block, on the
associativity we want in our cache, and
on whether we want it to be write-through
or write-back? Remember, those were all
decisions we had to make in our cache
organization.
Okay, so in the case of virtual memory,
we're going to see that our page size,
the block size, is much larger:
typically four to eight kilobytes, and
it could even be as high as megabytes.
Another thing we're going to do is make
the cache fully associative. In other
words, we're not going to worry about
direct-mapped caches or set-associative
caches; we'll just make it fully
associative, and that way we can place
any virtual page in any physical page.
This, of course, will require a more
sophisticated and larger mapping
function, different from those CPU
caches where we could use certain bits
to go to the right set and then do some
tag matching within it. We're going to
have to do it a little differently. But
we also get the option of much more
sophisticated replacement algorithms
because of that. We can decide where to
put any virtual page, so we can be
smarter about it and make sure we keep
the physical pages we're likely to use
again.
And then the last thing is that we're
going to use write-back rather than
write-through, because the block size is
so large. Write-through would be very
expensive: every time we wrote one word
in that memory, we would have to write
the entire block back, and that would
start to add up and be a little too
expensive. So instead, we're going to
wait until we need to push that block
out of the cache, out of our main
memory, and then write back the whole
thing. So we'll be using write-back
rather than write-through for virtual
memory.
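The write-back idea can be sketched like this. The `Page` class and function names here are illustrative, not a real memory system; the point is that a dirty bit defers the expensive disk write until eviction:

```python
# Minimal write-back sketch: a page is flushed to disk only when evicted,
# and only if it was modified (dirty) while cached in main memory.

class Page:
    def __init__(self):
        self.data = bytearray(4096)   # one 4 KB page of cached data
        self.dirty = False            # set when the cached copy is modified

def write_word(page, offset, value):
    page.data[offset] = value
    page.dirty = True                 # record the change; no disk traffic yet

def evict(page, disk, vpn):
    # Write-through would hit the disk on every store above;
    # write-back pays that cost once, here, and only if needed.
    if page.dirty:
        disk[vpn] = bytes(page.data)
    page.dirty = False

disk = {}
pg = Page()
write_word(pg, 0, 42)    # many cheap in-memory writes...
write_word(pg, 1, 7)
evict(pg, disk, vpn=3)   # ...one disk write, at eviction time
```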
Let's go back to our architecture
diagram, then, of how we will implement
this mapping. Remember, we said the CPU
will generate virtual addresses that go
through a mapping function in our memory
management unit, which finds the
corresponding physical address and then
accesses main memory. So how do we
perform this virtual-to-physical address
translation? That's our next subtopic.
The way we're going to do it is using a
construct called a page table. A page
table consists of an array of page table
entries that do this mapping of virtual
page to physical page.
So what we're going to use is our
virtual address, or part of it, to index
this table and find the right page table
entry. That entry will contain the
physical page number telling us where to
go in main memory, or tell us where on
the disk we can find that page, so we
can go get it, bring it into main
memory, and find a place for it. So this
is a table that lives in another part of
memory, a part that we're going to try
to keep around in our physical memory so
we don't have to go find it on disk as
well. We'll see later how to take care
of that. For now, think of the page
table as the directory where we look up
the mapping between virtual and physical
addresses.
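A toy page table entry, following the description above, might look like this. The field names are illustrative, not any real operating system's layout:

```python
# A toy page-table entry: a valid bit plus either a physical page number
# (if the page is cached in main memory) or a disk location (if it is not).

class PTE:
    def __init__(self, valid=False, ppn=None, disk_addr=None):
        self.valid = valid          # True = page resident in physical memory
        self.ppn = ppn              # physical page number, when valid
        self.disk_addr = disk_addr  # where the page lives on disk, when not

page_table = [
    PTE(valid=True, ppn=2),             # VP 0: cached in physical page 2
    PTE(valid=False, disk_addr=0x1000), # VP 1: allocated, but only on disk
    PTE(valid=False),                   # VP 2: unallocated
]
```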
Okay.
So let's say we have these seven virtual
pages on our disk, and we have space for
four physical pages in our physical
memory. What we want to do is create
this mapping. You see here that only
four of the virtual pages are resident
in memory; that's all that could fit. So
four of these page table entries have
actual physical addresses that go to the
physical memory. Then there are another
two pages that we are using but that are
not resident in physical memory; they're
simply on the disk. And then there are
another two page table entries that are
not allocated, not used for anything,
and we'll never have to worry about
those, but they fill up part of our page
table.
Okay, so here again, those two
gray-colored pages have addresses to
disk rather than to the physical memory.
Those addresses are of course of a very
different structure: they're not 64-bit
addresses; they have to index a position
on the disk, and that's done in a very
different way that you would see in an
operating systems course, for example.
So how many page tables are there in our
computer system? Well, let's think about
that. We're going to need a page table
for every virtual memory, and we know
that every process requires a virtual
memory. So there will be one page table
for every single process.
So let's go look at that address
translation again. We're starting here
at the top with a virtual address, and
we're going to use part of that address,
the virtual page number, to index our
page table. Now, why are we using only
part of the address? Because another
part of the address, just like with
regular caches, is going to be an offset
within a block, or within a page. We
call that the VPO, the virtual page
offset. So we don't need those bits;
we're going to use the virtual page
number, the high-order bits, to index
into the page table.
There we will read the valid bit, which
tells us whether this page is already in
physical memory or not, and whether it's
allocated or not. If the page is not in
memory, it causes what's called a page
fault, the equivalent of a cache miss,
which tells us, hey, you have to figure
out a way to get this page into memory
first and then try again. But if that
entry is valid, then we will see in the
page table a physical page number, which
becomes the part of the physical address
that we prepend to the physical page
offset. The physical page offset is just
the virtual page offset coming straight
down, because of course that offset is
within a block, and we're doing things a
block at a time. We're not cutting up
blocks or anything like that, so the
offset stays the same.
Okay, so that's the basic address
translation story.
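That translation can be sketched in a few lines of Python. The page size and the toy mapping here are made-up values; the point is the bit-splitting and the unchanged offset:

```python
# Sketch of VPN/VPO translation: split the virtual address, look up the
# physical page number, and glue the unchanged offset back on.

p = 12                          # page offset bits, so 4 KB pages (assumption)
page_table = {0x3: 0x7}         # toy mapping: VPN 0x3 -> PPN 0x7

def translate(va):
    vpn = va >> p               # high-order bits index the page table
    vpo = va & ((1 << p) - 1)   # low-order bits: offset within the page
    ppn = page_table[vpn]       # a missing entry here would be a page fault
    return (ppn << p) | vpo     # PPO == VPO: the offset comes straight down

print(hex(translate(0x3ABC)))   # 0x7abc
```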
Now, where do we go find this page
table, to look up that virtual page
number? Well, to do that, we're going to
use what's called a page table base
register. This is a special register
that the CPU maintains for a process.
The operating system, the runtime
system, puts into it the location where
this page table is resident in memory.
And for each computer system and
operating system combination, there's
usually a standard place in memory where
that occurs. Now, the important thing to
understand is that all of this is done
by the memory management unit, in
hardware. It doesn't require any
software systems or any knowledge of the
program; this is all going on in the
computer system automatically.
Alright, let's see what happens on a
page hit. On a page hit, we use our
virtual address to index the page table.
We find that the entry is valid, and it
provides us with an address in physical
memory; we now know where that page is
and can access it using physical
addresses. Pretty straightforward. Note
that we've had to do one access of
memory to get this page table entry, and
then a second access to actually get the
data we want at that physical address.
A page fault occurs when we access a
page whose entry is not valid, and
therefore its address is potentially an
address on the disk. Or it could be an
unallocated page, in which case we have
a more serious problem: we're trying to
access a part of memory we haven't
allocated at all. But let's say it is
allocated memory, and the page is still
on the disk, for example, not yet in
memory. So what happens in this case?
Here we'll have to take that page and
move it somewhere in physical memory.
But wait a minute, where do we move it
if physical memory is entirely occupied?
Well, we're going to have to decide
which of those virtual pages to push out
of main memory. And of course that could
create a future page fault later, if we
try to access the page we just evicted
again. So we have to choose carefully
which page to push out; that's the
replacement algorithm for this cache.
The way this looks in the execution of
the process is that the process will be
coming along and eventually hit an
instruction that accesses memory in a
way that causes a page fault. So we get
an exception. We have to make room for
that page and load it into memory, and
then we're going to go back to that
instruction and execute it again. But
the situation will be different now:
that page will be in physical memory,
the page table entry will have been
modified, and we'll get a page hit and
can proceed normally through the rest of
our process. So that's the basic
situation with a page fault.
Alright, the key things to remember are
that the page fault handler has to load
the page into physical memory (the
handler is a bit of code in the
operating system that knows how to
replace pages in physical memory), and
that we then return to the faulting
instruction, in this case executing that
move instruction yet again. This time,
on the second time through, we'll be
successful, because we've moved the page
to memory.
Let's look at that again in finer
detail. Our virtual address goes to the
page table, to the corresponding page
table entry, which finds that our page
is invalid and resident on disk. This
causes the page fault, the exception,
and now our operating system kicks in
and figures out where to get that page
on the disk, bring it into main memory,
and find a place to put it. So the first
thing it has to do is select a victim
page to evict. In this case, it's going
to pick that last page in physical
memory, virtual page number four, and
remove it from the physical memory.
How does it remove it from memory? Well,
it doesn't really remove it; it simply
changes the page table entry for virtual
page four so it no longer points to that
location. Instead, it has virtual page
three point to that physical location,
and copies that page from the disk to
the physical memory. And now you'll see
that virtual page four just points to
its copy on disk, which may have been
written there from main memory if that
page had to be written back, if it had
been modified by some write instructions
before then.
Okay, so at this point we can execute
the instruction again. We leave the
operating system and just re-execute the
instruction that caused the page fault
in the first place. This time, of
course, it will be a hit; it will find a
valid page, because we made those
changes to the page table entry and
updated our physical memory and virtual
memory on disk.
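The fault-handling sequence just described can be sketched as follows. The data structures and the victim-selection policy here are illustrative (a real handler would use a smarter replacement algorithm):

```python
# Sketch of a page fault: pick a victim, write it back if dirty, update
# both page-table entries, and copy the faulting page in from disk.

def handle_page_fault(vpn, page_table, phys_mem, disk):
    # Toy policy: evict the first resident page we find.
    victim_vpn = next(v for v, e in page_table.items() if e["valid"])
    victim = page_table[victim_vpn]
    if victim["dirty"]:
        disk[victim_vpn] = phys_mem[victim["ppn"]]  # write back if modified
    victim["valid"] = False          # victim now lives only on disk

    entry = page_table[vpn]
    phys_mem[victim["ppn"]] = disk[vpn]             # copy faulting page in
    entry.update(valid=True, ppn=victim["ppn"], dirty=False)

# VP 4 is resident (and dirty); VP 3 is on disk, as in the example above.
page_table = {
    4: {"valid": True, "ppn": 0, "dirty": True},
    3: {"valid": False, "ppn": None, "dirty": False},
}
phys_mem = {0: b"old page"}
disk = {3: b"new page"}
handle_page_fault(3, page_table, phys_mem, disk)
```

After the call, VP 3 occupies physical page 0 and VP 4's modified contents have been written back to disk, so re-executing the faulting instruction now hits.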
Okay, so why does this work? It works
for the same reason CPU caches work:
because of locality. If we're accessing
a chunk of memory, chances are we're
going to access other things nearby, in
other words, addresses near the one
we're initially accessing. This is the
same reason those L1 and L2 caches work
between the CPU and main memory.
The set of virtual pages that a program
is actively accessing at any point in
time, the pages it's using within any
short window of time, is called the
working set. In other words, those are
the pages that should be resident in
memory together, because the program is
quickly switching between accessing
different ones. We want that working
set, for a typical program, to be much
smaller than main memory, so that we can
fit the whole working set in main memory
at once. This gives us good performance,
because it means our program, our
process, can run for a while without
causing page faults; it finds all of the
pages it needs in the main memory. And
that's our goal.
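One rough way to picture a working set is as the distinct pages touched within a recent window of accesses. The trace and window size here are made up for illustration:

```python
# Working set, informally: the distinct pages a program references
# within its last `window` memory accesses.

def working_set(access_trace, window):
    return set(access_trace[-window:])

# A loop sweeping back and forth over one small array touches only a
# couple of distinct pages, so its working set fits easily in memory.
trace = [0, 1, 0, 1, 0, 1, 0, 1]
print(len(working_set(trace, 6)))   # 2
```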
However, of course, sometimes we have
many processes running, and all of their
working sets can't be in memory at the
same time; their total working set is
often larger than main memory. And as we
add more processes, as we try to run
more programs at the same time, things
slow down. Why do things slow down? This
happens on your machines all the time:
you have too many windows open, too many
different programs running, and things
seem to get sluggish. The reason they
get sluggish is a phenomenon called
thrashing.
One process tries to access a page that
has to be brought in from disk. That
evicts another page out of memory, one
that happens to be something another
running process needs. So as soon as
that process gets a turn, it goes and
gets its page back from disk and brings
it into main memory, and then the next
process goes and gets another page from
disk. If there are too many of these
running at the same time, in other
words, if the working sets don't all fit
in main memory, then we're constantly
juggling: what's in main memory? What's
on the disk? What's in main memory?
What's on the disk? We end up swapping
and copying pages all the time, and as
we do that more and more, we get this
performance meltdown called "thrashing."