03 Virtual Memory Caches


Now that we've seen how indirection can help us with virtual memory, let's see how we use VM as a tool for caching. It's a different kind of cache than we've seen before, but a cache nonetheless. So let's think about where virtual memory sits in the memory hierarchy. Virtual memory is a contiguous array in memory of N = 2^n bytes, where little n is the number of address bits in our architecture — n = 64 in our case. That's a very large amount of space: 16 exabytes is the value of big N here. The physical main memory — the DRAM in our computer system — you can now think of as a cache for that giant virtual memory array. And that's going to be a much smaller amount of memory; as we mentioned before, it might be on the order of a few gigabytes. So we're going to arrange this virtual memory into blocks, just like we did with the caches between main memory and the CPU. In this case the blocks are called pages — it's really the same concept, just slightly different terminology, because different communities came up with these ideas. A page is going to be of a certain size, P = 2^p bytes, so little p is the log of big P. Typically the size of these blocks, or pages, is on the order of kilobytes, rather than the few dozen bytes we've seen before. Okay, so looking at virtual memory as this big array of 16 exabytes of space, we're not going to use all of it. Many of the pieces of virtual memory — these virtual pages — may be unallocated: in other words, space we could address, but where we don't have anything and don't really care about it. Other parts are going to be cached in physical memory at particular locations, and that mapping, which we need to keep track of, is what's going to be in our memory management unit.
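To make the sizing concrete, here is a minimal sketch of the arithmetic above — a 64-bit virtual address space carved into pages. The 4 KiB page size is just an assumed, typical value for illustration:

```python
# Basic virtual-memory sizing arithmetic for a 64-bit address space.
n = 64                        # virtual address bits (little n)
N = 2 ** n                    # address space size in bytes: big N, 16 exabytes
P = 4096                      # assumed page size in bytes (4 KiB, big P)
p = P.bit_length() - 1        # little p = log2(P) = 12 offset bits
num_virtual_pages = N // P    # 2**52 virtual pages
vpn_bits = n - p              # 52 bits left over for the virtual page number

print(N)                      # 18446744073709551616
print(p, vpn_bits)            # 12 52
print(num_virtual_pages)      # 4503599627370496
```

So with kilobyte-scale pages, the address splits cleanly: the low p bits locate a byte within a page, and the remaining n - p bits name the page itself.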
How does a particular page map — this virtual page to physical page mapping? And you notice, of course, that these pages could go anywhere in physical memory where there's an empty page. Some of the blocks, like the ones shown in gray, are uncached. In other words, they are space we're using in virtual memory, but we haven't gotten it into our physical memory yet — we haven't found a place to put it, or we haven't needed it yet. So it's still sitting there on disk; we haven't yet moved it into the DRAM, the physical memory of the computer system. Alright, now why do we bother with this kind of caching? Well, for exactly the same reason we did before. Remember, before we had the problem of going from main memory to the CPU, to the registers — the fast memory inside the CPU. The problem was that we wanted to keep the stuff we need the most near the CPU, in as fast a memory as possible. So we created layers of caches: a level-one and a level-two cache, a data cache and an instruction cache, that make it much faster to access parts of memory and get them into our registers — provided we can keep the right things in those caches. Now, with virtual memory, what we're talking about is the much larger disk out there, which potentially holds our entire virtual memory, and how it maps into main memory. So we're caching parts of the disk in main memory: that's the page we copy from disk into main memory, making it faster to access, because the disk is much, much slower. It's exactly the same situation as when we had a small block of main memory that we were copying into the L2 cache, or even into the L1 cache — that was another mapping, a different kind of mapping, that we were keeping track of before. So we have two levels of caching here, and virtual memory deals with the main-memory-to-disk level. Okay.
Now, in the past, when we looked at our memory caches, we saw a miss penalty of about 33 times a regular access. That was a big penalty when we had to go to main memory rather than finding the result in our cache, and the reason was that these caches are often on the same chip as the CPU, so they're much quicker to get to than going off-chip to the larger main memory. The situation between main memory and disk is actually even much worse: the difference is about 10,000x. Disks are way, way slower — but of course they can hold much, much more data, so they're very useful. Still, they are much, much slower than even main memory, so in this case we definitely want to be using some kind of caching. And in fact we're going to do things a little differently than we did for the L1 and L2 caches, because of that huge difference in miss latency. Okay, so DRAM, as we said, is about 10x slower than the static RAM inside the CPU. Now we're talking about a disk that is about 10,000 times slower than the DRAM that makes up main memory. But it's only slow for that very first byte; it's much, much faster for the following bytes — and we aren't going to move one byte at a time anyway, we're going to move an entire block, an entire page, at a time. So what effect does this difference in speed — 10,000x instead of 10x — have on the size of the block, the associativity we want in our cache, and whether we want it to be write-through or write-back? Remember, those were all decisions we had to make in our cache organization. Okay, so in the case of virtual memory, we're going to see that our page size — the block size — is much larger: typically four to eight kilobytes, and it could even be as high as megabytes. Another thing we're going to do is make the cache fully associative. In other words, we're not going to worry about direct-mapped or set-associative caches.
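A quick back-of-the-envelope calculation shows why that 10,000x penalty changes the design. The 100 ns DRAM access time below is an assumed round number, not a measured figure; the point is how fast the average blows up as the fault rate rises:

```python
# Why a 10,000x miss penalty forces an extremely high hit rate:
# a rough average-access-time model with illustrative numbers.
dram_access = 100                      # ns per main-memory access (assumed)
disk_penalty = dram_access * 10_000    # ns to service one page fault

def avg_access_ns(faults_per_10k):
    """Average access time given page faults per 10,000 accesses."""
    return dram_access + faults_per_10k * disk_penalty / 10_000

print(avg_access_ns(0))   # 100.0 -> every access hits DRAM
print(avg_access_ns(1))   # 200.0 -> one fault in 10,000 already doubles it
```

Even a single fault per ten thousand accesses doubles the average access time, which is why virtual memory is designed so aggressively to avoid misses.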
We're just going to make it fully associative, and that way we can place any virtual page in any physical page. This, of course, will require a more sophisticated and larger mapping function — different from those CPU caches, where we could use certain bits to go to the right set and then do some tag matching within it. We're going to have to do it a little differently. But we also get the option of much more sophisticated replacement algorithms because of that: since we can put any virtual page anywhere, we can be smarter about making sure we keep the pages we're likely to use again. And the last thing is that we're going to use write-back rather than write-through, because the block size is so large. Write-through would be very expensive: every time we wrote one word in that memory, we would have to write the entire block back, and that could start to add up and be a little too expensive. So instead, we're going to wait until we need to push a block out of the cache — out of our main memory — and then write back the whole thing. OK, so we'll be using write-back rather than write-through for virtual memory. Let's go back to our architecture diagram of how we will implement this mapping. Remember, we said the CPU will generate virtual addresses that go through a mapping function in our memory management unit, which will find the corresponding physical address, and then we access main memory. So how do we perform this virtual-address-to-physical-address translation? That's our next subtopic. The way we're going to do it is using a construct called a page table. A page table consists of an array of page table entries that do this mapping of virtual page to physical page. Okay? So we're going to use our virtual address — or part of it — to index this table and find the right page table entry, which will then contain the physical page number.
That tells us where to go in main memory, or it tells us where on disk we can find the page, so we can go get it, bring it into main memory, and find a place for it. So this is a table that lives in another part of memory — a part we're going to try to keep resident in physical memory, so we don't have to go find it on disk as well. We'll see later how to take care of that. But for now, think of this page table as the directory where we look up the mapping between virtual and physical addresses. Okay. So let's say we have seven virtual pages on our disk, and we have space for four physical pages in our physical memory. What we want to do is create this mapping. You see here that only four of the virtual pages are resident in memory — that's all that could fit — so four of these page table entries hold actual physical addresses that point into physical memory. Then there are another two pages that we are using, but that are not resident in physical memory; they're simply on disk. And there are another two page table entries that are not allocated — not used for anything — and we'll never have to worry about those, but they fill up part of our page table. Okay, so here again, those two gray-colored pages have addresses that point to disk rather than to physical memory. And those addresses are, of course, of a very different structure: they're not 64-bit memory addresses, they have to index a position on the disk, and that's done in a very different way — something you would see in an operating systems course, for example. So how many page tables are there in our computer system? Well, let's think about that. We're going to need a page table for every virtual memory, and we know that every process requires its own virtual memory. So there will be one page table for every single process. Alright.
So let's look at that address translation again. We start at the top with a virtual address, and we use part of that virtual address — the virtual page number — to index our page table. Now, why are we using only part of the address? Because the other part of the address, just like with regular caches, is an offset within a block, or here within a page. We call that the VPO, the virtual page offset. We don't need those bits to find the page, so we use the virtual page number — the high-order bits — to index into the page table. There we read the valid bit, which tells us whether this page is already in physical memory or not. If the page is not in memory, it causes what's called a page fault — the equivalent of a cache miss — which tells us: hey, you've got to figure out a way to get this page into memory first, and then try again. But if the entry is valid, then we'll find in the page table a physical page number, which becomes the part of the physical address that we prepend to the physical page offset — and the physical page offset is just the virtual page offset coming straight down. Because, of course, that offset is within a block, and we move whole blocks at a time; we're not cutting up blocks or anything like that, so the offset stays the same. Okay, so that's the basic address translation story. Now, where do we go to find this page table, to look up that physical page number? To do that, we use what's called a page table base register. This is a special register that the CPU maintains for a process, and the operating system — the run-time system — loads it with the location where that process's page table is resident in memory, okay? And for each computer system and operating system combination, there's usually a standard place in memory where that lives.
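The translation step described above can be sketched in a few lines: split the virtual address into VPN and VPO, index the page table, check the valid bit, and glue the physical page number onto the unchanged offset. The 4 KiB page size and the tiny hand-built page table are assumptions purely for illustration:

```python
# A minimal sketch of virtual-to-physical address translation.
PAGE_SIZE = 4096        # assumed P = 2^12, so the low 12 bits are the offset
OFFSET_BITS = 12

class PageFault(Exception):
    """Raised when the valid bit is clear -- the page isn't in memory."""

# toy page table: VPN -> (valid bit, physical page number)
page_table = {0: (True, 3), 1: (False, None), 2: (True, 7)}

def translate(virtual_addr):
    vpn = virtual_addr >> OFFSET_BITS       # virtual page number (high bits)
    vpo = virtual_addr & (PAGE_SIZE - 1)    # virtual page offset (low bits)
    valid, ppn = page_table.get(vpn, (False, None))
    if not valid:
        raise PageFault(f"page fault on VPN {vpn}")
    # physical address = PPN prepended to the unchanged page offset
    return (ppn << OFFSET_BITS) | vpo

print(hex(translate(0x2ABC)))   # VPN 2 -> PPN 7, offset 0xABC -> 0x7abc
```

Notice the offset bits pass straight through untouched — only the page number part of the address is remapped.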
Now, the important thing to understand is that all of this is done by the memory management unit in hardware; it doesn't require any software systems or any knowledge of the program. This is all going on in the computer system automatically. Alright, let's see what happens on a page hit. On a page hit, we use our virtual address to index the page table, we find that the entry is valid, and it provides us with an address in physical memory; we now know where that page is and can access it using physical addresses. Pretty straightforward — though notice we've had to do one access of memory to get the page table entry, and then a second access of memory to actually get the data we want at that physical address. A page fault occurs when we access a page whose entry is not valid, and therefore its address is potentially an address on disk — or it could be an unallocated page, in which case we have a more serious problem: we're trying to access a part of memory we haven't allocated at all. But let's say it is allocated memory, and the page is still on disk, for example, not yet in memory. So what happens in this case? We'll have to take that page and move it somewhere in physical memory. And wait a minute — where do we move it? The physical memory is entirely occupied. Well, we're going to have to decide which of those virtual pages to push out of main memory. And of course that could create a future page fault later, if we try to access the page we just evicted again. So we have to be careful and choose carefully which page to push out: that's the replacement algorithm for this cache. The way this looks in the execution of the process is that the process will be going along and eventually hit an instruction that accesses memory in a way that causes a page fault. So we get an exception; we have to make room for the page and load it into memory, and then we're going to go back to that instruction and execute it again.
But the situation will be different now: that page will be in physical memory, the page table entry will have been modified, so we'll get a page hit and can proceed normally through the rest of our process. So that's the basic situation with a page fault. Alright, the key things to remember are that the page fault handler has to load the page into physical memory — the page fault handler is a bit of code in the operating system that knows how to replace pages in physical memory — and that we then return to the faulting instruction, in this case executing that move instruction yet again. And this time through we'll be successful, because we've moved the page into memory. Let's look at that again in finer detail. Our virtual address goes to the page table, to the corresponding page table entry, which shows that our page is invalid and is resident on disk. This causes the page fault, the exception, and now our operating system kicks in and figures out where to find that page on disk, bring it into main memory, and find a place to put it. So the first thing it has to do is select a victim page to evict. In this case it's going to pick that last page in physical memory — virtual page number four — and remove it from physical memory. How does it remove it from memory? Well, it doesn't really remove it; it simply changes the page table entry for virtual page four to no longer point to that physical location. Instead, it has virtual page three point to that physical location and copies that page from the disk into physical memory. And now you'll see that virtual page four just points to its copy on disk — which may have been written there from main memory, if that page had to be written back because it had been modified by some write instructions before then. Okay, so at this point we can execute the instruction again.
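The fault-handling sequence just described — pick a victim, write it back to disk only if it's dirty, load the new page, and update the page table — can be sketched as a toy handler. The names, the seven-page disk, the four frames, and the simple FIFO victim choice are all illustrative assumptions, not how any particular operating system does it:

```python
# A toy page-fault handler: evict a victim frame (write-back only if
# dirty), load the requested page from "disk", update the page table.
from collections import deque

disk = {vpn: f"contents of page {vpn}" for vpn in range(7)}  # backing store
physical = {}      # PPN -> page contents (capacity: NUM_FRAMES frames)
page_table = {}    # VPN -> {"ppn": int, "dirty": bool}
resident = deque() # resident VPNs, oldest first (simple FIFO policy)
NUM_FRAMES = 4

def handle_page_fault(vpn):
    if len(physical) >= NUM_FRAMES:
        victim = resident.popleft()        # select a victim page to evict
        entry = page_table.pop(victim)
        if entry["dirty"]:                 # write-back: only dirty pages
            disk[victim] = physical[entry["ppn"]]
        free_ppn = entry["ppn"]            # reuse the freed frame
    else:
        free_ppn = len(physical)           # still an empty frame available
    physical[free_ppn] = disk[vpn]         # copy the page from disk
    page_table[vpn] = {"ppn": free_ppn, "dirty": False}
    resident.append(vpn)

for vpn in [0, 1, 2, 4, 3]:                # fifth access evicts page 0
    handle_page_fault(vpn)
print(sorted(page_table))                  # [1, 2, 3, 4]
```

The write-back check is the payoff of the earlier design decision: a clean victim costs nothing to evict, because its copy on disk is already up to date.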
We leave the operating system and simply re-execute the instruction that caused the page fault in the first place. This time, of course, it will be a hit — it will find a valid page — because we made those changes to the page table entry and updated our physical memory and the virtual memory on disk. Okay, so why does this work? It works for the same reason CPU caches work: because of locality. If we're accessing a chunk of memory, chances are we're going to access other things nearby that place in memory — other words near the one we're initially accessing. So this is the same reason those L1 and L2 caches between the CPU and main memory work. The set of virtual pages that a program is actively accessing at any point in time — the pages it's using in any short window of time — is called the working set. In other words, those are the pages that should be resident in memory together, because the program is quickly switching between accessing different ones. We want the working set of a typical program to be much smaller than main memory, so that we can fit the whole working set in main memory at once. This gives us good performance, because it means our program — our process — can run for a while without causing page faults: it finds all of the pages it needs in main memory. Okay, and that's our goal. However, of course, sometimes we have many processes running, and all of their working sets can't be in memory at the same time — their total working set is often larger than main memory. And as we add more processes, trying to run more programs at the same time, things slow down. Why do things slow down? This happens on your machines all the time: you have too many windows open, too many different programs running, and things seem to get sluggish. The reason they get sluggish is a phenomenon called thrashing, where one process tries to access a page in memory...
...and the page has to be brought in from disk. That evicts another page out of memory — which happens to be something that another process you're running needs. So as soon as that process gets a turn, it goes and gets its page back from disk and brings it into main memory. And then the next process goes and gets another page from disk. If there are too many of these running at the same time — in other words, if those working sets don't all fit in main memory — then we're constantly juggling: what's in main memory? What's on the disk? What's in main memory? What's on the disk? We end up swapping and copying pages all the time, and as we do that more and more, we get this performance meltdown called "thrashing."
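The cliff between "working sets fit" and "working sets don't fit" shows up clearly if we just count page faults for a reference stream. The streams, the LRU policy, and the frame counts below are made up purely for illustration:

```python
# A rough sketch of thrashing: count page faults under LRU replacement
# for two interleaved processes with 3-page working sets each.
from collections import OrderedDict

def count_faults(refs, num_frames):
    frames = OrderedDict()               # resident VPNs in LRU order
    faults = 0
    for vpn in refs:
        if vpn in frames:
            frames.move_to_end(vpn)      # hit: mark most recently used
        else:
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)   # evict least recently used
            frames[vpn] = None
    return faults

# Two processes alternating turns; pages 0-2 vs 10-12, repeated 50 times.
stream = [0, 1, 2, 10, 11, 12] * 50

print(count_faults(stream, 6))   # 6: both working sets fit, cold misses only
print(count_faults(stream, 4))   # 300: they don't fit -- every access faults
```

With six frames, the only faults are the six cold misses; shrink memory to four frames and every single access faults, because each process keeps evicting exactly the pages the other one is about to need. That's thrashing in miniature.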
