04 Cache Organization


[MUSIC] Hello again. Now that we know the cache basics, the principle of locality, and why we build memory hierarchies, let's look inside caches and see how they're organized. I'm going to divide this part of the section in two, just to make shorter videos for you. So, the first question is: where should you put data in the cache? Suppose we have a very simple memory whose addresses go from 0000 to 1111 in binary, that is, from address 0 to address 15. And here's our cache. As I said before, the cache holds a subset of memory, so the cache is going to be smaller than memory. In fact, caches are fast largely because they are small; the technology used to build them is different too, but being small is one of the main reasons. Since memory is bigger than the cache, we have to somehow map data in memory to a location in the cache, and different memory locations will necessarily map to the same place in the cache. One way to go about this is to use the address to compute an index, the location in the cache where the data goes. For example, address 0000 goes to index 00, address 1101 goes to index 01, address 1010 goes to index 10, and address 0111 goes to index 11. Do you notice something about how we compute where the data goes? It's just modulo arithmetic: I take the memory address mod 4, which is the same thing as taking the two low-order bits of the address and using them to index the cache.
Now, here's the important point. Suppose one address, based on its two low-order bits, goes to a given index, but another address also maps to the same location in the cache. How does the processor know whether the data there belongs to the first address or the second? Some extra information has to be stored alongside the data in the cache to tell which block of memory is stored there; we lost that information in the mapping. How do we solve this? We use something called tags. A tag is another storage element in the cache, one tag per position in the data array, that records which address is stored there. For example, say our blue address, 1101, which based on its two low-order bits goes into index 01 of the cache. What do we put in the tag? Now that we know address 1101 is stored there, we store the tag 11. The tag together with the index tells me exactly which address is stored in the cache, so there is no more confusion when multiple memory addresses multiplex into the same cache position. Another very important notion in a cache is the block, or cache line. Remember when we talked about spatial locality: we said we would move data from memory to the cache in blocks, blocks that are bigger than a single byte. That helps spatial locality because we bring in not only the piece of data actually needed by the program running on the processor, but also data that is located close by in memory. So what's a block? Suppose I have a block size of two bytes, and again my memory goes from address 0 to address 15, and these are byte addresses.
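As a side note, the index-and-tag computation described above can be sketched in a few lines of Python. This is a toy model of the 4-entry cache with 4-bit addresses from the example, not how hardware does it:

```python
# Toy direct-mapped index/tag split, assuming the 4-entry cache and
# 4-bit addresses from the example. The two low-order bits pick the
# index; the remaining high-order bits are stored as the tag.

NUM_ENTRIES = 4  # 2 index bits

def index_of(address):
    return address % NUM_ENTRIES   # same as address & 0b11

def tag_of(address):
    return address // NUM_ENTRIES  # same as address >> 2

print(index_of(0b1101), tag_of(0b1101))  # -> 1 3  (index 01, tag 11)
```

Note that index 01 plus tag 11 reconstructs the full address 1101, which is exactly why the tag removes the ambiguity.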
Now, if my block size is two bytes like I just told you, each cache line maps to two bytes in memory. So line 0 covers address 0 and address 1, line 1 covers addresses 2 and 3, line 2 covers addresses 4 and 5, and so on. And of course, since the cache stores an entire block in each entry, the data array for each index has to be as big as the block. For example, if I decide to store block 0 in the cache, I store both of its bytes, byte b1 and byte b2; the entire block has to be stored in the cache, so each entry in the cache's data array has to be as large as the block. By the way, typical cache block sizes in real processors are 32 or 64 bytes; two bytes here is just to keep the example simple. Now let's think a little about a puzzle that might help you with one of the assignments. Start with the assumption that the cache is empty, and I'll encode an access sequence in the following way: I give you an address, a comma, and whether that access was a hit or a miss. Say I access address 10 and it's a miss; then I access address 11 and it's a hit; then I access address 12 and it's a miss. Hmm, the real piece of information here is: why is that second access a hit? Address 11 has never been accessed before, because the cache started empty. But address 10 was accessed, and it was a miss.
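Before resolving the puzzle, note that the byte-to-block mapping described above is just integer division and remainder. Here is a minimal sketch, assuming the two-byte blocks from the example:

```python
# Byte-to-block mapping, assuming the two-byte blocks from the example.
BLOCK_SIZE = 2  # bytes per block

def block_number(address):
    return address // BLOCK_SIZE  # which block this byte lives in

def block_offset(address):
    return address % BLOCK_SIZE   # which byte within the block

# Block 0 holds addresses 0 and 1, block 1 holds 2 and 3, and so on.
print(block_number(4), block_number(5))  # -> 2 2
# Addresses 10 and 11 share a block, but 12 starts the next one:
print(block_number(10), block_number(11), block_number(12))  # -> 5 5 6
```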
So the only way 11 can be a hit is that it was brought in together with address 10. That means that somewhere in my cache there is a block that holds both address 10 and address 11. But I also know that 12 is not part of that block. So what we can infer is that the block size is at least two bytes: the miss on 10 brought the data in for sure, and 11 then hit. Think a little bit more about that; it will help you with one of the assignments. Now let me talk about one way of mapping data from memory to the cache. The example I gave you earlier is what we call a direct-mapped cache. Direct-mapped means that for each memory location there is a single place in the cache where the data can go; there's a direct mapping, so for each memory location I know exactly where it goes in the cache. In this example, both of these addresses go to the same index, again because they have the same low-order address bits. So what's the problem with this? Well, suppose I access address 2, then 6, then 2, then 6, and so on. What happens? Addresses 2 and 6 both map to the same position in the cache. When I access 2, I put it in the cache; but then when I access 6, that's a miss, and 6 kicks 2 out. So when I access 2 again, that's a miss too, and 2 kicks 6 out. The bottom line is that all of these accesses are misses, even though I'm only using this one position in the cache. Why? What's going on here is a cache conflict: two locations that I access repeatedly map to the same entry, the only place they can go in the cache, so they keep kicking each other out, repeatedly.
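The conflict pattern just described is easy to reproduce with a toy simulator. This is only a sketch, assuming a 4-entry direct-mapped cache with one-byte blocks; the `simulate` helper is hypothetical, just for illustration:

```python
# Toy direct-mapped conflict demo, assuming a 4-entry cache with
# one-byte blocks. The dict maps an index to the address whose data
# currently occupies it (standing in for the tag).
NUM_SETS = 4

def simulate(addresses):
    cache = {}
    results = []
    for addr in addresses:
        index = addr % NUM_SETS
        if cache.get(index) == addr:
            results.append("hit")
        else:
            results.append("miss")  # conflict: evict whatever was there
            cache[index] = addr
    return results

# 2 and 6 both map to index 2, so they evict each other every time:
print(simulate([2, 6, 2, 6, 2]))  # -> all misses
```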
So I'm going to have a bunch of misses, even though the rest of my cache could be sitting empty. The way we solve this problem is with something called associativity. The question is basically: what if we could store the data in any place in the cache? That would be possible, letting data go anywhere in the cache, but it would slow the cache down, because the lookup gets more complicated; I'll tell you why a little bit later. So we do something in between. Suppose I have a cache with eight blocks. There are multiple ways of organizing it. One way is to use eight sets with one block per set; this is called direct-mapped, the scheme I just showed you. The other extreme is fully associative, which means the data can go anywhere in the cache. So in direct-mapped, a given address can go to one place only, not to multiple places, whereas in fully associative it can go anywhere. Since one extreme is bad because of conflicts and the other is bad because it's too expensive, let's see what we can do in between: something we call set-associative caches. If I say my cache is two-way set associative, for example, I'm saying there are two places in a set where data can go. In our previous example, I showed you two addresses that would collide, both going to the same place in the cache. Now they can go to two places: I can put address A in one way and address B in the other, and they can live together. This comes in multiple flavors: that was two-way; if I say four-way, there are four locations in each set where data can go.
That means I'm only going to have conflicts in my cache if more than four locations that I access repeatedly map into the same set. An important question in such a cache is where the data goes, and let me explain with a very simple example. The first thing to know is that we divide an n-bit address into three components. One is a certain number of bits to tell which part of the block the address refers to: the offset within the block. Then we have k bits to tell which index, which set, the data goes to in the cache. And whatever is left we use as the tag; remember, the tag is necessary to tell what data is actually stored in the cache. In my example here, suppose I have a four-block cache with two bytes per block. How many bits do I need for the offset? Only one bit, because there are only two bytes within a block. Now let's say the cache is direct-mapped. If I have four blocks, how many bits do I need for the index? Two. And if I have a four-bit address, how many bits are left for the tag? Just one. Simple enough, right? So let's see more examples with set-associative caches; that one was a direct-mapped cache. Suppose I have a different address now, 0x1833, and the block size is 16 bytes, a more realistic block size than just two bytes. First, 0x1833 in binary ends in ...0110011. How many bits do I need for the block offset? If I have 16 bytes in my block, I'm going to need four offset bits. So now, how many bits do I need for the index? Well, it depends on the associativity.
If I have one-way associativity (direct-mapped), with eight blocks I have eight different sets, and eight sets need three index bits. If the cache is two-way, I have only four sets, so I need only two index bits. And if I have four-way associativity, I have only two sets, which means I need only one index bit. Whatever is left has to be used for the tag. So let's think about the first case, k = 3, meaning three bits for the index. The four low-order bits are the offset, and the next three bits, 011, form the index, which is 3. But if k = 2, I use only the next two bits, 11, which is again 3; that's where the block goes in my two-way set-associative cache. And in the four-way set-associative cache, only one bit, the single bit just above the offset, determines the value of the index, in other words which set the data goes to. Now let's look at block replacement. We know that multiple locations in memory can map to the same location in the cache. In a direct-mapped cache, if I put a piece of data A in some entry and later I want to access B, which goes to the same place, I have to kick A out to store B. There's only one location, always, so that's an easy choice of what to replace; there is no choice, you just replace what's there. But if I have a two-way set-associative cache, there are two places: I could put address A in one way and address B in the other, supposing they map to the same set. But now the question is, what happens when I access C, and C also happens to map to the same set? I have to choose whether I remove A or I remove B.
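The tag/index/offset split worked through above is just bit shifting and masking. Here is a sketch for the address 0x1833, assuming 16-byte blocks (4 offset bits) and an 8-block cache, so the number of index bits k depends on the associativity; `split_address` is an illustrative helper, not a standard API:

```python
# Split an address into tag / index / offset, assuming 16-byte blocks
# (4 offset bits). The index width k varies with associativity for a
# fixed 8-block cache: 1-way -> k=3, 2-way -> k=2, 4-way -> k=1.
OFFSET_BITS = 4

def split_address(address, index_bits):
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << index_bits) - 1)
    tag = address >> (OFFSET_BITS + index_bits)
    return tag, index, offset

for ways, k in [(1, 3), (2, 2), (4, 1)]:  # sets = 8 blocks // ways
    tag, index, offset = split_address(0x1833, k)
    # index comes out 3, 3, then 1, matching the example in the video
    print(f"{ways}-way (k={k}): index={index}, offset={offset}")
```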
The way caches normally do this is to replace whatever was least recently used, because that maximizes temporal locality: whatever was most recently used should stay in the cache, and the least recently used block, the one accessed furthest in the past, should be removed. So suppose the accesses were A first and then B later. B was accessed more recently, which means that when I access C, I kick A out and put C there, because A was accessed furthest in the past. By the way, implementing exact least-recently-used in hardware for set sizes larger than two is pretty hard, so hardware uses an approximation called not-most-recently-used. You can read more about this online or in the book. Okay, so let's end this video with another puzzle, which might help you with the assignments too. Suppose the cache starts empty, and again we use the same notation I showed before: address, then whether it was a hit or a miss. Suppose I access 10 and it's a miss, I access 12 and it's also a miss, but then when I access 10 again, it happens to be a miss again. Well, I just accessed 10, so what happened? What must have happened is that when I accessed 12, I kicked 10 out. That means 10 and 12 map to the same set, and 12 displaced 10. And since the cache was empty, that means that for sure this has to be a direct-mapped cache. It's cool, isn't it? Have fun with this, and have fun with the assignment.
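As a closing sketch, the LRU policy described in this video can be modeled in software with an ordered dictionary; this is only a toy model of one 2-way set, assuming blocks A, B, and C all map to it (real hardware approximates LRU, as mentioned above):

```python
# Toy LRU replacement within a single 2-way set, assuming tags A, B, C
# all map to this set. OrderedDict keeps entries in recency order:
# oldest first, most recently used last.
from collections import OrderedDict

def access(lru_set, tag, ways=2):
    """Access `tag` in one set; return 'hit' or 'miss', evicting LRU."""
    if tag in lru_set:
        lru_set.move_to_end(tag)     # mark as most recently used
        return "hit"
    if len(lru_set) >= ways:
        lru_set.popitem(last=False)  # evict the least recently used
    lru_set[tag] = True
    return "miss"

s = OrderedDict()
print([access(s, t) for t in ["A", "B", "C"]])  # -> three misses
print(list(s))  # -> ['B', 'C']  (C evicted A, used furthest in the past)
```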
