Hello again.
Now that we know cache basics, the principle of locality, and how to compose hierarchies, let's look inside caches and see how they're organized.
And I'm going to divide this part of the
section into two, just to make shorter
videos for you.
So, the first question is, where should
you put data in the cache?
Suppose we have a very simple memory. Its addresses go from 0000 to 1111 in binary, that is, from address 0 to address 15. And here's our cache. As I said before, the cache holds a subset of memory, so the cache is going to be smaller than memory. In fact, one of the main reasons caches are fast is that they are small. The underlying technology is also different, but smallness is key.
Okay, so alright, memory is bigger than
the cache.
And that means that we have to somehow
map data in memory to where it goes in
the cache.
Since memory is bigger, of course
there'll be different memory locations
that map to the same place in the cache.
For example, one way we can do that is to use the address to compute an index, the location in the cache where the data goes. Say address 0000 goes to index 00. Address 1101 goes to index 01. Address 1010 goes to index 10. And address 0111 goes to index 11.
Do you notice something about how we compute where the data goes in the cache? It's just modular arithmetic, right? I take the memory address and do mod 4, which is the same thing as taking the two low-order bits of the address and using them to index the cache. So again, the two low-order bits of each address determine its index, and so on.
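As a quick illustration, here's a minimal Python sketch, assuming the 4-entry cache from the example, showing that the mod arithmetic and the low-order bits compute the same index:

```python
# Hypothetical 4-entry direct-mapped cache from the example.
NUM_ENTRIES = 4

def cache_index(address):
    # Index = address mod 4...
    return address % NUM_ENTRIES

def cache_index_bits(address):
    # ...which equals the two low-order bits of the address.
    return address & (NUM_ENTRIES - 1)

print(cache_index(0b1101), cache_index_bits(0b1101))  # 1 1
```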
Now, there's an important detail here. Suppose again that this address, as we know from its lower two bits, goes here. But this other address also goes to the same location in the cache. The question is, how does the processor know whether the data stored here came from this address or from that one? Some extra information has to be stored alongside the data in the cache to tell which block of memory is stored there. We lost some information in this mapping, right? That information has to be stored somewhere.
Now, how do we solve that? We're going to use something called tags. Tags are another storage element in the cache: there is one tag per position in the cache's data array, and it tells what address is stored there. For example, take our blue address here, 1101. Based on its two low-order bits, we know it goes into this index of the cache, right here. And what are we going to put in the tag? Now that we know address 1101 is stored there, I'm going to store tag 11. So the tag, together with the index, tells me exactly which address was stored in the cache. There's no more confusion in this multiplexing of multiple memory addresses onto the same position in the cache.
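A small sketch of that idea, assuming the 4-bit addresses and 4-entry cache from the example: the tag keeps the high-order bits, so tag plus index recovers the full address.

```python
INDEX_BITS = 2  # 4-entry cache -> 2 index bits (assumed from the example)

def split_address(addr):
    index = addr & ((1 << INDEX_BITS) - 1)  # low two bits pick the entry
    tag = addr >> INDEX_BITS                # remaining high bits are the tag
    return tag, index

def rebuild_address(tag, index):
    # Tag and index together identify the address with no ambiguity.
    return (tag << INDEX_BITS) | index

tag, index = split_address(0b1101)
print(tag, index)  # 3 1  (tag 11, index 01 in binary)
```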
A very important concept in a cache is the block, or cache line. Remember when we talked about spatial locality: we said we would move data from memory to the cache in blocks, blocks that are bigger than a single byte. That helps with spatial locality, because we bring in not only the piece of data actually needed by the program running on the processor, but also data that is located close by in memory.
So what's a block? Suppose I have a block size of two bytes. Again, my memory goes from address 0 to address 15, and these are byte addresses.
Now, if my block size is two bytes, like I just told you, then each block, each cache line, maps to two bytes in memory. So line 0 occupies addresses 0 and 1. Line 1 occupies addresses 2 and 3. Line 2 occupies addresses 4 and 5, and so on.
Now, of course, the cache stores an entire block in an entry, in an index. So the data array at each index has to be as big as the block. For example, if I decide to store block 0 here in the cache, I'm going to store both of its bytes, byte b1 and byte b2. The same holds if I store this other block instead. The entire block has to be stored in the cache, so each entry in the cache's data array has to be as large as the block.
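Sketching that mapping with the example's assumed 2-byte blocks: the block (line) number and the byte's offset inside it come straight from integer division and remainder.

```python
BLOCK_SIZE = 2  # bytes per block, as in the example

def block_number(addr):
    return addr // BLOCK_SIZE   # addresses 0-1 -> block 0, 2-3 -> block 1, ...

def block_offset(addr):
    return addr % BLOCK_SIZE    # which byte within the block

print([block_number(a) for a in range(6)])  # [0, 0, 1, 1, 2, 2]
```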
By the way, typical cache block sizes are 32 or 64 bytes. This is just a simple example with 2-byte blocks to keep things easy to follow, but in real processors it's 32 or 64 bytes.
Let's think a little about a puzzle now, which might help you with one of the assignments. Start with the assumption that the cache is empty. The access sequence is encoded in the following way: I give you an address, comma, whether it was a hit or a miss. Say I access address 10, and it's a miss. Then I access address 11, and it's a hit. Then I access address 12, and it's a miss.
The real piece of information here is: why is that access a hit? Address 11 had never been accessed before, because the cache started empty. But address 10 had been, and it was a miss. So the only way 11 can be a hit is if it was brought in together with address 10. That means that somehow my cache block is going to contain both address 10 and address 11. But I also know that 12 is not part of this block. So what we can infer is that the block size is at least two bytes: the miss on 10 brought in the data, and that made 11 a hit. Think a little bit more about this; it's going to help you with one of the assignments.
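One way to check the reasoning, as a sketch: two addresses land in the same cache block exactly when they share a block number, so a block size of 2 is consistent with 10 and 11 traveling together while 12 does not.

```python
def same_block(a, b, block_size):
    # Two addresses share a cache block iff they have the same block number.
    return a // block_size == b // block_size

# 10 missed, then 11 hit: they must share a block.
# 12 then missed: it sits in a different block.
print(same_block(10, 11, 2))  # True  -> consistent with 11 hitting
print(same_block(10, 12, 2))  # False -> consistent with 12 missing
```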
So let me talk about one way of mapping data from memory to the cache. The example I gave you earlier is what we call a direct-mapped cache. Direct mapped means that for each memory location, there's a single place in the cache where the data can go. That's why it's called direct mapped: there's a direct mapping, so for each memory location I know exactly where it goes in the cache. In this example here, both of these addresses go to the same index, again because they have the same low-order address bits.
So what's the problem with this? Well, suppose I access address 2, then 6, then 2, then 6, and so on. What happens? This is address 2 and this is address 6; they both map to the same position in the cache. When I access 2, I put it in the cache. But then when I access 6, it's going to be a miss. And when I access 2 again, it's also going to be a miss, because 6 has kicked 2 out. So the bottom line is that all of these accesses are going to be misses, even though I'm only using this one position in the cache. Why is this? We say that what's going on here is a cache conflict. The two locations that are accessed repeatedly map to the same entry, the only place they can go in the cache, so they keep kicking each other out. I get a bunch of misses, even though the rest of my cache could potentially be empty.
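Here's a minimal direct-mapped cache simulator (the 4 entries and one-byte blocks are assumptions for this sketch) that reproduces the 2, 6, 2, 6 conflict pattern:

```python
NUM_ENTRIES = 4

def simulate(addresses):
    tags = [None] * NUM_ENTRIES      # one stored tag per cache entry
    results = []
    for addr in addresses:
        index = addr % NUM_ENTRIES   # where the address must go
        tag = addr // NUM_ENTRIES    # which address is actually there
        if tags[index] == tag:
            results.append("hit")
        else:
            results.append("miss")
            tags[index] = tag        # kick out whatever was there
    return results

print(simulate([2, 6, 2, 6]))  # ['miss', 'miss', 'miss', 'miss']
```

Addresses 2 and 6 both land on index 2 but carry different tags, so every access evicts the other one: a conflict, even though three of the four entries stay empty.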
The way we solve this problem is with something we call associativity. The question is basically this: what if we could store data in any place in the cache? That would be possible, we could let data go anywhere in the cache, but it would slow the cache down, because the lookup gets more complicated. I'm going to tell you why a little bit later. So we do something in between.
Suppose I have a cache that has eight positions, eight blocks. There are multiple ways of organizing it. One way is to use eight sets and have one block per set; this is called direct mapped, the one I just showed you. At the other extreme is fully associative, which means the data can go anywhere in the cache. So in direct mapped, each address goes to exactly one place, not to multiple places, whereas in fully associative, data can go anywhere.
Since direct mapped is bad because of conflicts, and fully associative is bad because it's too expensive, let's see what we can do in between. That's something we call set-associative caches. If I say my cache is two-way set associative, for example, I'm saying there are two places in a set where data can go. In our previous example, I showed you that two addresses would collide: they would both go to the same place in the cache. Now they can go to two places: I can put address A here and address B here. They can both live together. And this has multiple flavors. That was two-way; if I say it's four-way, there are four locations in each set where data can go. It means I'm only going to have conflicts in my cache if I have more than four locations that I tend to access repeatedly and that map to the same set.
An important question in such a cache is where the data goes, and let me show you with a very simple example. The first thing to know is that we're going to divide an n-bit address into three components. First, a certain number of bits tells which part of the block the address refers to: the offset within the block. Then some bits, k bits, tell me which index, which set, the data goes to in the cache. And whatever is left, I'm going to use as the tag. Remember, the tag is necessary to tell what data is actually stored in the cache.
So in my example here, suppose I have a four-block cache with two bytes per block. How many bits do I need for the offset? Only one bit, because there are only two bytes within a block. Now let's say our cache is direct mapped. If I have four blocks, how many bits do I need for the index? Two bits. And if I have a four-bit address, how many bits are left for the tag? Just one. Simple enough, right?
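The bit budget above can be sketched like this, using the cache parameters from the example:

```python
import math

block_size = 2    # bytes per block
num_blocks = 4    # direct mapped: one block per set
address_bits = 4

offset_bits = int(math.log2(block_size))            # 1 bit for the offset
index_bits = int(math.log2(num_blocks))             # 2 bits for the index
tag_bits = address_bits - index_bits - offset_bits  # 1 bit left for the tag
print(offset_bits, index_bits, tag_bits)  # 1 2 1
```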
Let's see more examples with set-associative caches; that was a direct-mapped cache. Suppose I have a different address now, 0x1833, and the block size now is 16 bytes, a more realistic block size than just two bytes. First, 0x1833 in binary ends in 0110011. How many bits do I need for the block offset? With 16 bytes per block, I'm going to need four bits for the offset.
Now, how many bits do I need for the index? It depends on the associativity. If I have one-way associativity, that means I have eight different sets, and with eight sets I need three bits for the index. If I have only four sets, I need only two bits for the index. And if I have four-way associativity, I'm going to have only two sets, which means I need only one bit for the index. Whatever is left has to be used for the tag.
Now let's think about the first case: k equals 3 means 3 bits for the index. This here is my offset, the four offset bits, and these three bits next to it are the index. That's why the index is 011, which is three. Now, if I'm talking about k equals 2, I use only these two bits, which is again three. That's where it goes in my two-way set-associative cache. In the four-way set-associative cache, only this one bit is used to determine the index, in other words, which set the data goes to.
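Putting the whole walkthrough in code, as a sketch (the 16-byte blocks and the three associativity choices for an 8-block cache are from the example):

```python
ADDR = 0x1833
OFFSET_BITS = 4  # 16-byte blocks -> 4 offset bits

def decompose(addr, index_bits):
    offset = addr & ((1 << OFFSET_BITS) - 1)                # low 4 bits
    index = (addr >> OFFSET_BITS) & ((1 << index_bits) - 1) # next k bits
    tag = addr >> (OFFSET_BITS + index_bits)                # everything left
    return tag, index, offset

for k in (3, 2, 1):  # 8 sets, 4 sets, 2 sets
    tag, index, offset = decompose(ADDR, k)
    print(f"k={k}: index={index}, offset={offset}")
# k=3: index=3, offset=3
# k=2: index=3, offset=3
# k=1: index=1, offset=3
```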
Let's look at block replacement now. We know that multiple locations in memory can map to the same location in the cache. In a direct-mapped cache, that means if I put a piece of data A here, and later I want to access B, which goes to the same place, I have to kick out A to store B. There's only one possible location, so replacement is an easy choice: there's no choice, you just replace what's there, right?
But if I have a two-way set-associative cache, there are two places: I can put address A here and address B here, supposing they map to the same set. Now the question is, what happens when I access C, and C also happens to map to the same set? I have to choose whether I remove A or I remove B. The way caches normally do this is to replace whatever was least recently used, because that maximizes temporal locality: whatever was most recently used should stay in the cache, and the least recently used one, the one that was used furthest in the past, should be removed. So let's say A was accessed first and B later. B was accessed more recently, which means that when I access C, I kick A out and put C here, because A was accessed furthest in the past.
Implementing exact least-recently-used replacement in hardware for set sizes larger than two is pretty hard, so hardware uses an approximation called not-most-recently-used. You can read more about this online or in the book.
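A sketch of LRU within a single two-way set, assuming blocks A, B, and C all map to that set:

```python
def access(set_blocks, block, ways=2):
    """Access `block` in a set kept in LRU order (most recent last)."""
    if block in set_blocks:
        set_blocks.remove(block)  # hit: just refresh its recency
    elif len(set_blocks) == ways:
        set_blocks.pop(0)         # miss in a full set: evict the LRU block
    set_blocks.append(block)      # the accessed block is now most recent

s = []
for b in ("A", "B", "C"):  # A first, then B, then C
    access(s, b)
print(s)  # ['B', 'C'] -- A, used furthest in the past, was evicted
```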
Okay, so let's end this video with another puzzle, which might also help you with the assignments. Suppose the cache starts empty, and we use the same access-sequence notation I showed before: address, then whether it was a hit or a miss. Suppose I access 10 and it's a miss, then I access 12 and it's also a miss. But then when I access 10 again, it happens to be a miss too. I just accessed 10, so what happened? What happened, for sure, is that when I accessed 12, I kicked 10 out. That means 10 and 12 map to the same set, and 12 displaced 10. And since the cache started empty, this has to be a direct-mapped cache: with two or more ways in the set, 10 and 12 could both have stayed, and the second access to 10 would have been a hit.
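You can check the inference with a tiny simulation, assuming all three accesses map to one set (which the trace tells us), and comparing one way against two ways:

```python
def run(accesses, ways):
    lru = []           # blocks in the one set, most recently used last
    results = []
    for block in accesses:
        if block in lru:
            lru.remove(block)
            results.append("hit")
        else:
            if len(lru) == ways:
                lru.pop(0)     # evict the least recently used block
            results.append("miss")
        lru.append(block)
    return results

print(run([10, 12, 10], ways=1))  # ['miss', 'miss', 'miss'] -- matches the trace
print(run([10, 12, 10], ways=2))  # ['miss', 'miss', 'hit']  -- would not match
```

Only the one-way (direct-mapped) cache reproduces the observed three misses, which is exactly the conclusion of the puzzle.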
It's cool, isn't it? Have fun with this, and have fun with the assignment.