It's time to move on to another section
of our course.
this time we're going to talk about
arrays and structs.
Namely, how we represent some more
interesting data structures in our
memory.
so we're going to move beyond just simple
integers and floats and characters but
talk about arrays of values and and
complex structures that involve multiple
elements.
that's what we call structs in the C
language.
Okay, so to outline this section.
we're going to start off on arrays and
how we access them in memory.
the various elements of the arrays.
And then we'll talk about
multi-dimensional arrays, as well as
multi-level arrays and the differences
between the two.
then we'll move on to talk about other
kinds of data structures and memory.
And how they're aligned to to the
appropriate addresses to to optimize
memory access.
Okay, so let's get started.
We'll begin with arrays, and the basic
idea behind arrays is that they have
elements of a certain type.
In this case, using the, the T of course,
the array has a name A.
And then it has a number of elements or a
length.
And this, this leads to a contiguously
allocated region of memory that is of
size N times the size of that data type.
So if these were ints it would be four
times n in bytes.
because ints are 4 bytes each.
So let's take a look at some examples.
Here's a character array called string
that has 12 elements.
And you'll notice that each is one byte.
And they're allocated one right after the
other in memory.
here we have an array of ints with five
elements called val and of course, it
take up 20 bytes in memory from some
starting address.
a double array of course the elements are
8 bytes each.
But they're always contiguous.
Okay?
Always right next to each other in
memory.
And of course for pointers we have a
difference depending on, whether we're on
a 32 bit or 64 bit architecture.
In in one case the pointers are eight 4
bytes, in the other case they are 8
bytes, the size of the addresses.
Okay, so continuing with this let's take
a look at a little example here we have a
five element array called val.
And it has the following integers stored
in it, 9,8,1,9,5.
That happens to be the zip code for the
University of Washington.
Okay let's take a look at how we might
reference these values.
We'll start off with an example of val
sub four.
so we are actually looking here for the
fifth element of the array, because the
indices start at zero, okay.
So although, here we put a five for the
total number, the actual index of the
last element of the array is n minus 1
val sub 4.
And its type of course, is going to be an
int because these are all ints in this
array.
So, the value in this case is going to be
the number five that last five that we
see at the address x plus 16.
the next example, just using the word
val, the name of the array, is going to
give us an address, an address to the
starting point of the array.
So that'll be the value x, wherever we
happen to start the array in memory.
Val plus one is also a pointer because
val is an address when we add 1 to it, we
do pointer arithmetic.
these are of type int, so we will add 4
to x, and so that value is x plus 4.
continuing on we have the address of the
second element of the array.
The second index of the array rather, or
the third element, and that would be x
plus 8, okay?
again because we start the indices at
zero.
Val five, now val five is not really in
our array, it's outside the bounds of our
array.
The last index should have been four.
But C doesn't care about that, it'll just
do the address computation the same old
way and just go to that memory location.
So in this case, it'll take the starting
address of val multiplied at 5 by 4 and
add 20 to that starting address.
So it'll really go to whatever memory
location is after the end of the array
and who knows what is there.
So we have an unknown value.
It's totally dependent on whatever is in
that memory location.
Okay let's take a look at the next
example, we're doing val plus one inside
the parenthesis, that's again that
address x plus 4.
And then, we're differencing it with the
star operator, so we're saying what is
the value at x plus 4.
And of course the value at x plus 4 is
the integer 8.
And then finally our last example val
plus i, an arbitrary index again that is
multiplied by 4 to yield the address x
plus 4 times i.
All right, so that's pretty
straightforward.
Let's take a look at a larger example.
Here we've defined three arrays, with
five values each.
You'll notice that the type of the arrays
is this zip dig for zip digit.
And that's a type that was defined in c
as b an integer array of five elements.
Okay, so that's just so that we have a
shorthand for defining a new data type
that then we can just use to declare
values.
So we have these three arrays of five
elements each.
They're represented in memory to look
like this and here I've shown them
actually one right after the other.
You'll notice that address is 16 36 and
56 just following each other in memory.
Of course, there's no guarantee that that
will actually happen.
That's just how I drew out this example,
okay?
So let's take a look at some address
computations on this array.
here we have a simple C procedure called
get digit, okay, it takes two arguments
in array and the index of the digit we
want in that array.
So something like uw 3 for the 3rd digit
of the uw zip code.
So the way we go about figuring out what
that value is, is we just take the name
of the array, and index it with the other
parameter, the other argument z sub dig.
In in assembly code this turns into a
very simple expression.
you'll notice here that we have just a
simple move instruction for for doing
that.
It's moving a value into the registered
eax because that will be the return value
of this procedure.
And what does it return?
Well, it's going to do an address
computation.
It's going to use what is currently an
eax and we're assuming that's the value
of the second parameter digit.
It's going to multiply that by 4, so 4
times digit.
Okay, and then add that to edx which is
the value of the first parameter, the
address of the, the first starting
address of the array Z.
Okay, so that is the address.
And then, the, the parentheses around it
imply a de-reference that we're going to
go to that location, and get the value
there and put it into eax.
Let's take a look at some other examples
of accessing these arrays.
Let's take a look at uw sub three.
What would we expect that to be?
Well, we expect to go to the uw array
here, okay.
And sub 3 means the fourth element so
it's not the nine, not the eight, not the
nine but it is the second nine at address
of 48.
Okay and in fact we can compute that by
taking the starting address of the array
36 and adding 4 times 3 the size of an
int times the index.
Okay, the value there of course is nine
and that is guaranteed to be that value,
in terms of it's organization and memory.
Let's try uw6, now of course, that is off
the end of our array.
We don't have a seven element array.
We have a five element array.
But the address computation proceeds as
before, this time we get an address of
60, which happens to have the value four
in it.
But you'll notice that it's really part
of the ucb address the ucb array.
And, of course, there is no guarantee
that that array is in the location that
I've shown it to be in.
It could have been allocated anywhere in
memory, so that's absolutely not a
guaranteed result.
In fact anytime that we go outside the
bounds of the array, we have no
guarantees as to what the value will be.
Let's do uw minus one.
That's an interesting index, but the
address computation just uses it the same
way this time we get an address of 32.
the value there is three in the cmu zip
code array and again that is not
guaranteed because it's dependent on the
arrangement of these in memory.
Similarly with cmu 15 we're now going off
to a part of memory I haven't even drawn.
So, who knows what's there.
And of course, there is no guarantee.
Alright, so the thing to remember is that
there is no bounce checking in C and the
location of each separate array in memory
is not something that is guaranteed by
this, by the run time system.
Let's go on to look at a slightly more
complex example.
In this case we have a function zd2int
which takes an array of type zip digits
and turns it into an integer.
So, for example the uw array that had
9,8,1,9,5 as the five digits would now if
given as an argument to this function,
would return the integer 98,195.
How do we do that?
Well, it's a simple loop that goes and
looks at each digit, multiplying it by 10
each time.
so that we use the digits as a ten
thousandths column, thousandths column,
hundredths column, tens column and so on.
So the way we're going to do that is by
getting that first digit.
and accumulating a result in a variable.
So you'll see in the first case when the
loop runs here with I equal to zero.
We'll add, we'll index z sub zero, the
first digit.
Okay, and we will add that to ten times
the zi, which initially is also zero.
So the result here would be zi equals 9
if we are doing the uw array.
Okay the next time around of course we'll
multiply this 9 by 10, so it'll be 90
plus the second digit.
Which will now have the index z sub 1.
And we'll see an eight here so we'll get
98.
Alright, and then the next time around
it'll will be 98 times 10 plus the next
digit, which is of course a 1 to yield
981 and so on.
You get the idea.
Alright, so how does this look in
assembly code?
What we see here is that we've had our
compiler do one important transformation,
actually a couple.
But the most significant one is that
rather than using this loop index and
doing an address computation each time,
what it's done is it's taken the starting
address of the array and then just adds
to that as we go down each element to the
array.
And by adding 1, using the construct z++,
nc to indicate an increment by one, it's
essentially adding four each time because
they're ints.
Okay, and so the other transformation is
it's taken our four loop, and transformed
it into a do loop with the test at the
end to see whether we've reached the last
address of the array.
Okay, so that's our our transform C code.
And now, we can go on to see the actual
assembly code that's generated.
So, I've listed here the, where the
variables are stored, in which registers
and the computations that need to be
done.
And you'll notice that the main
computation is 10 times a digit plus the
next digit 10 times the current
accumulated value plus the next digit and
that's been a factored a little bit to
look like that instead and let's see why
that's happening.
Alright, so our first instruction
basically just clears a a register so it
can represent that starting accumulation
of our result.
then the next thing we do is compute the
end of the array since we had a fixed
size loop that went around five times we
know that the end of the array will be
four elements down so we just do z plus 4
as the end of the address.
And that gets translated to of course, 16
because of the size of int.
All right, the next thing that we do is
start doing some arithmetic to do our
computation in the body of the loop.
And you'll notice here that first we use
an effective address instruction to
compute 5 times zi.
That's convenient to do, because we can
do a power of 2 times a value plus the
value again, which gives us 5 times the
original value, alright?
And then we get the value of the next
digit by differencing the address Z.
remember that star Z is represented by
those parentheses around the value in the
register ecx.
So it says to dereference that as an
address.
And then we add those together but
further multiplying that five times zi by
another two.
And the way we do that is 2 times that
value plus the star z and take that
result and put it into eax.
Okay, so we're ready for the next time
around the loop.
Of course, we then do the test of the
loop comparing whether we're at the last
address.
And if we are, we continue on, and if
we're not, we loop back around to this
label L59.
All right, so that is the assembly code,
implementation of, of this.