plik

It's time to move on to another section of our course. this time we're going to talk about arrays and structs. Namely, how we represent some more interesting data structures in our memory. so we're going to move beyond just simple integers and floats and characters but talk about arrays of values and and complex structures that involve multiple elements. that's what we call structs in the C language. Okay, so to outline this section. we're going to start off on arrays and how we access them in memory. the various elements of the arrays. And then we'll talk about multi-dimensional arrays, as well as multi-level arrays and the differences between the two. then we'll move on to talk about other kinds of data structures and memory. And how they're aligned to to the appropriate addresses to to optimize memory access. Okay, so let's get started. We'll begin with arrays, and the basic idea behind arrays is that they have elements of a certain type. In this case, using the, the T of course, the array has a name A. And then it has a number of elements or a length. And this, this leads to a contiguously allocated region of memory that is of size N times the size of that data type. So if these were ints it would be four times n in bytes. because ints are 4 bytes each. So let's take a look at some examples. Here's a character array called string that has 12 elements. And you'll notice that each is one byte. And they're allocated one right after the other in memory. here we have an array of ints with five elements called val and of course, it take up 20 bytes in memory from some starting address. a double array of course the elements are 8 bytes each. But they're always contiguous. Okay? Always right next to each other in memory. And of course for pointers we have a difference depending on, whether we're on a 32 bit or 64 bit architecture. In in one case the pointers are eight 4 bytes, in the other case they are 8 bytes, the size of the addresses. Okay, so continuing with this let's take a look at a little example here we have a five element array called val. And it has the following integers stored in it, 9,8,1,9,5. That happens to be the zip code for the University of Washington. Okay let's take a look at how we might reference these values. We'll start off with an example of val sub four. so we are actually looking here for the fifth element of the array, because the indices start at zero, okay. So although, here we put a five for the total number, the actual index of the last element of the array is n minus 1 val sub 4. And its type of course, is going to be an int because these are all ints in this array. So, the value in this case is going to be the number five that last five that we see at the address x plus 16. the next example, just using the word val, the name of the array, is going to give us an address, an address to the starting point of the array. So that'll be the value x, wherever we happen to start the array in memory. Val plus one is also a pointer because val is an address when we add 1 to it, we do pointer arithmetic. these are of type int, so we will add 4 to x, and so that value is x plus 4. continuing on we have the address of the second element of the array. The second index of the array rather, or the third element, and that would be x plus 8, okay? again because we start the indices at zero. Val five, now val five is not really in our array, it's outside the bounds of our array. The last index should have been four. But C doesn't care about that, it'll just do the address computation the same old way and just go to that memory location. So in this case, it'll take the starting address of val multiplied at 5 by 4 and add 20 to that starting address. So it'll really go to whatever memory location is after the end of the array and who knows what is there. So we have an unknown value. It's totally dependent on whatever is in that memory location. Okay let's take a look at the next example, we're doing val plus one inside the parenthesis, that's again that address x plus 4. And then, we're differencing it with the star operator, so we're saying what is the value at x plus 4. And of course the value at x plus 4 is the integer 8. And then finally our last example val plus i, an arbitrary index again that is multiplied by 4 to yield the address x plus 4 times i. All right, so that's pretty straightforward. Let's take a look at a larger example. Here we've defined three arrays, with five values each. You'll notice that the type of the arrays is this zip dig for zip digit. And that's a type that was defined in c as b an integer array of five elements. Okay, so that's just so that we have a shorthand for defining a new data type that then we can just use to declare values. So we have these three arrays of five elements each. They're represented in memory to look like this and here I've shown them actually one right after the other. You'll notice that address is 16 36 and 56 just following each other in memory. Of course, there's no guarantee that that will actually happen. That's just how I drew out this example, okay? So let's take a look at some address computations on this array. here we have a simple C procedure called get digit, okay, it takes two arguments in array and the index of the digit we want in that array. So something like uw 3 for the 3rd digit of the uw zip code. So the way we go about figuring out what that value is, is we just take the name of the array, and index it with the other parameter, the other argument z sub dig. In in assembly code this turns into a very simple expression. you'll notice here that we have just a simple move instruction for for doing that. It's moving a value into the registered eax because that will be the return value of this procedure. And what does it return? Well, it's going to do an address computation. It's going to use what is currently an eax and we're assuming that's the value of the second parameter digit. It's going to multiply that by 4, so 4 times digit. Okay, and then add that to edx which is the value of the first parameter, the address of the, the first starting address of the array Z. Okay, so that is the address. And then, the, the parentheses around it imply a de-reference that we're going to go to that location, and get the value there and put it into eax. Let's take a look at some other examples of accessing these arrays. Let's take a look at uw sub three. What would we expect that to be? Well, we expect to go to the uw array here, okay. And sub 3 means the fourth element so it's not the nine, not the eight, not the nine but it is the second nine at address of 48. Okay and in fact we can compute that by taking the starting address of the array 36 and adding 4 times 3 the size of an int times the index. Okay, the value there of course is nine and that is guaranteed to be that value, in terms of it's organization and memory. Let's try uw6, now of course, that is off the end of our array. We don't have a seven element array. We have a five element array. But the address computation proceeds as before, this time we get an address of 60, which happens to have the value four in it. But you'll notice that it's really part of the ucb address the ucb array. And, of course, there is no guarantee that that array is in the location that I've shown it to be in. It could have been allocated anywhere in memory, so that's absolutely not a guaranteed result. In fact anytime that we go outside the bounds of the array, we have no guarantees as to what the value will be. Let's do uw minus one. That's an interesting index, but the address computation just uses it the same way this time we get an address of 32. the value there is three in the cmu zip code array and again that is not guaranteed because it's dependent on the arrangement of these in memory. Similarly with cmu 15 we're now going off to a part of memory I haven't even drawn. So, who knows what's there. And of course, there is no guarantee. Alright, so the thing to remember is that there is no bounce checking in C and the location of each separate array in memory is not something that is guaranteed by this, by the run time system. Let's go on to look at a slightly more complex example. In this case we have a function zd2int which takes an array of type zip digits and turns it into an integer. So, for example the uw array that had 9,8,1,9,5 as the five digits would now if given as an argument to this function, would return the integer 98,195. How do we do that? Well, it's a simple loop that goes and looks at each digit, multiplying it by 10 each time. so that we use the digits as a ten thousandths column, thousandths column, hundredths column, tens column and so on. So the way we're going to do that is by getting that first digit. and accumulating a result in a variable. So you'll see in the first case when the loop runs here with I equal to zero. We'll add, we'll index z sub zero, the first digit. Okay, and we will add that to ten times the zi, which initially is also zero. So the result here would be zi equals 9 if we are doing the uw array. Okay the next time around of course we'll multiply this 9 by 10, so it'll be 90 plus the second digit. Which will now have the index z sub 1. And we'll see an eight here so we'll get 98. Alright, and then the next time around it'll will be 98 times 10 plus the next digit, which is of course a 1 to yield 981 and so on. You get the idea. Alright, so how does this look in assembly code? What we see here is that we've had our compiler do one important transformation, actually a couple. But the most significant one is that rather than using this loop index and doing an address computation each time, what it's done is it's taken the starting address of the array and then just adds to that as we go down each element to the array. And by adding 1, using the construct z++, nc to indicate an increment by one, it's essentially adding four each time because they're ints. Okay, and so the other transformation is it's taken our four loop, and transformed it into a do loop with the test at the end to see whether we've reached the last address of the array. Okay, so that's our our transform C code. And now, we can go on to see the actual assembly code that's generated. So, I've listed here the, where the variables are stored, in which registers and the computations that need to be done. And you'll notice that the main computation is 10 times a digit plus the next digit 10 times the current accumulated value plus the next digit and that's been a factored a little bit to look like that instead and let's see why that's happening. Alright, so our first instruction basically just clears a a register so it can represent that starting accumulation of our result. then the next thing we do is compute the end of the array since we had a fixed size loop that went around five times we know that the end of the array will be four elements down so we just do z plus 4 as the end of the address. And that gets translated to of course, 16 because of the size of int. All right, the next thing that we do is start doing some arithmetic to do our computation in the body of the loop. And you'll notice here that first we use an effective address instruction to compute 5 times zi. That's convenient to do, because we can do a power of 2 times a value plus the value again, which gives us 5 times the original value, alright? And then we get the value of the next digit by differencing the address Z. remember that star Z is represented by those parentheses around the value in the register ecx. So it says to dereference that as an address. And then we add those together but further multiplying that five times zi by another two. And the way we do that is 2 times that value plus the star z and take that result and put it into eax. Okay, so we're ready for the next time around the loop. Of course, we then do the test of the loop comparing whether we're at the last address. And if we are, we continue on, and if we're not, we loop back around to this label L59. All right, so that is the assembly code, implementation of, of this.

Wyszukiwarka