[MUSIC].
Before we close this section on
procedures and stacks, let's talk about
how things change when we go to the
64-bit architecture that is popular today.
So the calling convention in x86-64
architectures is a little different, and
that's because of the doubling of the
number of general purpose registers.
There are so many more registers available
on the 64-bit architectures that we can
decrease our use of the stack
and make better use of the registers.
So, we're going to store arguments in
registers and we're going to store
temporary variables in registers.
Of course, we could always run out of
registers and we'll fall back to the way
we did things in the 32-bit architectures
we've just seen.
But for the most part, we're going to try
to make use of those registers as best we
can and avoid the use of memory.
So let's take a look at the registers in
the 64-bit architecture again.
There are now 16 general purpose
registers, and they are 8 bytes each
rather than 4 bytes because we have
8-byte quad words instead of 4-byte words.
We're also going to extend our
callee-saved and caller-saved conventions.
And you can see the registers that are
annotated here in green being the
callee-saved registers.
And in yellow are the caller-saved
registers.
Also, we're going to use six registers, in
these locations, for passing arguments.
So we can use up to six registers
to take care of six arguments for a
procedure.
Now, if we have more than six arguments,
we'll have to go back to using the stack.
But for the most part, we'll use these
and most procedures have just a couple of
parameters.
So most of the time, we won't have to use
the stack.
We're still going to use rax for the
return value of a procedure and we're
still going to use rsp for the stack pointer.
Okay, let's revisit the swap function and
look at it in both the 32-bit
architecture, which we see here on the
left and is the one we've seen so far,
and a swap implementation using 64 bits.
The difference between these two cases
is that arguments are passed in
registers.
So the first argument is now in the
register rdi, the second in the register
rsi, and that's where we find the 64-bit
pointers, those two arguments to swap.
So we don't have to get them off the
stack.
So the only stack operation we really
need is the return,
which gets the return address from
the stack and jumps to that location when
we're done.
By avoiding the stack and holding all
the local information in registers, we can
make execution much faster, first because
we have fewer instructions.
As you can see, we have quite a few fewer
instructions for the 64-bit version.
But also because we're not going to
memory.
The stack is stored in memory,
and that is slower to get to than the
general purpose registers of the CPU.
We'll learn more about that later when we
talk about the memory system.
But for now, suffice it to say that it's a
lot faster to go to registers, okay.
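As a rough illustration of the swap function being discussed, here is a minimal sketch in C. The element type (long) and the local names t0 and t1 are assumptions, since the slide is not part of this text. The assembly in the comment is what a compiler might plausibly emit for x86-64, not a quote from the lecture.

```c
/* Swap the two values that xp and yp point to.
   Assumption: the elements are longs; the lecture only says the
   arguments are 64-bit pointers.

   A compiler for x86-64 might emit something like:
       movq (%rdi), %rdx    # t0 = *xp   (xp arrives in rdi)
       movq (%rsi), %rax    # t1 = *yp   (yp arrives in rsi)
       movq %rax, (%rdi)    # *xp = t1
       movq %rdx, (%rsi)    # *yp = t0
       ret                  # the only stack operation
*/
void swap(long *xp, long *yp) {
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
```

Calling swap(&a, &b) with a = 1 and b = 2 leaves a = 2 and b = 1, and nothing but the return address needs to touch the stack.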
So the highlights of this then for the
64-bit case, are that arguments up to the
first six are stored in registers.
It's faster to get to those values there
than if they were in memory in the stack.
Local variables can also be placed in
registers if there's room and we don't
have too many of them.
Otherwise, we will have to go back to the
stack.
We have a callq instruction now instead
of a call instruction, which puts a
64-bit return address on the stack.
And of course it will have to
decrement the stack pointer by 8, rather
than 4, because we're putting 8 bytes on
the stack.
We have also eliminated the use of the
frame pointer. Remember ebp, our base
frame pointer?
We're not going to use that anymore, and
we're going to make all references
relative to the stack pointer, so we
won't have to keep track of two registers
pointing to the stack, but only one.
And ebp, or now its 64-bit version, rbp, is
going to be available as a general
purpose register.
And the reason that works
is that we can access memory up to 128
bytes beyond where rsp is pointing, that
is, below it, without having to use
multiple instructions.
We can do that directly.
This is called the red zone, okay?
And so we can store these temporary
variables on the stack very easily and
access them quickly.
Registers are still designated as
caller-saved or callee-saved, however,
slightly differently than they were
before.
Okay, so ideally the 64 bit architecture
has no stack frame at all, except for the
return address.
So we've now shrunk the stack frame down
to just one piece of information, namely
the return address that is
placed on the stack.
This makes things a lot simpler for
manipulating the stack and
making the frames that we need.
However, we always have to fall back to
the 32-bit architecture conventions if we
can't fit things in registers.
And that's why we bothered to show you
all that 32-bit stack convention even
though we're mostly running on 64-bit
architectures these days.
Because when we have too many local
variables we have to go to the stack.
When local variables are more complex
data structures like arrays or structs,
we'll have to put things on the stack.
When we take the address of a local
variable, we'll have to put it on the
stack, because we can't have an address
of a register. We have to have an address
of a memory location,
so we will have to put it on the stack.
And whenever we need more than six
arguments to a function, we'll need the
stack again.
And of course saving registers away
will also potentially require us to use
the stack.
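To make those cases a little more concrete, here is a small C sketch. The function names bump and sum_with_locals are hypothetical and chosen only for illustration. It contains an array and an address-taken local, both of which generally need stack space, next to a plain scalar that can stay in a register. An optimizing compiler that inlines the helper may still manage to avoid the stack.

```c
/* Hypothetical helper that writes through a pointer, so its target
   must have a memory address. */
static void bump(long *p) {
    *p += 1;
}

long sum_with_locals(void) {
    long counter = 0;              /* address is taken below: needs memory */
    long table[4] = {1, 2, 3, 4};  /* array: indexed by address, needs memory */
    long total = 0;                /* plain scalar: a register candidate */

    for (int i = 0; i < 4; i++) {
        total += table[i];
        bump(&counter);            /* passes the address of a local */
    }
    return total + counter;        /* 10 + 4 */
}
```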
So, we still need stack frames and it's
still important to understand the general
case.
But keep in mind that most of the time
on 64-bit architectures, that stack frame
is tiny.
It's just a return address.
All right, let's take a look at an
example that illustrates this.
We're going to have this function called
call_proc, which has four
local variables of different sizes,
then does a call to another
procedure called proc, and then finally
returns a value that it computes
according to this expression.
Okay.
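The function being described might look like the following sketch in C. The local types, the initial values 1 through 4, the argument order and the returned expression follow the walkthrough. The body of proc is purely an assumption, since the lecture never shows it. Any body that uses the eight arguments would do.

```c
/* Assumed body: the lecture only says that proc takes eight
   arguments. Here it adds each value into the location its
   companion pointer refers to. */
void proc(long a1, long *a1p, int a2, int *a2p,
          short a3, short *a3p, char a4, char *a4p) {
    *a1p += a1;
    *a2p += a2;
    *a3p += a3;
    *a4p += a4;
}

long call_proc(void) {
    long  x1 = 1;   /* 8 bytes */
    int   x2 = 2;   /* 4 bytes */
    short x3 = 3;   /* 2 bytes */
    char  x4 = 4;   /* 1 byte  */

    /* Eight arguments: six go in registers, two on the stack. */
    proc(x1, &x1, x2, &x2, x3, &x3, x4, &x4);

    return (x1 + x2) * (x3 - x4);
}
```

With this assumed proc, the locals become 2, 4, 6 and 8, so call_proc returns (2 + 4) * (6 - 8), which is -12. Because the address of every local is passed to proc, all four of them need a home on the stack.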
So the way call_proc is going to
start, of course, is with its stack pointer
pointing to where it has to return in
whatever procedure called it.
That's the top of the stack.
And the first thing that it's going to do
is allocate 32 bytes on the stack for the
local variables that it will need.
And you'll notice that we adjust the
stack pointer down by 32, to
down here. 32 is 4 times 8 bytes, so
four 8-byte words.
That's why I've drawn it as four
horizontal sections of memory.
Each of those is 8 bytes.
And we're going to allocate the four
temporary variables, x1 through x4, to
these areas here.
And you'll notice that x1 occupies 8
bytes.
It's a long integer.
X2 is just a regular integer, only needs
four bytes.
X3 is a short int which only needs two,
and x4 is a single byte, okay.
Now, why did we allocate two more words?
Well, we're going to
need those because this procedure call
here has eight arguments, two more than the
six we can pass in registers.
So we're going to need two places to
put those two extra arguments for our
procedure call.
All right.
So, let's see what the first
instructions of the function are.
As I mentioned, we adjusted the stack
pointer by 32 to create that space.
And then we moved four values into
different locations on the stack.
And you'll notice that we used offsets from
the current stack pointer to find the
right places to put them.
We put the 8-byte quantity, the quad word
that was value 1 for x1, at 16 plus the
stack pointer. That's at this location.
Then we moved a long word, value 2, to 24
plus the stack pointer.
That's at this location.
Then a word, or rather 16 bits, value 3,
at 28 plus the stack pointer.
Well, 24 was here, four more over puts us
here at x3.
And then finally, a single byte of value
4 at 31 plus the stack pointer.
That's 24 here, and then 7 over puts
us where we've labelled x4 as
that single byte, okay.
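One way to double-check those offsets is to model the 32-byte frame as a C struct, with rsp pointing at its first byte. This is only a sketch: the struct and field names (call_proc_frame, arg7, arg8, unused) are hypothetical, and the two low slots are the ones set aside for the seventh and eighth arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the 32 bytes allocated by call_proc, lowest address
   first. Offsets are relative to rsp after the subtract of 32. */
struct call_proc_frame {
    int64_t arg7;    /* offset  0: slot for argument 7 */
    int64_t arg8;    /* offset  8: slot for argument 8 */
    int64_t x1;      /* offset 16: 8-byte long         */
    int32_t x2;      /* offset 24: 4-byte int          */
    int16_t x3;      /* offset 28: 2-byte short        */
    char    unused;  /* offset 30: not used            */
    char    x4;      /* offset 31: single byte         */
};
```

With natural alignment, offsetof(struct call_proc_frame, x4) evaluates to 31 and sizeof(struct call_proc_frame) to 32, matching the 16, 24, 28 and 31 offsets used by the move instructions.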
Let's move on now to setting up the
parameters, the arguments for calling the
function proc.
Okay, that's the next part of this
procedure.
And what we see here is a set of
instructions that set things up for all
those arguments.
Now, arguments are passed in a
particular order in the registers.
The first argument has to go into rdi.
The second into rsi.
The third into rdx.
Then rcx, r8, r9, until we've got
six arguments.
The rest are going to go on the stack.
Okay?
And that means two more will have to go
on the stack
in this case, because we have eight
arguments.
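The placement rule just described can be written down as a tiny lookup in C. This is a sketch only: the helper names arg_register and stack_arg_offset are hypothetical. It covers integer and pointer arguments, and it assumes each stack argument takes one 8-byte slot starting at rsp just before the call, as in this example.

```c
#include <stddef.h>

/* Registers for the first six integer or pointer arguments, in order. */
static const char *const ARG_REGS[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

enum { NUM_ARG_REGS = sizeof ARG_REGS / sizeof ARG_REGS[0] };

/* Register name for the n-th argument (1-based), or NULL if that
   argument has to go on the stack. */
const char *arg_register(int n) {
    if (n < 1 || n > NUM_ARG_REGS) {
        return NULL;
    }
    return ARG_REGS[n - 1];
}

/* Offset from rsp (just before the call) of the n-th argument,
   or -1 if that argument travels in a register. */
long stack_arg_offset(int n) {
    if (n <= NUM_ARG_REGS) {
        return -1;
    }
    return (long)(n - NUM_ARG_REGS - 1) * 8;
}
```

For the call to proc, arguments 1 through 6 map to rdi through r9, argument 7 lands at offset 0 from rsp, and argument 8 at offset 8.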
So let's take a look at the first
instruction.
It moves a quad word with value 1 to rdi.
That's the equivalent of putting that x1
there as the first argument.
Then we are going to need the address of
x1.
Well, the address of x1 on the stack is
here at 16 from rsp.
So you'll notice that we'll calculate
that effective address and put it into
rsi, the second argument.
Then we'll put a value 2 into edx for the
third argument and the address of that
value, which is 24 plus rsp, into rcx,
that address being for x2.
Then we will move a 3 into r8d, which is
how we refer to just the low-order 4
bytes of r8,
and then put the
address of x3 into r9, the sixth argument.
And 28 plus rsp is the address of this
location right here.
Okay.
Lastly, we will move 4 into where
rsp is pointing right now.
Remember, the parentheses are the
dereference for that, and that's argument
number 7, put
onto the stack at that location.
And then the last argument, argument 8, is
the address of x4.
And the address of x4 can be computed by
doing 31 plus rsp.
We're going to put that in rax
temporarily, just so that we can then
move it to 8 plus rsp, the slot for
the 8th argument.
Okay, so now we've set up all 8 arguments,
6 in registers, 2 on the stack.
And we're ready to call proc.
At this point, of course, a new return
address gets pushed onto the stack that
will help us come back to this point in
this procedure, call_proc, after we're done
with the proc call.
Okay, once that's completed, we will be
back here and now have to do that
computation to figure out the return
value.
So how do we do that computation?
Well, what we're going to do is make sure
we carefully get the values
of our temporary variables in this
procedure
and put them into registers with
appropriate sign extension.
So, we're going to be using these
interesting instructions here that say
move, where the s stands for extending
the sign bit, of the word into the long.
All right, so we're taking a 16-bit
quantity, the word, and sign extending it
to 32 bits, the long.
Okay?
That's what the l refers to.
And the s says sign extend.
Another option is z, for just putting 0s
there in the other 16 bits.
But this says, do the sign extension.
And the result goes into the 32-bit
register, eax.
We'll do the same thing now for
a byte extended to a long,
to get the value of x4
and put that in edx,
sign extended.
And then subtract that from eax.
So this will have computed x3 minus x4.
Now of course that's a 32-bit quantity and
we're going to have to multiply it with a
64-bit quantity.
So we're going to use some more sign
extension using the cltq instruction.
That sign extends the 32-bit eax register
to 64 bits.
The next part computes, as you can
imagine, x1 plus x2, and it does that by
getting the value of x2, moving it into
rdx, and sign extending it to 64 bits, in
this case from a long to a quad.
And then it adds the already 64-bit value
of x1
to rdx as well, so here we will have x1
plus x2 as the result.
Finally, we take those two registers rax
and rdx and do a multiply instruction to
compute the final result.
Okay.
The result is placed into rax, ready to
be returned.
Remember, we put the return value in the
rax register.
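The sign-extension steps above can be mirrored one for one in C, where a cast from a narrower signed type to a wider one performs the same extension. This is a sketch under assumptions: the function name compute_result and the variable names (borrowed from the registers) are illustrative, and the mnemonics in the comments are the usual AT&T spellings rather than a copy of the slide.

```c
#include <stdint.h>

/* Computes (x1 + x2) * (x3 - x4), one statement per instruction
   described in the walkthrough. */
int64_t compute_result(int64_t x1, int32_t x2, int16_t x3, int8_t x4) {
    int32_t eax = (int32_t)x3;    /* movswl: word -> long, sign extended */
    int32_t edx = (int32_t)x4;    /* movsbl: byte -> long, sign extended */
    eax -= edx;                   /* subl:   x3 - x4 as a 32-bit value   */
    int64_t rax = (int64_t)eax;   /* cltq:   long -> quad, sign extended */
    int64_t rdx = (int64_t)x2;    /* movslq: long -> quad, sign extended */
    rdx += x1;                    /* addq:   x1 + x2                     */
    rax *= rdx;                   /* imulq:  the final product           */
    return rax;
}
```

Sign extension matters as soon as a value is negative: compute_result(1, 2, -3, 4) gives (1 + 2) * (-3 - 4), which is -21. Zero extending the 16-bit -3 would instead have turned it into 65533.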
The last thing we need to do before
executing our return statement is clean
up the stack and get rid of the space we
allocated
while we were in this procedure. To
do that, we add 32 to the stack pointer,
the opposite of the subtract 32 that we
did at the beginning.
Okay, so now we are back to a
stack that looks just like it did before.
And we are ready to execute the return
instruction that will take us back to
whatever called
call_proc in the first place.
So, to summarize, the 64-bit
architectures make heavy use of
registers because they're faster than
using the stack in memory.
We use them for parameter passing and we
use them for temporary variables.
Okay, so there's minimal use of the
stack.
Oftentimes, actually, we don't
use the stack at all except for the
return address of the function.
But when we need the
space there, either for those arguments
or for more temporary variables, we
allocate and deallocate the entire frame
at one time.
It's just faster than doing multiple
pushes and pops.
We also don't bother with a
frame pointer anymore and address
everything relative to the stack pointer,
as we saw in that previous example, okay.
This also creates a lot more room for
compiler optimizations that can play with
registers and how we best
make use of them,
so we don't have collisions that
would cause us to have to save registers
and so on.
All right?
So that ends this section, and I
hope it provided a good overview of
procedure call conventions,
both the 32-bit and 64-bit.
Remember that, although we make
minimal use of stack frames in the 64-bit
architecture,
we often have to fall back to the general
case, which we saw with the 32-bit
conventions.