[MUSIC].
Hello again, everyone. Let's continue
our section on the basics of
architecture and machine programming.
First, let's start with a definition.
So what is architecture?
Well, architecture, which is the same
thing as instruction set architecture
(ISA), is the part of the processor
design that one needs to understand to
write assembly code.
'Kay.
Remember, it's the hardware/software
interface, really.
And another way of thinking about it,
it's what's directly visible to software.
If it's visible to software it means that
you have to know about it when writing
code.
'Kay.
Now the microarchitecture, on the other
hand, is the details of how the
architecture is implemented, that is,
how the instructions of that
architecture are implemented.
'Kay, so let's just think through some
examples here.
As we saw briefly, and as we are going
to see in more detail later in the
course, a cache is a very fast piece of
memory that lies inside the processor.
It happens to hold recently accessed
data.
'Kay.
So now, is cache size part of the
architecture?
Well, it's not directly visible to
software.
The cache is used by software, but its
size is not directly visible, so it's
not part of the architecture.
How about core frequency?
Well, core frequency is also not
something that software knows explicitly.
Software doesn't care all that much what
the core frequency is.
The user does, because a higher
frequency makes the processor faster.
But when writing assembly code, you
don't know it, and there's nothing in
the ISA that refers to it.
So core frequency is also part of the
microarchitecture, or the
implementation.
'Kay?
And the number of registers? Well, the
number of registers is something that you
need to know, because you're going to use
registers explicitly in your assembly
code.
So the number of registers is part of the
ISA, and therefore really part of the
architecture.
'Kay.
So now, with that in mind, let's look at
the assembly programmer's view of the
system, 'kay?
Here we have the CPU, the central
processing unit, and there are three
important things that an assembly
programmer needs to know.
First, there's this special register
called the program counter, 'kay, and the
program counter holds the address of the
instruction that is going to execute
next.
Instructions execute one after the
other, and the program counter tells
which instruction is going to be
executed next.
'Kay?
By the way, it's also called EIP in
IA-32: IP stands for instruction
pointer, and the E stands for extended.
In x86-64 it's called RIP, where IP also
means instruction pointer.
'Kay.
These are just names for the program
counter.
Now, there's also a bunch of registers
that are used by the assembly code
directly.
These registers hold data.
They're essentially a piece of very,
very fast memory that is addressed
explicitly by your assembly code.
'Kay?
And now, we also have condition codes.
The condition codes essentially tell
you the status of some operation that
was just executed by an instruction.
For example, they tell you whether or
not there was an overflow.
'Kay.
They also tell you the outcome of a
comparison.
Say you're going to compare whether one
register holds a lower value than
another register.
The result of this comparison is set as
a condition code.
'Kay.
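To make the condition codes a bit more concrete, here is a minimal C sketch
of how they could be modeled. The struct and function names are
hypothetical; the four flags (zero, sign, carry, overflow) are the usual
x86 ones, but this is an illustration, not how the hardware is built.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of four x86-style condition codes. */
typedef struct {
    bool zf; /* zero flag: the result was 0 */
    bool sf; /* sign flag: the top bit of the result is set */
    bool cf; /* carry flag: the operation overflowed as unsigned */
    bool of; /* overflow flag: the operation overflowed as signed */
} cond_codes;

/* Flags that a 32-bit addition a + b would set. */
cond_codes flags_after_add(uint32_t a, uint32_t b)
{
    uint32_t r = a + b; /* wraps modulo 2^32 */
    cond_codes cc;
    cc.zf = (r == 0);
    cc.sf = (r >> 31) != 0;
    cc.cf = (r < a);
    /* signed overflow: both operands differ in sign from the result */
    cc.of = (((a ^ r) & (b ^ r)) >> 31) != 0;
    return cc;
}

/* Flags that a comparison of a with b would set (computed like a - b). */
cond_codes flags_after_cmp(uint32_t a, uint32_t b)
{
    uint32_t r = a - b; /* wraps modulo 2^32 */
    cond_codes cc;
    cc.zf = (r == 0);
    cc.sf = (r >> 31) != 0;
    cc.cf = (a < b); /* a borrow was needed */
    /* signed overflow: operands differ in sign and the result's sign differs from a */
    cc.of = (((a ^ b) & (a ^ r)) >> 31) != 0;
    return cc;
}

/* "a < b" for signed values is read off the flags as SF != OF. */
bool signed_less(cond_codes cc)
{
    return cc.sf != cc.of;
}
```

For instance, flags_after_add(0x7fffffff, 1) reports a signed overflow but
no carry, and signed_less(flags_after_cmp(3, 5)) is true.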
So now, on this side here we have
memory.
And memory, well, memory is meant to
hold data and code.
That includes the operating system as
well.
Also, there's a special data structure
called the stack, which we're going to
see later in the class.
It's very, very important for program
execution.
Now, the bottom line here is this: here
we have the CPU, here we have memory,
and the CPU sends addresses to memory
when it wants to read data into
registers.
The memory gets the data and sends it
back to the CPU, which puts it in
registers or does an operation with it.
'Kay?
Now, the CPU can also send data to
memory.
It can get data from registers and say,
hey, memory, store this data for me.
'Kay?
Now another thing that's very, very
important is that the instructions
themselves, also known as object code,
come from memory.
They're stored in memory, so that means
that the processor has to fetch these
instructions from memory in order to
execute them, 'kay?
And memory is just an array of bytes,
and each byte has an address. It's
essentially a very large table of
bytes, where you can access each of
them directly.
'Kay.
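The "table of bytes" picture can be sketched in C as a plain byte array
indexed by address. Everything here (the array name, its tiny size, the
helper names) is made up for illustration, and there is no bounds
checking. The byte order shown is the little-endian order that x86 uses.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stddef.h>
#include <stdint.h>

/* A toy "memory": a flat table of bytes, each with its own address (index). */
enum { MEM_SIZE = 64 };
static uint8_t memory[MEM_SIZE];

/* Store a 4-byte value at addr, least significant byte first. */
void store32(size_t addr, uint32_t value)
{
    for (size_t i = 0; i < 4; i++)
        memory[addr + i] = (uint8_t)(value >> (8 * i));
}

/* Load the 4 bytes starting at addr back into one value. */
uint32_t load32(size_t addr)
{
    uint32_t value = 0;
    for (size_t i = 0; i < 4; i++)
        value |= (uint32_t)memory[addr + i] << (8 * i);
    return value;
}

/* Load the single byte stored at addr. */
uint8_t load8(size_t addr)
{
    return memory[addr];
}
```

After store32(8, 0x12345678), the byte at address 8 is 0x78 and the byte
at address 11 is 0x12, while load32(8) gives back the whole value.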
So now let's look at how we turn
high-level C code into object code,
'kay, into machine code.
So here are the steps.
In this example here we have two source
code files. These are .c files that hold
code, or text, written in the C
language, 'kay.
Now you are going to use a compiler. GCC
is a popular open source compiler that
we're going to use in this class, and
it's used by a large number of people.
And what this compiler is doing is
taking two C files as input, compiling
them, and producing a single output
file. That is what the -o option is
doing here: it sets the output, so the
compiler produces a binary file p that
holds the resulting machine code from
the source code.
So now, going through the steps, we
start with what we call the text.
That's the text of your program, the
source code.
You run it through a compiler.
'Kay.
And if you use the -S option, you're
going to produce .s files that hold the
assembly code generated by the
compiler.
'Kay?
Now you're going to use the assembler.
We saw this before.
The assembler takes assembly code and
produces binary, or machine, code,
'kay?
And what it's doing in this example
here is getting p1.s and p2.s and
translating them into binary machine
code equivalent to the input assembly
code.
'kay.
So, in the final step there is something
called the linker. What it does is take
the object code, which has some symbolic
names in it, and link it with
libraries.
And what are libraries?
Libraries are essentially collections of
precompiled code that's available for
use in user code.
For example, things like printing
values on the screen, accessing
operating system services, and so on.
This all resides in libraries.
And the linker takes the object
program, links it with the libraries,
and produces a final executable
program that you can type and run at
the command prompt.
'Kay?
And in this case here, we're also using
an optimization.
This -O1 here is just telling the
compiler, hey, apply some optimization
to our code when generating it.
'Kay?
So let's see one example of translation
from C to assembly.
Here we have some C code, a very simple
function called sum.
It takes two integers, x and y, as
input, adds them, puts the result in a
variable called t, and then returns t.
'Kay?
So it means when you execute sum and
pass parameters, it's going to return
the sum of those two values.
And sum is the name of the function, so
when we generate our assembly code,
what we have here is the name of the
function and a colon, saying that the
function starts there.
We're going to see this in more detail,
but, you know, this instruction here is
moving data between two registers.
This one is reading data from memory,
more specifically it's reading one of
the parameters.
And this is where we're actually doing
the addition with the other parameter.
And now the result is in place, and
this return instruction here just
returns from the execution of your
function.
'Kay, it goes back to the caller of the
function.
'Kay.
And the way you would obtain this, as I
mentioned briefly before, is with the
compiler: -S tells the compiler to
produce assembly code.
When you give it a .c file, it is going
to produce an equivalent file, now a
.s file, to hold the assembly code.
Great.
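For reference, here is the sum function as described, written out as a
small C file. The file name sum.c and the commands in the comment are
assumptions for illustration. The listing in the comment is a sketch of
32-bit output that is consistent with the walk-through (a label, a move
between registers, a load of one parameter, an add, a return); what your
compiler actually emits depends on its version, target, and flags.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */

/* sum.c (hypothetical file name): add two integers and return the result. */
int sum(int x, int y)
{
    int t = x + y;
    return t;
}

/*
 * Assumed commands:
 *   gcc -O1 -S sum.c         write the generated assembly to sum.s
 *   gcc -O1 -m32 -S sum.c    the same for 32-bit x86, if the toolchain supports it
 *
 * Sketch of a 32-bit listing matching the description (not guaranteed output):
 *
 *   sum:
 *       pushl %ebp
 *       movl  %esp, %ebp
 *       movl  12(%ebp), %eax
 *       addl  8(%ebp), %eax
 *       movl  %ebp, %esp
 *       popl  %ebp
 *       ret
 */
```

Calling sum(2, 3) returns 5, whichever instructions the compiler chooses.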
So, now, there are three basic kinds of
assembly instructions, 'kay?
First, there are instructions that
perform arithmetic, things like
addition, subtraction, and so on, both
on register data and on memory data.
The second type is instructions that
move data between memory and registers.
They can load data from memory into a
register, 'kay.
There are also instructions that get
data from a register and store it into
memory.
'Kay, that's the second category: data
transfer between memory and registers.
And finally, the third basic category
of instructions is ones that transfer
control.
And control here means which
instruction is being executed at any
given time.
So the flow of control is the flow of
instructions that are being executed.
'Kay?
This is when you do a jump.
Essentially, with a jump, you're
executing your program here, and then
you can just jump to a different part
of your program, and you can jump back
and continue executing.
'Kay?
And this is very useful, because when
you have something like an if in your
program, the if is actually making a
decision of whether or not you should
jump to a different part of your code.
OKay.
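One way to see an if as a decision about jumping is to write the same
function twice in C, once normally and once with goto standing in for the
jumps. The function names are hypothetical, and the goto version is only
a sketch of the kind of layout a compiler might choose.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */

/* Ordinary C: the if decides which statement runs. */
int abs_diff(int x, int y)
{
    int result;
    if (x > y)
        result = x - y;
    else
        result = y - x;
    return result;
}

/* Same logic with explicit jumps: a comparison, a conditional jump,
   and an unconditional jump over the part that should be skipped. */
int abs_diff_goto(int x, int y)
{
    int result;
    if (x <= y)          /* compare, then jump if the test holds */
        goto else_part;
    result = x - y;      /* fall through: the "then" part */
    goto done;           /* jump over the "else" part */
else_part:
    result = y - x;
done:
    return result;
}
```

Both versions return 4 for the inputs (7, 3) and for (3, 7); only the way
the flow of control is spelled out differs.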
So, with that in mind, now that we know
the basic types of instructions, let's
look at the basic data types in
assembly.
In assembly, you essentially have
integer data: units of data that are
one, two, or four bytes, in the case of
IA-32, or also eight bytes in the
64-bit ISA case.
'Kay.
So this integer data holds both data
values, like real program data in the
form of integers, and addresses.
And note that data in assembly is
essentially untyped.
That means a register could hold an
integer of a certain width, or it could
also hold an address.
The register doesn't really know.
How you use it determines whether the
contents are seen as an address or as a
data value.
As for the other data type, most ISAs
today support floating point operations
in the ISA.
Floating point values are held in
special registers in the processor.
That means that, implicitly, floating
point is a data type that the
programmer writing assembly has to be
aware of.
Now what about aggregate data types in
assembly?
Is there such a thing?
No. In fact, for assembly everything is
just contiguous bytes: memory is flat,
and there's no notion of arrays.
You have to implement them by hand, by
laying out the data in memory properly
and then accessing it in a way that
makes it look like an array.
OKay.
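Here is a rough C sketch of what "implementing an array by hand" means:
element i of an array of 4-byte items lives at the base address plus
i times 4. The byte array and the helper names are hypothetical, and
there is no bounds checking.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flat memory: just bytes, with no notion of arrays. */
enum { FLAT_SIZE = 64 };
static unsigned char flat[FLAT_SIZE];

/* Address of element i of an "array" of 4-byte items starting at base. */
size_t elem_addr(size_t base, size_t i)
{
    return base + i * sizeof(int32_t);
}

/* Store v as element i of the array that starts at base. */
void array_store(size_t base, size_t i, int32_t v)
{
    memcpy(&flat[elem_addr(base, i)], &v, sizeof v);
}

/* Load element i of the array that starts at base. */
int32_t array_load(size_t base, size_t i)
{
    int32_t v;
    memcpy(&v, &flat[elem_addr(base, i)], sizeof v);
    return v;
}
```

With a base of 16, element 3 sits at address 28, so array_store(16, 3, -7)
followed by array_load(16, 3) gives back -7.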
So we talked about object code before.
Remember that the assembler translates
assembly code into object code, binary
code.
And here's an example.
Like, remember the sum function that we
just saw that adds two numbers?
Well, these are the bytes that compose
the machine code for the sum function.
'Kay?
It's a total of 13 bytes, and there's a
bunch of instructions encoded there.
You know, some instructions take only
one byte, some other instructions take
two bytes, others take three bytes, and
so on.
And this is also saying where the code
is located in memory: it starts at this
address here.
This is the address of the first byte
of this function.
Okay?
So, as you see, it's actually not
obvious where the instructions start,
because they have different sizes, and
you have to decode them in order to
understand what they do.
'Kay?
Now, I mentioned briefly before that
what the linker does is essentially
resolve some symbolic names that are
embedded in your object code.
It also links in what's missing, like
when your code happens to use some
system library functions, such as
malloc to allocate memory or functions
that print stuff on the screen.
This has to be linked, you know, your
code has to point to the right place in
the library.
All right, so let me show you an
example of a machine instruction here
again.
Remember that in our sum function we
had this statement here, int t equals x
plus y, which just adds x and y and
puts the result into variable t.
Well, there's an add here, so it's only
natural that in our assembly code there
will be something that looks like an
add.
And that's the instruction emitted by
the compiler to implement that
expression.
'Kay?
So in the C code here, we are trying to
add two integers, and in the assembly
code, what we are doing is adding two
four-byte integers.
'Kay.
And by the way, note that it is the
same instruction whether the values are
signed or unsigned.
'Kay, so Gaetano told you about signed
and unsigned integers, right?
So you know about that now, but the
instruction doesn't know whether it's
adding signed numbers or not.
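A small C sketch can show why one add instruction is enough: the addition
works on 32-bit patterns, and "signed" or "unsigned" is only a matter of
how you read the result. The function names are hypothetical.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdint.h>
#include <string.h>

/* One 32-bit addition on raw bit patterns. */
uint32_t add_bits(uint32_t a, uint32_t b)
{
    return a + b; /* wraps modulo 2^32 */
}

/* Read the same 32 bits as a signed two's-complement value. */
int32_t as_signed(uint32_t bits)
{
    int32_t v;
    memcpy(&v, &bits, sizeof v); /* copy the bits unchanged */
    return v;
}
```

Adding the patterns 0xFFFFFFFF and 2 gives 1. Read as unsigned, that is
4294967295 + 2 wrapping around; read as signed, it is -1 + 2. The bits
produced are the same either way.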
So, here are the operands of our
operation.
We are doing this addition here,
right, x plus y.
Now, x has to be somewhere.
And in this case it happens to be stored
in a register called eax.
That's one of the registers offered in
the x86 ISA.
And that's why it's one of the
parameters in this instruction.
So now y, the variable y here, happens
to be located in memory.
'Kay.
It's located at the memory address
given by the contents of register ebp,
which is the base pointer, the stack
base pointer, plus eight.
So this expression is telling the
processor that, okay, one of the
operands of this addition happens to be
stored in memory, and it's eight bytes
away from the address contained in
register ebp.
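The "contents of ebp plus eight" operand can be sketched in C as an
address computation followed by a memory read. The toy memory and all of
the function names here are hypothetical, and the model ignores flags and
bounds checks; it only shows where the second operand comes from.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdint.h>
#include <string.h>

/* Toy stand-in for memory, indexed by address. */
enum { STACK_BYTES = 64 };
static unsigned char stack_mem[STACK_BYTES];

/* Effective address of an operand given as "displacement off ebp". */
uint32_t effective_address(uint32_t ebp, int32_t disp)
{
    return ebp + (uint32_t)disp;
}

/* Write a 4-byte value at addr in the toy memory. */
void store_int(uint32_t addr, int32_t v)
{
    memcpy(&stack_mem[addr], &v, sizeof v);
}

/* Read the 4-byte value stored at addr in the toy memory. */
int32_t load_int(uint32_t addr)
{
    int32_t v;
    memcpy(&v, &stack_mem[addr], sizeof v);
    return v;
}

/* Model of the add: new eax = eax + the value stored at ebp + 8. */
int32_t add_from_frame(int32_t eax, uint32_t ebp)
{
    return eax + load_int(effective_address(ebp, 8));
}
```

If ebp holds 16, the operand is read from address 24. With 5 stored there
and 3 in eax, add_from_frame(3, 16) yields 8.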
Great, okay, so now t here, which is
where the result of our addition goes,
also happens to be a register, which is
eax.
In fact, eax is both an operand and the
destination of the operation.
'Kay?
And now, this instruction here, that's
the assembly instruction, and it became
three bytes in our object code, in our
machine code.
And it happens to be stored at this
address here.
Great.
Alright.
So now, as you saw, it's very hard to
tell what's going on from the machine
code unless, you know, you can really
keep the encoding of everything in your
head.
It's hard to disassemble the code in
your mind, to look at the bytes that
correspond to your binary code and
think about what instructions they
encode.
That's why there's this process called
disassembling.
What disassembling does is take the
bits from your machine code and map
them back to the original assembly
code.
'Kay?
So if you remember, before I showed you
the bytes that compose the sum
function.
And here are all the bytes, all 13
bytes.
And this first byte here, 55, just
happens to map into this instruction
called push, which pushes something
onto the stack.
'Kay.
And now these two bytes here in my
machine code happen to match this move
instruction that moves data between two
registers.
'Kay?
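To make the byte-to-instruction mapping concrete, here is a C sketch that
lays out 13 bytes and the length of each instruction. The bytes are the
standard IA-32 encodings of a seven-instruction sequence that fits the
description (a one-byte push starting with 55, a two-byte register move,
a three-byte add); treating that as the exact listing from the lecture is
an assumption. The helper names are hypothetical, and the lengths are
written down by hand here, whereas a real disassembler has to decode them.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stddef.h>

/* 13 bytes of machine code, grouped one instruction per line. */
static const unsigned char sum_code[13] = {
    0x55,             /* push %ebp            */
    0x89, 0xe5,       /* mov  %esp,%ebp       */
    0x8b, 0x45, 0x0c, /* mov  0xc(%ebp),%eax  */
    0x03, 0x45, 0x08, /* add  0x8(%ebp),%eax  */
    0x89, 0xec,       /* mov  %ebp,%esp       */
    0x5d,             /* pop  %ebp            */
    0xc3              /* ret                  */
};

/* Length in bytes of each instruction, in order. */
static const size_t insn_len[] = { 1, 2, 3, 3, 2, 1, 1 };
#define INSN_COUNT (sizeof insn_len / sizeof insn_len[0])

/* Offset of instruction n from the start of the function,
   or (size_t)-1 if n is out of range. */
size_t insn_offset(size_t n)
{
    size_t off = 0;
    if (n >= INSN_COUNT)
        return (size_t)-1;
    for (size_t i = 0; i < n; i++)
        off += insn_len[i];
    return off;
}

/* Total size of the function in bytes. */
size_t total_bytes(void)
{
    size_t total = 0;
    for (size_t i = 0; i < INSN_COUNT; i++)
        total += insn_len[i];
    return total;
}
```

Here the second instruction starts one byte after the start of the
function, the add starts at offset 6, and the lengths sum to 13.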
So and the way you do this, but it is
this, this tool, this utility in your in
your Linux installation.
Called objdump.
Just, it dumps the object, it gets the
object and, and dumps its disassembled
assembly[LAUGH], in, into the screen so
you can see.
'Kay, so it's very useful for you to
examine code and I encourage you to run
this command, man one objdump.
It's going to tell you how to use this
tool.
'Kay?
Now, here's just a different way of
looking at this.
As I said, this is just a different
formatting.
Now it's showing all of the addresses
relative to sum, which is where sum
starts.
So you can see that this one is one
byte away from sum, and so on.
'Kay?
And this is what you get if you do the
disassembly with the GDB debugger.
'Kay?
Great.
So, what can be disassembled?
Well, anything that can be interpreted
as executable code can be disassembled.
'Kay?
So you might try to take any random
piece of binary data and disassemble
it, but you might not get valid
instructions.
'Kay, so what makes sense to
disassemble is essentially the parts of
your memory that happen to really store
binary code, 'kay?
Okay, so now, I know I've already
mentioned registers before, but I just
want to spend a little bit of time on
what a register is.
A register is just a location in the
CPU that stores a small amount of data.
In fact, one of the reasons that
registers are so fast is because they
are small.
The speed of light is limited, right?
So if the speed of light is limited and
things are small, signals have less
distance to travel, which means you can
make them very, very fast.
'Kay?
So registers are very fast locations,
but very small ones, that happen to be
inside the processor.
And, you know, as we go into some of this
assembly programming we'll see that
registers just appear everywhere.
And they happen to be really precious
because you only have a handful of them,
especially in x86.
So let's see why.
'Kay, so IA-32 has eight registers.
'Kay.
Six of them are called general purpose,
you know, they can hold any data you
want.
And there are two special purpose
registers, one that holds a pointer to
the stack, and one that holds a pointer
to the base of the stack.
The origin of the names is mostly
obsolete.
For example, this one is called eax
because it was supposed to be the
accumulator, and a lot of instructions
used to accumulate into eax.
But you don't have to use it like
this.
This is mostly just convention.
You can really use them any way you
want, except for the stack pointer and
the base pointer, which really have
special meaning, 'kay?
Those you use as they were intended.
Now, in IA-32, these registers here are
what? Four bytes, 32 bits wide.
'Kay?
But that doesn't mean you can only
access all four bytes at a time; you
can also access parts of a register.
So if you use just %ax, what you're
referring to is only this part of your
eax register.
And even more, you can reference only a
specific byte: if you use this name in
your assembly code, you are just
referring to this byte of the eax
register.
And the same thing goes for the other
registers as well.
'Kay?
Except that, you know, for these
registers you can only access the lower
half of the register.
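The sub-register names can be pictured in C as masks and shifts on a
32-bit value. In x86, %ax names the low 16 bits of eax, and %al and %ah
name its lowest and second-lowest bytes; the function names below are
hypothetical, and which byte name the lecture pointed at is not recorded
in the transcript.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdint.h>

/* Low 16 bits of eax: what the name %ax refers to. */
uint16_t reg_ax(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}

/* Lowest byte of eax: what the name %al refers to. */
uint8_t reg_al(uint32_t eax)
{
    return (uint8_t)(eax & 0xFFu);
}

/* Second-lowest byte of eax: what the name %ah refers to. */
uint8_t reg_ah(uint32_t eax)
{
    return (uint8_t)((eax >> 8) & 0xFFu);
}

/* Writing %ax replaces the low 16 bits and leaves the upper 16 alone. */
uint32_t write_ax(uint32_t eax, uint16_t ax)
{
    return (eax & 0xFFFF0000u) | ax;
}
```

If eax holds 0x12345678, then %ax is 0x5678, %al is 0x78, and %ah is 0x56;
writing 0xABCD into %ax leaves eax holding 0x1234ABCD.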
So now, let's see what the x86-64
registers look like.
Well, not surprisingly, you still have
eax here, and the other IA-32 registers
are also available.
But they're now only part of larger
registers, because, for example, rax
happens to be, what, 64 bits.
'Kay?
And this one here inside is 32 bits.
'Kay?
And the same thing for the other ones.
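The same picture extends to 64 bits. As a minimal sketch with
hypothetical function names, the old names simply select the low part of
the wider register:

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdint.h>

/* eax is the low 32 bits of the 64-bit register rax. */
uint32_t reg_eax(uint64_t rax)
{
    return (uint32_t)(rax & 0xFFFFFFFFu);
}

/* The 16-bit name ax still refers to the low 16 bits. */
uint16_t reg_ax_of_rax(uint64_t rax)
{
    return (uint16_t)(rax & 0xFFFFu);
}
```

If rax holds 0x1122334455667788, then eax is 0x55667788 and ax is 0x7788.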
Now the nice thing about this is that
you preserve backward compatibility.
Isn't that great?
You have the same names, and since the
registers are bigger, they're now a
superset of the original ones.
But x86-64 doesn't end there.
x86-64 also gives you extra registers.
Now we have all these other registers
for general purpose use that can also
be used by your assembly code.
Great.
So to summarize and end this section:
we saw what the instruction set
architecture is.
It defines the system's state and all
the instructions that are available to
software.
We talked a little bit about the
history of Intel processors and AMD as
well.
And we looked at how C relates to
assembly and machine code, and we gave
some initial examples.
And we talked about x86 registers,
which are really at the heart of
assembly programming in x86.
And the thing I want you to remember is
that there's a very, very limited
number of them, and some of them have
special meaning, so not all of them are
general-purpose.
'Kay?
This concludes our section.
In the next section we're going to
start looking at the specifics of how
to write x86 assembly code.
Thank you and see you soon.