02 Machine Programming


[MUSIC] Hello again, everyone. Let's continue our section on the basics of architecture and machine programming. First, let's start with a definition. What is architecture? Architecture, which is the same thing as instruction set architecture (ISA), is the set of parts of the processor design that one needs to understand to write assembly code. Remember, it's really the hardware/software interface. Another way of thinking about it: it's what's directly visible to software. If it's visible to software, it means that you have to know about it when writing code. The microarchitecture, on the other hand, is the details of how the architecture is implemented, that is, how the instructions of that architecture are implemented.

Let's think through some examples. As we saw briefly, and as we are going to see in more detail later in the course, a cache is a very fast piece of memory that lies inside the processor. It holds recently accessed data. So is cache size part of the architecture? Well, the cache is used by software, but its size is not directly visible to software, so it's not part of the architecture. How about core frequency? Core frequency is also not something that software knows explicitly. Software doesn't care all that much about what the frequency is. The user does, because a higher frequency makes the processor faster. But when writing assembly code you don't know it, and there's nothing in the ISA that refers to it. So core frequency is part of the microarchitecture, or the implementation. And the number of registers? Well, the number of registers is something that you need to know, because you're going to use registers explicitly in your assembly code.
So the number of registers is part of the ISA, and therefore really part of the architecture.

With that in mind, let's look at the assembly programmer's view of the system. Here we have the CPU, the central processing unit, and there are three important things that an assembly programmer needs to know about. First, there's a special register called the program counter. The program counter holds the address of the instruction that is going to execute next. Instructions execute one after the other, and the program counter tells which instruction is going to be executed next. By the way, it's called EIP in IA32, where IP stands for instruction pointer and the E stands for extended. In x86-64 it's called RIP, and IP again means instruction pointer. These are just names for the program counter. Second, there's a bunch of registers that are used by the assembly code directly. These registers hold data. They are essentially a piece of very, very fast memory that is addressed explicitly by your assembly code. Third, we have condition codes. Condition codes essentially tell you the status of some operation that was just executed. For example, they tell you whether or not there was an overflow. They also hold the result of a comparison: if you compare whether one register holds a lower value than another register, the result of that comparison is set in the condition codes.

On this side here we have memory. Memory is meant to hold data and code, and that includes the operating system as well. There's also a special data structure called the stack, which we're going to see later in the class. It's very, very important for program execution.
Now, the bottom line is this: here we have the CPU, here we have memory. The CPU sends addresses to memory when it wants to read data into registers. The memory gets the data and sends it back to the CPU, which puts it in registers and performs an operation with it. The CPU can also send data to memory. It can take data from registers and say, hey memory, store this data for me. Another thing that's very, very important is that the instructions themselves, also known as object code, come from memory. They are stored in memory, which means that the processor has to fetch these instructions from memory in order to execute them. And memory is just an array of bytes where each byte has an address, essentially a very large table of bytes that you can access directly.

So now let's look at how we turn high-level C code into object code, into machine code. Here are the steps. In this example we have two source code files, .c files that hold code, or text, written in the C language. Now you are going to use a compiler. GCC is a popular open-source compiler that we're going to use in this class, and it's used by a large number of people. What this compiler is doing is taking two C files as input, compiling them, and producing a single output file. That is what the -o is doing here: it names the output, a binary file p that holds the machine code resulting from the source code.

Going through the steps, we start with what we call the text. That's the text of your program, the source code. You run it through a compiler. And you can use the -S option.
With -S, you're going to produce .s files that hold the assembly code generated by the compiler. Next you're going to use the assembler. We saw this before: the assembler takes assembly code and produces binary, or machine, code. In this example, it takes p1.s and p2.s and translates them into binary machine code equivalent to the input assembly code. In the final step there is something called the linker. It takes the object code, which has some symbolic names in it, and links it with libraries. And what are libraries? Libraries are essentially collections of precompiled code that are available for use in user code: for example, code to print values on the screen, to access operating system services, and so on. This all resides in libraries. The linker puts together the object program, links it with the libraries, and produces a final executable program that you can type and run at the command prompt. In this case we're also using an optimization flag. The -O1 here is just telling the compiler: hey, apply some optimization to our code when generating it.

Let's see one example of translating C into assembly. Here we have a very simple piece of C code, sum. It takes two integers, x and y, as input, adds them, puts the result in a variable called t, and then returns t. So when you execute sum and pass parameters, it's going to return the sum of those two values. sum is the name of the function, and when we generate our assembly code, what we have here is the name of the function and a colon, saying that the function starts there. We're going to see this in more detail later, but this instruction here is moving data between two registers.
This one is reading data from memory. More specifically, it's reading one of the parameters. And this is where we're actually doing the addition with the other parameter. Now that the result is in place, this return instruction here just returns from the execution of your function. It goes back to the caller of the function. And the way you would obtain this, as I mentioned briefly before, is with -S, which tells the compiler to produce assembly code. When you give it a .c file, it is going to produce an equivalent .s file that holds the assembly code.

Great. So there are three basic kinds of assembly instructions. First, there are instructions that perform arithmetic, things like addition, subtraction, and so on, both on register data and on memory data. The second kind is instructions that move data between memory and registers: they can load data from memory into a register, and there are also instructions that take data from a register and store it into memory. That's the second category, data transfer between memory and registers. And finally, the third basic category is instructions that transfer control. Control here means which instruction is being executed at any given time, so the flow of control is the flow of instructions that are being executed. This is when you do a jump: you're executing your program here [SOUND], then you jump to a different part of your program, and you can jump back and continue executing. This is very useful, because when you have something like an if in your program, that's actually making a decision about whether or not you should jump to a different part of your code.
So, now that we know the basic types of instructions, let's look at the basic data types in assembly. For assembly, you essentially have integer data: units of data that are one, two, or four bytes in the case of IA32, or also eight bytes in the 64-bit ISA case. This integer data holds both data values, really program data in the form of integers, and addresses. And note that data in assembly is essentially untyped. That means a register could hold an integer of a certain width, or it could hold an address. The register doesn't really know. Whether its contents are seen as an address or as a data value depends on how you use them. The other data type: most ISAs today support floating point operations, and floating point data is held in special registers in the processor. That means floating point is a data type that a programmer writing assembly has to be aware of. Now what about aggregate data types in assembly? Is there such a thing? No. In fact, for assembly, memory is just contiguous bytes. Memory is flat, and there's no notion of arrays. You have to implement them by hand, by laying out the data in memory properly and then accessing it in a way that makes it look like an array.

We talked about object code before. Remember that the assembler translates assembly code into object code, binary code. Here's an example. Remember the sum function that we just saw that adds two numbers? Well, these are the bytes that compose the machine code for the sum function. There's a total of 13 bytes, and there's a bunch of instructions encoded there. Some instructions take only 1 byte, some take 2 bytes, others take 3 bytes, and so on.
This is also saying where the code is located in memory: it starts at this address here. That is the address of the first byte of this function. As you see, it's actually not obvious where the instructions start, because they have different sizes, and you have to decode them in order to understand what they do. I mentioned briefly before that what the linker does is essentially resolve some symbolic names that are embedded in your object code. It also links in what's missing: if your code happens to use some system library functions, like malloc to allocate memory or a function to print stuff on the screen, that has to be linked. Your code has to point to the right place in the library.

All right, let me show you an example of a machine instruction. Remember that in our sum function we had this statement, int t = x + y, which just adds x and y and puts the result into variable t. There's an add here, so it's only natural that in our assembly code there will be something that looks like an add. And that's the instruction emitted by the compiler to implement that expression. In the C code we are adding two integers, and in the assembly code what we are doing is adding two four-byte integers. By the way, note that it is the same instruction whether the integers are signed or unsigned. Gaetano told you about signed and unsigned integers, right? So you know about that now, but the instruction doesn't know whether it's adding signed numbers or not. Now, here are the operands of our operation. We are doing this addition, x + y. x has to be somewhere, and in this case it happens to be stored in a register called eax. That's one of the registers offered in the x86 ISA, and that's why it's one of the operands here.
That is, in this instruction. Now y, the variable y here, happens to be located in memory. It's located at the memory address given by the contents of register ebp, which is the stack base pointer, plus eight. So this expression is telling the processor that the other operand of this addition is stored in memory, eight bytes away from the address contained in register ebp. Great. Now t, which is where the result of our addition goes, also happens to be a register, eax. In fact, eax is both an operand and the destination of the operation. So this is the assembly instruction, and it became three bytes in our object code, our machine code, stored at this address here.

Great. Now, as you saw, it's very hard to tell what machine code does unless you can really keep everything in your head. It's hard to disassemble the code in your mind, to look at the bytes of your binary code and think about what instructions they encode. That's why there's this process called disassembling. What disassembling does is take the bits of your machine code and map them back to assembly code. If you remember, before I showed you the bytes that compose the sum function. Here are all the 13 bytes. This first byte here, 55, just happens to map to an instruction called push, which pushes something onto the stack. And these two bytes in the machine code happen to map to this move instruction that moves data between two registers. The way you do this is with a tool, a utility in your Linux installation called objdump.
It dumps the object: it takes the object code and dumps its disassembly [LAUGH] to the screen so you can see it. It's very useful for examining code, and I encourage you to run the command man 1 objdump. It's going to tell you how to use this tool. Now, here's another way of looking at this. It's just a different formatting. It's showing all of the addresses relative to sum, which is where sum starts. So you can see that this one is one byte away from sum, and so on. This is what you get if you disassemble with the GDB debugger. Great.

So, what can be disassembled? Anything that can be interpreted as executable code can be disassembled. You might try to give any random piece of binary data to the disassembler, but you might not get valid instructions. What makes sense to disassemble is essentially the parts of your memory that really store binary code.

Now, I know I've already mentioned registers before, but I want to spend a little bit of time on what a register is. A register is just a location in the CPU that stores a small amount of data. In fact, one of the reasons that registers are so fast is that they are small. The speed of light is limited, right? So if things are small, you can make them very, very fast. Registers are very fast locations, but very small ones, that happen to be inside the processor. And as we go into assembly programming, we'll see that registers just appear everywhere.
And they happen to be really precious because you only have a handful of them, especially in x86. Let's see why. IA32 has eight registers. Six of them are called general purpose, meaning they can hold any data you want. And there are two special-purpose registers, one that holds a pointer to the stack and one that holds the pointer to the base of the stack. The origin of the names is mostly obsolete. For example, this one is called eax because it was supposed to be the accumulator, and a lot of code does accumulate into eax. But you don't have to use it like that. This is mostly just convention, and you can really use them any way you want, except for the stack pointer and the base pointer, which really have special meaning, and you use them as they were intended. Now, in IA32 these registers are four bytes, 32 bits, wide. But that doesn't mean you can only access all four bytes at a time. There's some [UNKNOWN] to [INAUDIBLE]. If you use just %ax, what you're referring to is only this part of your eax register. And even more, you can reference only a specific byte: if you use this name in your assembly code, you are just referring to this byte of the eax register. And the same thing goes for the other registers as well, except that for these registers you can only access the low-order half of the register.

So now let's see what x86-64 registers look like. Not surprisingly, you have eax here, and the other IA32 registers are also available. But they are only part of larger registers: rax, for example, is 64 bits, and this one here inside it is 32 bits. And the same thing for the other ones. The nice thing about this is that you preserve backward compatibility. Isn't that great?
You have the same names, and since the registers are bigger, they are now a superset of the old ones. But x86-64 doesn't end there. x86-64 also gives you extra registers. Now we have all these other registers for general-purpose use that can also be used by your assembly code. Great.

So, to summarize and end this section: we saw what the instruction set architecture is. It defines the system's state and all the instructions that are available to software. We talked a little bit about the history of Intel processors, and AMD as well. We looked at how C relates to assembly and machine code, and gave some initial examples. And we talked about x86 registers, which are really at the heart of assembly programming in x86. The thing I want you to remember is that there is a very, very limited number of them, and some of them have special meaning, so not all of them are general purpose. That concludes our section. In the next section we're going to start looking at the specifics of how to write x86 assembly code. Thank you and see you soon.
