211: Computer Architecture Spring 2021
Instructor: Prof. David Menendez
Topics:
Hardware-Software Interface
AssemblyProgramming
Reading: Chapter 3
Programming Meets Hardware
High-Level Language Program
Compiler
Assembly Language Program
Assembler
Machine Language Program
#include
int x, y, temp;
x=1; y=2;
temp =x; x=y; y=temp; printf(%d %d %d
,x,y,temp);
}
How do you get performance?
7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 02 00 03 00 01
00 00 00 f0 82 04 08 34 00 00 00 c4 0c 00 00 00 00 00 00 34 00
movl $1, -8(%ebp)
movl $2, -12(%ebp)
movl -8(%ebp), %eax
movl %eax, -16(%ebp)
movl -12(%ebp), %eax
movl %eax, -8(%ebp)
movl -16(%ebp), %eax
movl %eax, -12(%ebp)
movl -16(%ebp), %eax ISA movl %eax, 12(%esp)
movl -12(%ebp), %eax movl %eax, 8(%esp) movl -8(%ebp), %eax movl %eax, 4(%esp)
Performance with Programs
(1) Program: Data structures + algorithms (2) Compiler translates code
(3) Instruction set architecture
(4) Hardware Implementation
Rutgers University David Menendez 3
Instruction Set Architecture
(1) Set of instructions that the CPU can execute
(1) What instructions are available?
(2) How the instructions are encoded? Eventually everything is binary.
(2) State of the system (Registers + memory state + program counter)
(1) What instruction is going to execute next
(2) How many registers? Width of each register?
(3) How do we specify memory addresses?
Addressing modes
(3) Effect of instruction on the state of the system
Rutgers University David Menendez 4
IA32 (X86 ISA)
There are many different assembly languages because they are processor-specific
IA32 (x86)
x86-64 for new 64-bit processors
IA-64 radically different for Itanium processors
Backward compatibility: instructions added with time
We will focus on IA32/x86-64 because you can generate and run on iLab machines (as well as your own PC/laptop)
PowerPC MIPS
IA32 is also dominant in the market although smart phone, eBook readers, etc. are changing this
Rutgers University David Menendez 5
X86 Evolution
8086 1978 29K transistors 5-10MHz
I386 1985 275K transistors 16-33 MHz Pentium4 2005 230M transistors 2800-3800 MHz Haswell 2013 > 2B transistors 3200-3900 MHz
Added features
Large caches
Multiple cores
Support for data parallelism (SIMD) eg AVX extensions
Rutgers University David Menendez 6
CISC vs RISC
CISC: complex instructions : eg X86
Instructions such as strcpy/AES and others Reduces code size
Hardware implementation complex?
RISC: simple instructions: eg Alpha Instructions are simple add/ld/st
Increases code size
Hardware implementation simple?
Rutgers University David Menendez 7
Aside About Implementation of x86
About 30 years ago, the instruction set actually reflected the processor hardware
As hardware advanced, industry faced with choice
E.g., the set of registers in the instruction set is actually what was present in the processor
Change the instruction set: bad for backward compatibility
Keep the instruction set: harder to exploit hardware advances
Example: many more registers but only small set introduced circa 1980
Starting with the P6 (PentiumPro), IA32 actually got implemented by Intel using an interpreter that translates IA32 instructions into a simpler micro instruction set
Rutgers University David Menendez 8
P6 Decoder/Interpreter
Assembly Programming
Brief tour through assembly language programming Why?
Why not binary language?
Machine interface: where software meets hardware
To understand how the hardware works, we have to understand the interface that it exports
Much easier for humans to read and reason about
Major differences:
Human readable language instead of binary sequences Relative instead of absolute addresses
Rutgers University David Menendez 10
Assembly Programmers View
CPU
Memory
ALU
Control Logic
Condition Codes
PC
Registers
(OS code & data) Object Code Program Data
Addresses
Data
Instructions
CPU
Memory
Memory
Address
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
00000000
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Command: R/W
Data
Addresses
Memory Access: Read
CPU
Memory
03
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
00000000
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
R
01010101
Addresses
Memory Access: Write
CPU
Memory
04
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
00000000
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
W
01010101
Addresses
Memory Access: Write
CPU
Memory
04
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
W
01010101
Addresses
C
S
AB
Processor: ALU & Registers
C = FS(A, B)
F includes
Arithmetic: +, -, *, /, ~, etc. Logical: <, >, =, etc.
ALU
Registers
Name Storage
R0 R1 R2 R3
00101100
10001000
11111111
01010101
Multiple Ports
Name Command: R/W Data
Putting It All Together
CPU
Memory
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Condition Codes
Program Counter (PC)
Control Logic
ALU
Registers
Addresses
Putting It All Together
CPU
Memory
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Condition Codes
Program Counter (PC)
1
ALU
Control Logic
Registers
Addresses
Putting It All Together
CPU
Memory
Condition Codes
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Program Counter (PC)
1
ALU
Registers
Control Logic
10001000
1
Addresses
Putting It All Together
CPU
Memory
Condition Codes
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Program Counter (PC)
+
ALU
Control Logic
1
Registers
R0: x R1: y
R0 R1
10001000
1
Addresses
Putting It All Together
CPU
Memory
Condition Codes
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Program Counter (PC)
+
ALU
x
1
y
Registers
R0: x R1: y
R0 R1
Control Logic
10001000
1
Addresses
Putting It All Together
CPU
Memory
Condition Codes
Program Counter (PC)
z
+
ALU
x
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
2
y
Registers
R0: x R1: y
R2: z
R0 R1
Control Logic
R2
10001000
1
Addresses
Sample C Code
int accum;
int sum(int x, int y)
{
gcc -O1 -m32 S code.c
Generated Assembly
sum:
push %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
addl 8(%ebp), %eax
addl %eax, accum
popl %ebp
ret
}
int t = x + y;
accum += t;
return t;
Rutgers University
David Menendez 23
C & Assembly Code
Sample C Code
int accum;
int sum(int x, int y){
int t = x + y;
accum += t;
return t;
objdump d code.o
C & Machine Code
} 3:
6: 03 45 08
gcc -O1 -m32 c code.c 9:
gdb code.o
(gdb) x/100xb sum
f: 5d 10: c3
pop %ebp ret
0x8
0x10
0000000
0: 55
1: 89 e5
push %ebp
mov %esp,%ebp
mov 0xc(%ebp),%eax add 0x8(%ebp),%eax
8b 45 0c
01 05 00 00 00 00 add %eax, accum
Rutgers University David Menendez 24
Assembly Characteristics
Sequence of simple instructions
Minimal Data Types
No type checking
Integer data of 1, 2, or 4 bytes
Data values
Addresses (untyped pointers)
Floating point data of 4, 8, or 10 bytes
No aggregate types such as arrays or structures
Just contiguously allocated bytes in memory
Interpretation of data format depends on instruction No protection against misinterpretation of data
Rutgers University David Menendez 25
Assembly Characteristics
3 types of Primitive Operations
Perform arithmetic function on register or memory data
Transfer data between memory and register
Load data from memory into register Store register data into memory
Transfer control
Unconditional jumps to/from procedures Conditional branches
Rutgers University David Menendez 26
x86 Characteristics
Variable length instructions: 1-15 bytes
Can address memory directly in most instructions
Uses Little-Endian format (Least significant byte in the lowest address)
Rutgers University David Menendez 27
General format:
opcode operands
Opcode:
movb,addl, etc. Operands:
Example:
Instruction Format
Short mnemonic for instructions purpose
Immediate, register, or memory
Number of operands command-dependent
movl %ebx, (%ecx)
Rutgers University David Menendez 28
Machine Representation
Remember, each assembly instruction translated to a sequence of 1-15 bytes
Opcode
addressing mode
other bytes
First, the binary representation of the opcode Second, instruction specifies the addressing mode
Some instructions can be single-byte because operands and addressing mode are implicitly specified by the instruction
The type of operands (registers or register and memory) How to interpret the operands
E.g., pushl
Rutgers University David Menendez 29 29
x86 Registers
General purpose registers are 32 bit
Originally categorized into two groups with different functionality
Now, the registers are mostly interchangeable
Segment registers
Although operations can access 8-bits or 16-bits portions
Data registers (EAX, EBX, ECX, EDX)
Holds operands
Pointer and Index registers (EBP, ESP, EIP,ESI,EDI)
Holds references to addresses as well as indexes
Holds starting address of program segments
CS, DS, SS, ES
Rutgers University David Menendez 30 30
x86 Registers
16 BITS
8 BITS
EAX AX
AH
AL
ECX CX
CH
CL
EDX DX
DH
DL
EBX BX
BH
BL
ESP Stack Pointer
EBP Base register of current stack frame
ESI Source index for string operations
EDI Destination index for string operations
32 BITS Rutgers University David Menendez
31
x86 Programming
Mov instructions to move data from/to memory
Addressing modes
Understanding swap
Arithmetic operations
Condition codes
Conditional and unconditional branches Loops and switch statements
Operands and registers
Rutgers University David Menendez 32
Byte: 8 bits
Word: 16 bits (2 bytes)
E.g., char
E.g., short int
Data Format
Double Word: 32 bits ( 4 bytes)
Quad Word: 64 bits (8 bytes)
Instructions can operate on any data size
movl, movw, movb
Move double word, word, byte, respectively
E.g., int, float
E.g., double
End character specifies what data size to be used
Rutgers University David Menendez 33
MOV instruction
Most common instruction is data transfer instruction
Mov SRC, DEST: Move source into destination
SRC and DEST are operands
DEST is a register or a location
SRC can be the contents of register, memory location, constant, or a label.
If you use gcc, you will see movl
Alltheinstructionsinx86are32-bit
Used to copy data:
Constant to register (immediate)
Memory to register
Register to memory
Register to register
Cannot copy memory to memory in a single instruction
Rutgers University
David Menendez
34 34
Immediate Addressing
Operand is immediate
Operand value is found immediately following the instruction Encoded in 1, 2, or 4 bytes
$ in front of immediate operand E.g., movl $0x4040, %eax
ADDR
000f
00A1 00A2
memory
movl %eax
4040
Rutgers University David Menendez
35
Register Mode Addressing
Use % to denote register
Source operand: use value in specified register
Destination operand: use register as destination for value
Examples:
E.g., %eax
movl %eax, %ebx
Copy content of %eax to %ebx
movl $0x4040, %eax ! immediate addressing Copy 0x4040 to %eax
movl %eax, 0x0000f !Absolute addressing
Copy content of %eax to memory location 0x0000f
Rutgers University David Menendez 36 36
Indirect Mode Addressing
Content of operand is an address
Offset can be specified as immediate mode
Examples:
Designated as parenthesis around operand
movl (%ebp), %eax
Copy value from memory location whose address is in ebp into eax
movl -4(%ebp), %eax
Copy value from memory location whose address is -4 away from content of ebp into eax
Rutgers University David Menendez 37
Indexed Mode Addressing
Add content of two registers to get address of operand
Useful for dealing with arrays
movl (%ebp, %esi), %eax
Copy value at (address = ebp + esi) into eax
movl 8(%ebp, %esi),%eax
Copy value at (address = 8 + ebp + esi) into eax
If you need to walk through the elements of an array Use one register to hold base address, one to hold index
E.g., implement C array access in a for loop
Index cannot be ESP
Rutgers University David Menendez 38 38
Scaled Indexed Mode Addressing
Multiply the second operand by the scale (1, 2, 4 or 8)
movl 0x80 (%ebx, %esi, 4), %eax
Copy value at (address = ebx + esi*4 + 0x80) into eax
Where is it useful?
Rutgers University David Menendez 39 39
Address Computation Examples
%edx
0xf000
%ecx
0x100
Expression
Computation
Address
0x8(%edx)
0xf000 + 0x8
0xf008
(%edx,%ecx)
0xf000 + 0x100
0xf100
(%edx,%ecx,4)
0xf000 + 4*0x100
0xf400
0x80(,%edx,2)
2*0xf000 + 0x80
0x1e080
40
movl Operand Combinations
Source
Imm
movl
Destination C Analog
Reg Mem
movl $0x4,%eax temp = 0x4; movl$-147,(%eax) *p=-147;
Cannot do memory-memory transfers with single instruction
Reg
Mem Reg
Reg Mem
movl %eax,%edx
movl %eax,(%edx)
movl (%eax),%edx
temp2 = temp1;
*p = temp;
temp = *p;
Rutgers University David Menendez 41
Stack Operations
By convention, %esp is used to maintain a stack in memory
%esp contains the address of top of stack
Instructions to push (pop) content onto (off of) the stack
Where does the stack start? Well discuss later
Used to support C function calls
pushl %eax
esp=esp4
Memory[esp] = eax
popl %ebx
ebx = Memory[esp] esp=esp+4
Rutgers University David Menendez 42
Using Simple Addressing Modes
swap:
pushl %ebp
movl %esp,%ebp Set
pushl %ebx
Up
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
Body
Finish
Understanding Swap
Offset
12 8 4 0 -4
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
Stack
yp
xp
Rtn adr
Old %ebp
Old %ebx
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
%ebp
# ecx = yp
# edx = xp #eax=*yp(t1) #ebx=*xp(t0) #*xp=eax #*yp=ebx
Register Variable
%ecx yp %edx xp %eax t1 %ebx t0
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
%edx
%ecx
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
%edx
%ecx
0x120
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx #
ecx = yp
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# # # # #
edx=xp
eax = *yp (t1) ebx = *xp (t0) *xp = eax
*yp = ebx
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
%edx
0x124
%ecx
0x120
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx #
ecx = yp
edx=xp
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
#
# # # #
eax = *yp (t1)
ebx = *xp (t0)
*xp = eax
*yp = ebx
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
123
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
456
456
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
123
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx
Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
456
123
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
123
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx
Swap in x86-64: 64-bit Registers
rax
eax
rcx
ecx
rdx
edx
rbx
ebx
rsp
esp
rbp
ebp
rsi
esi
rdi
edi
r8
r9
r10
r11
r12
r13
r14
r15
Swap in x86-64 bit
swap:
movl (%rdi), %edx
movl (%rsi), %eax
movl%eax, (%rdi)
movl%edx, (%rsi)
retq
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
Arguments passed in registers
No stack operations
What happens with long int?
First, xp in rdi and yp in rsi
64-bit pointers, data values are 32-bit ints, so uses eax/edx
IA32 Stack
Register%espindicates lowest stack address
address of top element
Stack Pointer
%esp
Stack Bottom
Increasing Addresses
Region of memory managed with stack discipline
Grows toward lower addresses
Stack Grows Down
Rutgers University
David Menendez
54
Stack Top
IA32 Stack Pushing
Pushing
pushl Src
Decrement%espby4
Stack Bottom
Increasing Addresses
Fetch operand at Src Write operand at address
given by %esp
Stack Pointer
-4
Stack Grows Down
%esp
Rutgers University
David Menendez
55
Stack Top
IA32 Stack Popping
Popping
popl Dest
Increment%espby4
Stack Bottom
Increasing Addresses
Read operand at address given by %esp
Write to Dest
Stack Pointer
%esp
+4
Stack Grows Down
Rutgers University
David Menendez
56
Stack Top
Stack Operation Examples
pushl %eax
popl %edx
123
123
213
123
213
0x110
0x10c
0x108
0x110
0x10c
0x108
0x104
%eax
%edx
%esp
0x110
0x10c
0x108
0x104
%eax
%edx
%esp
213
555
0x108
213
555
0x1084
213
525153
0x1048
%eax
%edx
%esp
Procedure Control Flow
Use stack to support procedure call and return Procedure call:
call label Push return address on stack; Jump to label
Return address value
Address of instruction beyond call Example from disassembly
804854e: e8 3d 06 00 00 8048553: 50
Return address = 0x8048553 Procedure return:
call 8048b90
pushl%eax
ret Pop address from stack; Jump to address
Rutgers University David Menendez 58
0x110
0x10c
0x108
%esp %eip
0x108
0x804854e
Procedure Call Example
804854e:e8 3d 06 00 00call 8048b90
8048553: 50
pushl%eax
call 8048b90
0x110
0x10c
0x108
0x104
%esp 0x1084 %eip 0x80485b49e0
123
123
0x8048553
%eip is program counter
8048591: c3
Procedure Return Example
ret
0x110
0x10c
0x108
0x104
%esp %eip
0x110
0x10c
0x108
ret
123
0x8048553
123
0x8048553
0x104
0x8048591
%esp 0x1048 %eip 0x804855931
%eip is program counter
Address Computation Instruction
leal: compute address using addressing mode without accessing memory
leal src, dest
Use
Example:
src is address mode expression Set dest to address specified by src
Computing address without doing memory reference
E.g., translation of p = &x[i];
leal 7(%edx, %edx, 4), %eax
eax=4*edx+edx+7=5*edx+7
Rutgers University David Menendez 61
Some Arithmetic Operations
Instruction addl Src,Dest subl Src,Dest imull Src,Dest sall Src,Dest sarl Src,Dest xorl Src,Dest andl Src,Dest orl Src,Dest
Computation
Dest = Dest + Src
Dest = Dest Src
Dest = Dest * Src
Dest = Dest << Src (left shift) Dest = Dest >> Src (right shift) Dest = Dest ^ Src
Dest = Dest & Src Dest = Dest | Src
Rutgers University
David Menendez 62
Some Arithmetic Operations
Instruction
incl Dest decl Dest negl Dest notl Dest
Computation
Dest = Dest + 1 Dest = Dest 1 Dest = Dest Dest = ~ Dest
Rutgers University
David Menendez 63
Using leal for Arithmetic Expressions
arith:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp),%eax
movl 12(%ebp),%edx
leal (%edx,%eax),%ecx
leal (%edx,%edx,2),%edx
sall $4,%edx
addl 16(%ebp),%ecx
leal 4(%edx,%eax),%eax
imull %ecx,%eax
movl %ebp,%esp
popl %ebp
ret
Set Up
int arith
(int x, int y, int z)
{
int t1 = x+y;
int t2 = z+t1;
int t3 = x+4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
Body
Finish
Understanding arith Offset
16 12 8 4 0
Stack
z
y
x
Rtn adr
Old %ebp
int arith
(int x, int y, int z)
{
int t1 = x+y;
int t2 = z+t1;
int t3 = x+4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
movl 8(%ebp),%eax
movl 12(%ebp),%edx
leal (%edx,%eax),%ecx
leal (%edx,%edx,2),%edx# edx = 3*y
sall $4,%edx
addl 16(%ebp),%ecx
leal 4(%edx,%eax),%eax
imull %ecx,%eax
# edx = 48*y (t4)
# ecx = z+t1 (t2)
# eax = 4+t4+x (t5)
# eax = t5*t2 (rval)
# eax = x
# edx = y
# ecx = x+y(t1)
%ebp
Understanding arith
# eax = x
movl 8(%ebp),%eax
# edx = y
movl 12(%ebp),%edx
# ecx = x+y(t1)
leal (%edx,%eax),%ecx
# edx = 3*y
leal (%edx,%edx,2),%edx
# edx = 48*y (t4)
sall $4,%edx
# ecx = z+t1 (t2)
addl 16(%ebp),%ecx
# eax = 4+t4+x (t5)
leal 4(%edx,%eax),%eax
# eax = t5*t2 (rval)
imull %ecx,%eax
int arith
(int x, int y, int z)
{
int t1 = x+y;
int t2 = z+t1;
int t3 = x+4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
Another Example
logical:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp),%eax
xorl 12(%ebp),%eax
sarl $17,%eax
andl $8185,%eax
movl %ebp,%esp
popl %ebp
ret
eax = x
eax = x^y (t1) eax = t1>>17 (t2) eax = t2 & 8185
Set Up
int logical(int x, int y)
{
int t1 = x^y;
int t2 = t1 >> 17;
int mask = (1<<13) – 7;int rval = t2 & mask;return rval;} Body Finish213 =8192,213 7=8185movl 8(%ebp),%eaxxorl 12(%ebp),%eaxsarl $17,%eaxandl $8185,%eax Mystery FunctionWhat does the following piece of code do?A. Add two variablesB. Subtract two variablesC. Swap two variablesD. No ideamovl 12(%ebp),%ecxmovl 8(%ebp),%edxmovl (%ecx),%eaxmovl (%edx),%ebxmovl %eax,(%edx)movl %ebx,(%ecx)Rutgers UniversityDavid Menendez 68 What does this function do?.globl foo.type foo, @functionfoo:pushl %ebpmovlmovlimulladdlpopl %ebp ret%esp, %ebp16(%ebp), %eax 12(%ebp), %eax 8(%ebp), %eax Control Flow/ConditionalsHow do we represent conditionals in assembly?A conditional branch can implement all control flow constructs in higher level language Examples: if/then, while, forA unconditional branch for constructs like break/ continue Condition CodesSingle Bit RegistersCF Carry Flag SF Sign FlagZF Zero Flag OF Overflow FlagCan be set either implicitly or explicitly. Implicitly by almost all logic and arithmetic operations Explicitly by specific comparison operationsNot Set by leal instruction Intended for use in address computation onlyRutgers University David Menendez 71 jX InstructionsJumpingJump to different part of code depending on condition codes jXCondition Descriptionjmp1Unconditional jeZFEqual / Zero jne ~ZFNot Equal / Not ZerojsSFNegative jns ~SFNonnegativejg ~(SF^OF)&~ZFGreater (Signed)jge~(SF^OF)Greater or Equal (Signed) jl(SF^OF)Less (Signed) jle (SF^OF)|ZFLess or Equal (Signed)ja~CF&~ZFAbove (unsigned) jb CF Below (unsigned) Rutgers University David Menendez 72 Condition CodesImplicitly Set By Arithmetic Operationsaddl Src,DestCanalog: t=a+b CF set if carry out from most significant bit Used to detect unsigned overflow ZFsetift==0 SFsetift<0 OF set if twos complement overflow(a>0 && b>0 && t<0) || (a<0 && b<0 && t>=0)
Rutgers University David Menendez 73
Setting Condition Codes (cont.)
Explicit Setting by Compare Instruction
cmpl Src2,Src1
cmpl b,a like computing a-b without setting destination
NOTE: The operands are reversed. Source of confusion
CF set if carry out from most significant bit
Used for unsigned comparisons
ZFsetifa==b
SFsetif(a-b)<0OF set if twos complement overflow (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-
b)>0)
Rutgers University David Menendez 74
Setting Condition Codes (cont.)
Explicit Setting by Test instruction
testl Src2,Src1
SetsconditioncodesbasedonvalueofSrc1&Src2
Useful to have one of the operands be a mask
testl b,a like computing a&b without setting destination ZF set whena&b == 0
SFsetwhena&b<0Rutgers University David Menendez 75 Conditional Branch Example_max: pushl %ebp movl %esp,%ebpSet UpBodymovl 8(%ebp),%edx movl 12(%ebp),%eax cmpl %eax,%edx jle L9 movl %edx,%eaxL9: movl %ebp,%esp popl %ebp retFinish Conditional Branch Example_max: pushl %ebp movl %esp,%ebpSet UpBodyint max(int x, int y){if (x <= y)return y;elsereturn x; } movl 8(%ebp),%edx movl 12(%ebp),%eax cmpl %eax,%edx jle L9 movl %edx,%eaxL9: movl %ebp,%esp popl %ebp retFinish Conditional Branch Example (Cont.) int max(int x, int y){if (x <= y)return y;elsereturn x; } int goto_max(int x, int y){int rval = y;int ok = (x <= y);if (ok)goto done;rval = x;done:return rval;}movl 8(%ebp),%edxmovl 12(%ebp),%eax # eax =x ygoto L9xcmpl %eax,%edxjle L9movl %edx,%eaxL9: # x : y # if <= # eax =# Done:Generally considered bad coding style# edx =C allows goto as means of transferring control Closer to machine- level programming style Rutgers UniversityDavid Menendez 78Skipped when x y .LC0:.string “%d.text .globl foo.type foo, @function foo:pushl %ebpmovlsubllealmovlmovlcall scanfcmpl $4, -12(%ebp) je .L3call explode_bomb .L3:leave .p2align 4,,3 retRutgers UniversityDavid Menendez 79%esp, %ebp $40, %esp-12(%ebp), %eax %eax, 4(%esp) $.LC0, (%esp)Mystery Function C CodeDo-While Loop Example int fact_do(int x){int result = 1;do {result *= x;x = x-1;} while (x > 1);
return result;
}
Rutgers University David Menendez 80
Do-While Loop Example
C Code Goto Version
int fact_do(int x)
{
int result = 1;
do {
result *= x;
x = x-1;
} while (x > 1);
return result;
}
int fact_goto(int x)
{
int result = 1;
loop:
result *= x;
x = x-1;
if (x > 1)
goto loop; return result;
}
Use backward branch to continue looping Only take branch when while condition holds
Rutgers University David Menendez 81
Do-While Loop Compilation
Goto Version
Assembly
int fact_goto(int x)
{
int result = 1;
loop:
result *= x;
x = x-1;
if (x > 1)
goto loop; return result;
}
Registers
%edx x
%eax result
_fact_goto:
pushl %ebp
movl %esp,%ebp
movl $1,%eax
movl 8(%ebp),%edx
L11:
imull %edx,%eax
decl %edx
cmpl $1,%edx
jg L11
movl %ebp,%esp
popl %ebp
ret
# Setup
# Setup
# eax = 1
# edx = x
# result *= x
# x
# Compare x : 1
# i
Reviews
There are no reviews yet.