Kevin McDonnell, Stony Brook University, CSE 220
CSE 220:
Systems Fundamentals I
Unit 3:
MIPS Assembly:
Basic Instructions,
System Calls, Endianness
Computer Instructions
Recall that a computer's architecture is the programmer's
view of a computer
The architecture is defined, in part, by its instruction set
These instructions are encoded in binary as the
architecture's machine language
Because reading and writing machine language is tedious,
instructions are represented using mnemonics (neh-
MAHN-icks) as assembly language
You are going to be learning the fundamentals of the MIPS
architecture in this course, so that means we will be
covering MIPS assembly language
MIPS Architecture Design Principles
1. Simplicity favors regularity
Simple, similar instructions are easier to encode and
handle in hardware
2. Make the common case fast
MIPS architecture includes only simple, commonly used
instructions
3. Smaller is faster
Smaller, simpler circuits will execute faster than large,
complicated ones that implement a complex instruction
set
4. Good design demands good compromises
We just try to minimize their number
MIPS Assembly
Why MIPS and not x86?
Like x86, MIPS CPUs are used in real products, but mostly
embedded systems like network routers
The MIPS architecture is simpler than x86 architecture,
which makes the assembly language simpler and easier to
learn
Good textbooks and educational resources exist for MIPS
Once you learn one architecture and its assembly language,
others are easy to learn
Fun fact: MIPS was invented by John Hennessy, the former
president of Stanford University and alumnus of Stony
Brook's MS and PhD programs in computer science
Instructions: Addition
Java code: a = b + c;
MIPS assembly code: add a, b, c
add: mnemonic indicates operation to perform
b, c: source operands (on which the operation is to be
performed)
a: destination operand (to which the result is written)
In MIPS, a, b and c are actually CPU registers. More on this
soon.
Instructions: Subtraction
Java code: a = b - c;
MIPS assembly code: sub a, b, c
sub: mnemonic
b, c: source operands
a: destination operand
It should come as no surprise that the code is virtually the
same as for an addition
1. Simplicity favors regularity
Consistent instruction format
Same number of operands (two sources and one
destination)
Easier to encode and handle in hardware
Multiple Instructions
More complex code is handled by multiple MIPS instructions
Java code:
a = b + c - d;
MIPS assembly code:
add t, b, c    # t = b + c
sub a, t, d    # a = t - d
In MIPS assembly, the # symbol denotes a comment
2. Make the common case fast
More complex instructions (that are less common) are
performed using multiple simple instructions
MIPS is a reduced instruction set computer (RISC), with
a small number of simple instructions
Other architectures, such as Intel's x86, are complex
instruction set computers (CISC)
Operands
Operand location: physical location in computer
Registers: MIPS has thirty-two 32-bit registers
Faster than main memory, but much smaller
3. Smaller is faster: reading data from a small set of
registers is faster than from a larger one (simpler
circuitry)
MIPS is a 32-bit architecture because it operates on
32-bit data
Memory
Constants (also called immediates)
Included as part of an instruction
MIPS Register Set
Name       Register Number   Usage
$0         0                 the constant value 0
$at        1                 assembler temporary
$v0-$v1    2-3               function return values
$a0-$a3    4-7               function arguments
$t0-$t7    8-15              temporaries
$s0-$s7    16-23             saved variables
$t8-$t9    24-25             more temporaries
$k0-$k1    26-27             OS temporaries
$gp        28                global pointer
$sp        29                stack pointer
$fp        30                frame pointer
$ra        31                function return address
Operands: Registers
Registers:
$ before name
Example: $0, register zero, dollar zero
Registers are used for specific purposes:
$0 always holds the constant value 0.
The saved registers, $s0-$s7, are used to hold
variables
The temporary registers, $t0-$t9, are used to hold
intermediate values during a larger computation
We will discuss other registers later
Programming in MIPS assembly demands that you follow
certain rules (called conventions) when using registers.
Instructions with Registers
Let's revisit the add instruction
Java code: a = b + c;
MIPS assembly code:
# $s0 = a, $s1 = b, $s2 = c
add $s0, $s1, $s2
When programming in a high-level language like Java, you
generally don't (and shouldn't) comment every single line
of code
With MIPS assembly you should!
Assign meaning to registers and calculations
Even simple formulas will have to be implemented using
at least several lines of assembly code
So comment EVERYTHING
Operands: Memory
A typical program uses too much data to fit in only 32
registers
Store more data in memory
Memory is large, but slow
Commonly used variables kept in registers
Less frequently used values will need to be copied from
registers to memory for safe keeping when we run out
of registers
Later we will need to copy the values from memory back
to the register file when we need to do a calculation with
them
Operands: Memory
Each 32-bit data word has a unique 32-bit address
A word is the unit of data used natively by a CPU
A possible logical structure of main memory (word-
addressable), which is not how MIPS actually works:
Operands: Memory
Most computer architectures cannot read individual bits
from memory
Rather, the architecture's instruction set can process only
entire words or individual bytes
If the smallest unit of data we can read from memory is a
word, we say that the memory is word-addressable
In this case, memory addresses would be assigned
sequentially, as in the previous figure
The MIPS architecture, in contrast, is byte-addressable
Each byte has its own memory address
Operands: Memory
Byte-addressable memory (what MIPS uses)
Reading Word-Addressable Memory
A memory read is called a load
Mnemonic: load word (lw)
Format: lw $s0, 16($t1)
Address calculation:
add base address ($t1) to the offset (16)
so a register first needs to have the base address that we
want to add to the offset
effective address = ($t1 + 16)
Result: $s0 holds the value at address ($t1 + 16)
Any register may be used to hold the base address
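The address calculation above can be tried in a simulator. This is a minimal sketch, assuming a MARS-style assembler; the label nums and its contents are made up for illustration, and la, li and syscall are covered later in this unit.

```mips
        .data
nums:   .word 10, 20, 30, 40, 50   # hypothetical data: five words at offsets 0, 4, 8, 12, 16

        .text
main:
        la   $t1, nums        # $t1 = base address of nums
        lw   $s0, 16($t1)     # effective address = $t1 + 16, so $s0 = 50
        li   $v0, 10          # system call code for exit
        syscall
```

The offset is a byte count, so 16 selects the fifth word of the array.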
Reading Word-Addressable Memory
Example: suppose we want to read a word of data at
memory address 8 into $s3
address = ($0 + 8) = 8
So what we want is for
$s3 to hold 0x01EE2842
Assembly code:
# read memory word 2 into $s3
lw $s3, 8($0)
Writing Word-Addressable Memory
A memory write is called a store
Mnemonic: store word (sw)
Example: suppose we wanted to write (store) the value in
register $t4 into memory address 8
offset for loads and stores can be written in decimal
(default) or hexadecimal
add the base address ($0) to the offset (0x8)
address: ($0 + 0x8) = 8
Assembly code:
sw $t4, 0x8($0)    # write the value in $t4 to memory addr 8
Big-Endian & Little-Endian Memory
Each 32-bit word has 4 bytes. How should we number the
bytes within a word?
Little-endian: byte numbers start at the little (least
significant) end
Big-endian: byte numbers start at the big (most
significant) end
LSB = least significant byte; MSB = most significant byte
Word address is the same in either case
Big-Endian & Little-Endian Memory
Suppose $t0 initially contains 0x23456789
After following code runs, what value is in $s0?
sw $t0, 0($0)
lb $s0, 1($0)
Big-endian: 0x00000045
Little-endian: 0x00000067
The MIPS simulator we will use is little-endian
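The experiment above can be run as a small program. This is a sketch under a few assumptions: it targets a MARS-style simulator, and it stores to a labeled word in the data segment (the hypothetical label scratch) rather than to address 0, which a simulator may not let a program write.

```mips
        .data
scratch: .word 0               # hypothetical one-word storage area

        .text
main:
        li   $t0, 0x23456789   # value to store
        la   $t1, scratch      # $t1 = address of the storage word
        sw   $t0, 0($t1)       # write all four bytes
        lb   $s0, 1($t1)       # read back the byte at offset 1
                               # little-endian: $s0 = 0x00000067
                               # big-endian:    $s0 = 0x00000045
        li   $v0, 10           # exit
        syscall
```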
Byte-Addressable Memory
Each data byte has a unique address
Load/store words or single bytes: load byte (lb) and store
byte (sb)
A 32-bit word = 4 bytes, so word addresses increment by 4
So when performing a lw or sw, the effective address must
be a multiple of 4
Byte-Addressable Memory
When loading a byte, what do we do with the other 24 bits
in the 32-bit register?
lb sign-extends to fill the upper 24 bits
Suppose the byte loaded is zxxx xxxx (8 bits)
The bit z is copied into the upper 24 bits
Normally with characters we do not want to sign-extend the
byte, but rather prepend zeroes
This is called zero-extension
MIPS instruction that does zero-extension when loading
bytes:
load byte unsigned: lbu
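The difference between lb and lbu shows up only when the top bit of the byte is 1. This is a minimal sketch, assuming a MARS-style assembler; the label val and its value are chosen for illustration.

```mips
        .data
val:    .byte -116            # bit pattern 0x8C = 1000 1100, so z = 1

        .text
main:
        la   $t0, val         # $t0 = address of the byte
        lb   $s0, 0($t0)      # sign-extended: $s0 = 0xFFFFFF8C
        lbu  $s1, 0($t0)      # zero-extended: $s1 = 0x0000008C
        li   $v0, 10          # exit
        syscall
```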
Reading Byte-Addressable Memory
The address of a memory word must be a multiple of 4
For example,
the address of memory word #2 is 2 × 4 = 8
the address of memory word #10 is 10 × 4 = 40 (0x28)
So do not forget this: MIPS is byte-addressed, not word-
addressed!
To read/write a word from/to memory, your lw/sw
instruction must provide an effective address that is
word-aligned
Instruction Formats
4. Good design demands good compromises
Multiple instruction formats allow flexibility
add, sub: use 3 register operands
lw, sw: use 2 register operands and a constant
Number of instruction formats kept small
to adhere to design principles 1 and 3 (simplicity
favors regularity and smaller is faster)
Instruction Formats
lw and sw use constants or immediates
Immediately available from instruction
The immediate value is stored in the instruction as a 16-bit
two's complement number
addi: add immediate
Is subtract immediate (subi) necessary? No: the immediate can
be negative, so addi covers subtraction as well
Java code:
a = a + 4;
b = a - 12;
MIPS assembly code:
# $s0 = a, $s1 = b
addi $s0, $s0, 4
addi $s1, $s0, -12
Machine Language
Binary representation of instructions
Computers only understand 1s and 0s
32-bit instructions
Simplicity favors regularity: 32-bit data & instructions
3 instruction formats:
R-Type: register operands (register-type instruction)
I-Type: immediate operand (immediate-type
instruction)
J-Type: for jumping (jump-type instruction); more on
this later
R-Type Instructions
3 register operands:
rs, rt: source registers
rd: destination register
Other fields:
op: the operation code or opcode (0 for R-type)
funct: the function; with opcode, tells CPU what
operation to perform
shamt: the shift amount for shift instructions;
otherwise it's 0
R-Type Examples
Note the order of registers in the assembly code:
add rd, rs, rt
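As a worked illustration of the R-type fields, the comments below encode two instructions by hand. The field widths and funct codes are taken from the standard MIPS32 encoding rather than from the text above, so treat them as an assumption to check against your reference card.

```mips
# R-type field layout (bits): op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)

add $s0, $s1, $s2   # op=0, rs=17 ($s1), rt=18 ($s2), rd=16 ($s0), shamt=0, funct=32
                    # 000000 10001 10010 10000 00000 100000 = 0x02328020

sub $t0, $t3, $t5   # op=0, rs=11 ($t3), rt=13 ($t5), rd=8 ($t0), shamt=0, funct=34
                    # 000000 01011 01101 01000 00000 100010 = 0x016D4022
```

Note that the destination rd comes first in assembly but third among the register fields in machine code.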
I-Type Instructions
3 operands:
rs, rt: register operands
imm: 16-bit two's complement immediate
Other fields:
op: the opcode
Simplicity favors regularity: all instructions have opcode
Operation is completely determined by opcode
I-Type Examples
Note the differing order of registers in assembly and
machine codes:
addi rt, rs, imm
lw rt, imm(rs)
sw rt, imm(rs)
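The same hand-encoding exercise works for I-type instructions. The opcodes and field widths below come from the standard MIPS32 encoding, not from the text above; the register choices are arbitrary.

```mips
# I-type field layout (bits): op(6) rs(5) rt(5) imm(16)

addi $s0, $s1, 5    # op=8, rs=17 ($s1), rt=16 ($s0), imm=5
                    # 001000 10001 10000 0000000000000101 = 0x22300005

lw   $t2, 32($0)    # op=35, rs=0 ($0), rt=10 ($t2), imm=32
                    # 100011 00000 01010 0000000000100000 = 0x8C0A0020

sw   $s1, 4($t1)    # op=43, rs=9 ($t1), rt=17 ($s1), imm=4
                    # 101011 01001 10001 0000000000000100 = 0xAD310004
```

In each case rt is written first in assembly but sits after rs in the machine word.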
J-Type Instructions
26-bit address operand (addr)
Used for jump instructions (j)
if-statements, loops, functions
Review: Instruction Formats
Power of the Stored Program
32-bit instructions and data are stored in memory
To run a new program:
No rewiring required
Simply store new program in memory
Affords general purpose computing
Program execution:
Processor fetches (reads) instructions from memory in
sequence
Processor performs the specified operation and fetches
the next instruction
Interpreting Machine Code
Start with opcode: tells how to parse the rest
If the opcode is all 0s, we have an R-type instruction
Function bits (funct) indicate operation
Otherwise, opcode tells operation
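The decoding procedure can be practiced on two sample machine words. The words are chosen for illustration, and the opcode and funct values assume the standard MIPS32 encoding.

```mips
# 0x2237FFF1 = 001000 10001 10111 1111111111110001
#   op = 8 (nonzero, so not R-type; 8 is addi)
#   rs = 17 ($s1), rt = 23 ($s7), imm = 0xFFF1 = -15
addi $s7, $s1, -15

# 0x02F34022 = 000000 10111 10011 01000 00000 100010
#   op = 0 (R-type, so look at funct)
#   rs = 23 ($s7), rt = 19 ($s3), rd = 8 ($t0), shamt = 0, funct = 34 (sub)
sub  $t0, $s7, $s3
```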
MIPS Assembly Programming
There's a lot more to the MIPS instruction set still to cover,
but we (almost) know enough now to write some simple
programs that do computations
Every statement is divided into fields:
[Label:] operation [operands] [#comment]
Parts in square brackets are optional
A label is a sequence of alphanumeric characters,
underscores and dots. Cannot begin with a number. Ends
with a colon.
After the assembler has assembled (processed) your
code, the label refers to the address of where the line of
MIPS code is stored in memory
MIPS Memory Layout
[Figure: MIPS memory map, byte addresses. From low to high
addresses: Reserved (for OS functions) from 0x00000000; Text
Segment (program) from 0x00400000; Data Segment (static) from
0x10000000 to 0x1000FFFF; Dynamic Data above it, growing toward
higher addresses; Stack Segment at 0x7FFFFFFC, growing toward
lower addresses]
In MARS, static data starts at 0x10010000 and dynamic
data starts at 0x10040000.
MIPS Assembly Programming
The main label indicates the start of a program
Labels are also used to give names to locations in memory
where we want to store data (we will see this shortly)
Assembly programs also include assembler directives,
which start with a dot and give commands to the
assembler, but are not assembly language instructions
.text: beginning of the text segment
.data: beginning of the data segment
.asciiz: declares an ASCII string terminated by NULL
.ascii: an ASCII string, not terminated by NULL
.word: allocates space for one or more 32-bit words
.globl: the name that follows is visible outside the file
MIPS Assembly Programming
Strings (.asciiz and .ascii) are enclosed in quotes
They recognize escape sequences:
\n, \t, \r, etc.
Finally, we need some way of doing basic input and output
The computer's architecture does not handle those
responsibilities, but relies on the operating system
A system call is a request made by the program for the OS
to perform some service, such as to read input, print
output, quit the program, etc.
In our MIPS assembly programs we write syscall to
perform a system call
We have to include a numerical code (loaded into $v0) to
indicate the service requested
MIPS System Calls
Service        System Call Code   Arguments                     Result
print_int      1                  $a0 = integer
print_float    2                  $f12 = float
print_double   3                  $f12 = double
print_string   4                  $a0 = string
read_int       5                                                integer (in $v0)
read_float     6                                                float (in $f0)
read_double    7                                                double (in $f0)
read_string    8                  $a0 = buffer, $a1 = length
sbrk           9                  $a0 = amount
exit           10
MIPS System Calls
sbrk allocates memory in the heap (e.g., large chunks of
memory)
These are the original MIPS system calls
The SBU MARS simulator has a few custom ones you will
learn about later. These system calls are not available in the
vanilla version of MARS publicly available on the web.
Generating Constants
16-bit constants using addi:
Java code:
// int is a 32-bit signed word
int a = 0x4f3c;
MIPS assembly code:
# $s0 = a
addi $s0, $0, 0x4f3c
32-bit constants use load upper immediate (lui) and ori
(more on ori in a few minutes):
Java code:
int a = 0xFEDC8765;
MIPS assembly code:
# $s0 = a
lui $s0, 0xFEDC
ori $s0, $s0, 0x8765
MIPS Assembly Pseudoinstructions
MIPS implements the RISC concept
Relatively few, simple instructions
But there are some operations that assembly language
programmers need to do frequently that are not so natural
to write in native MIPS assembly instructions
These instructions can instead be written as a single
pseudoinstruction
Example: loading a 32-bit integer into a register requires
lui and ori
Instead we can use the li (load immediate)
pseudoinstruction
Example: li $v0, 4    # loads 4 into $v0
MIPS Assembly Pseudoinstructions
Another useful pseudoinstruction is la (load address)
Example: assume that str is a label (i.e., a memory
address)
la $a0, str # loads addr of str into $a0
The move pseudoinstruction copies the contents of one
register to another
move $1, $2 # equiv to add $1, $2, $0
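To make the idea concrete, the comments below show one way an assembler might expand each pseudoinstruction into native instructions. The exact expansion is an assumption: it depends on the assembler and on the operand values, and the label str is hypothetical.

```mips
        .data
str:    .asciiz "hi"            # hypothetical string so that la has a label to use

        .text
main:
        li   $t0, 4             # small constant: one instruction, e.g. addiu $t0, $0, 4
        li   $t0, 0x12345678    # 32-bit constant: e.g. lui $at, 0x1234
                                #                   then ori $t0, $at, 0x5678
        la   $a0, str           # address of str: typically a lui/ori pair built
                                # from the upper and lower halves of the address
        move $t1, $t0           # e.g. addu $t1, $0, $t0
        li   $v0, 10            # exit
        syscall
```

The register $at (assembler temporary) exists so that expansions like these have a scratch register available.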
MIPS Program: Hello World
No introduction to a new programming language would be
complete without the obligatory "hello world" program
Let's see how this is done in MIPS
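A minimal hello world sketch, assuming a MARS-style simulator and the system call codes from the table earlier in this unit. The label msg is our own choice, not a required name.

```mips
        .data
msg:    .asciiz "Hello, world!\n"   # NULL-terminated string to print

        .text
        .globl main
main:
        li   $v0, 4          # system call code 4 = print_string
        la   $a0, msg        # $a0 = address of the string
        syscall              # ask the OS to print it

        li   $v0, 10         # system call code 10 = exit
        syscall              # terminate the program
```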
Multiplication and Division
Special registers: lo, hi
32-bit × 32-bit multiplication produces a 64-bit result
mult $s0, $s1
Result in {hi, lo}
32-bit division produces a 32-bit quotient and a 32-bit
remainder
div $s0, $s1
Quotient in lo
Remainder in hi
Instructions to move values from lo/hi special registers
mflo $s2
mfhi $s3
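A small worked example of the lo and hi registers, assuming a MARS-style simulator. The operand values 17 and 5 are arbitrary.

```mips
        .text
main:
        li   $s0, 17
        li   $s1, 5

        mult $s0, $s1     # {hi, lo} = 17 * 5 = 85, so hi = 0 and lo = 85
        mflo $s2          # $s2 = 85 (low 32 bits of the product)

        div  $s0, $s1     # lo = 17 / 5 = 3, hi = 17 % 5 = 2
        mflo $s2          # $s2 = 3 (quotient)
        mfhi $s3          # $s3 = 2 (remainder)

        li   $v0, 10      # exit
        syscall
```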
MIPS Assembly Pseudoinstructions
Another useful pseudoinstruction that will help us write up
a program:
mul d, s, t    # d = s * t
mul d, s, t is equivalent to:
mult s, t
mflo d
Similar pseudoinstruction for div
MIPS Program: Compute Ax^2 + Bx + C
For the first version of this program we will hard-code the
values for the three coefficients and x
Major steps of our program:
1. Load values of A, B, C and x from memory into registers
2. Compute Ax^2 + Bx + C
Requires 5 total arithmetical operations
3. Print the result with an appropriate message
Requires several system calls
I am going to comment nearly every single line of MIPS
assembly code I write. You should do the same on your
homework!
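The three steps above can be sketched as follows. This is one possible version, not the program from lecture: the labels, the hard-coded values and the message text are all made up, and it assumes a MARS-style assembler that accepts lw with a label.

```mips
        .data
A:      .word 3                     # hypothetical hard-coded coefficients and x
B:      .word 4
C:      .word 5
x:      .word 2
msg:    .asciiz "Ax^2 + Bx + C = "

        .text
        .globl main
main:
        # 1. Load A, B, C and x from memory into registers
        lw   $s0, A             # $s0 = A
        lw   $s1, B             # $s1 = B
        lw   $s2, C             # $s2 = C
        lw   $s3, x             # $s3 = x

        # 2. Compute Ax^2 + Bx + C (five arithmetic operations)
        mul  $t0, $s3, $s3      # $t0 = x * x
        mul  $t0, $s0, $t0      # $t0 = A * x^2
        mul  $t1, $s1, $s3      # $t1 = B * x
        add  $t0, $t0, $t1      # $t0 = A*x^2 + B*x
        add  $t0, $t0, $s2      # $t0 = A*x^2 + B*x + C

        # 3. Print the message, then the result
        li   $v0, 4             # print_string
        la   $a0, msg
        syscall
        li   $v0, 1             # print_int
        move $a0, $t0           # $a0 = result
        syscall

        li   $v0, 10            # exit
        syscall
```

With the values shown, the result is 3*4 + 4*2 + 5 = 25.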
MIPS Program: Compute Ax^2 + Bx + C
For version 2 of the program we will add prompts to ask
the user to enter the values for x, A, B and C
Each prompt requires two system calls:
One to print the prompt message on the screen (if one is
desired/required)
Another to read the input
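The two system calls for a single prompt might look like this. It is a sketch only: the label promptX, the prompt text and the choice of $s3 for x are assumptions, and a MARS-style simulator is assumed.

```mips
        .data
promptX: .asciiz "Enter x: "    # hypothetical prompt text

        .text
main:
        li   $v0, 4             # print_string
        la   $a0, promptX       # $a0 = address of the prompt
        syscall                 # show the prompt

        li   $v0, 5             # read_int
        syscall                 # integer typed by the user arrives in $v0
        move $s3, $v0           # copy it out of $v0 right away: $s3 = x

        li   $v0, 10            # exit
        syscall
```

The value is moved out of $v0 immediately because the next system call needs $v0 for its own code.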
Logical Instructions
and, or, xor, nor
and: useful for masking bits
Masking out (excluding) all but the least significant byte
of a value: 0xF234012E AND 0x000000FF = 0x0000002E
Why? Let's see:
0xF234012E AND 0x000000FF
1111 0010 0011 0100 0000 0001 0010 1110
0000 0000 0000 0000 0000 0000 1111 1111
0000 0000 0000 0000 0000 0000 0010 1110
0000002E
Logical Instructions
or: useful for combining bit fields
Combine 0xF2340000 with 0x000012BC:
0xF2340000 OR 0x000012BC = 0xF23412BC
Written as bits:
0xF2340000 OR 0x000012BC
1111 0010 0011 0100 0000 0000 0000 0000
0000 0000 0000 0000 0001 0010 1011 1100
1111 0010 0011 0100 0001 0010 1011 1100
F23412BC
Logical Instructions
nor: useful for inverting bits
A NOR $0 = NOT A
andi, ori, xori
16-bit immediate is zero-extended (not sign-extended)
nori not needed (can use ori and nor)
Examples in a moment
Logical Instructions Examples
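A few worked examples of the logical instructions, assuming a MARS-style simulator. The operand values are arbitrary and were picked to make the bit patterns easy to follow.

```mips
        .text
main:
        li   $s1, 0xFFFF0000
        li   $s2, 0x46A1F0B7

        and  $s3, $s1, $s2      # $s3 = 0x46A10000 (keeps only the upper half)
        or   $s4, $s1, $s2      # $s4 = 0xFFFFF0B7
        xor  $s5, $s1, $s2      # $s5 = 0xB95EF0B7
        nor  $s6, $s1, $s2      # $s6 = 0x00000F48 (NOT of the or result)
        nor  $s7, $s2, $0       # $s7 = 0xB95E0F48 (NOT $s2)

        li   $t0, 0x000000FF
        andi $t1, $t0, 0xFA34   # immediate zero-extended: $t1 = 0x00000034
        ori  $t2, $t0, 0xFA34   # $t2 = 0x0000FAFF
        xori $t3, $t0, 0xFA34   # $t3 = 0x0000FACB

        li   $v0, 10            # exit
        syscall
```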
Shift Instructions
Allow you to shift the value in a register left or right by up to
31 bits
sll: shift left logical
Example: sll $t0, $t1, 5    # $t0 = $t1 << 5
Shifts bits to the left, filling least significant bits with zeroes
srl: shift right logical
Example: srl $t0, $t1, 5    # $t0 = $t1 >>> 5
Shifts zeroes into most significant bits
sra: shift right arithmetic
Example: sra $t0, $t1, 5    # $t0 = $t1 >> 5
Shifts sign bit into most significant bits
Shift Instructions Examples
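The three shifts can be compared on one value. This is a sketch for a MARS-style simulator; the starting value is arbitrary and has its sign bit set so that srl and sra give different results.

```mips
        .text
main:
        li   $t1, 0xFFFFFF00    # negative value (-256)

        sll  $t2, $t1, 4        # $t2 = 0xFFFFF000 (zeroes enter on the right)
        srl  $t3, $t1, 4        # $t3 = 0x0FFFFFF0 (zeroes enter on the left)
        sra  $t4, $t1, 4        # $t4 = 0xFFFFFFF0 (sign bit copied in on the left)

        li   $v0, 10            # exit
        syscall
```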
Variable-Shift Instructions
These R-type instructions shift bits by number in a register
sllv: shift left logical variable
sllv rd, rt, rs (note: rs and rt reversed)
rt has value to be shifted
5 least significant bits of rs give amount to shift (0-31)
Example: sllv $t0, $t1, $t2    # $t0 = $t1 << $t2
srlv: shift right logical variable
Example: srlv $t0, $t1, $t2    # $t0 = $t1 >>> $t2
srav: shift right arithmetic variable
Example: srav $t0, $t1, $t2    # $t0 = $t1 >> $t2
shamt field is ignored
Variable-Shift Instructions Examples
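A short sketch of the variable shifts, assuming a MARS-style simulator. The values are arbitrary; the first shift amount is deliberately larger than 31 to show that only the 5 least significant bits count.

```mips
        .text
main:
        li   $t1, 0x0000000F
        li   $t2, 36            # 36 = 100100 in binary; low 5 bits = 00100 = 4
        sllv $t0, $t1, $t2      # shifts by 4, not 36: $t0 = 0x000000F0

        li   $t3, 0xF0000000
        li   $t4, 4
        srlv $t5, $t3, $t4      # $t5 = 0x0F000000 (zeroes shifted in)
        srav $t6, $t3, $t4      # $t6 = 0xFF000000 (sign bit shifted in)

        li   $v0, 10            # exit
        syscall
```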
Rotate or Circular Shift
Bits are not lost when we rotate them (i.e., do a circular
shift)
They wrap around and enter the register from the other
end
These are pseudo-instructions:
rol: rotate left
ror: rotate right
Example: rol $t2, $t2, 4
Rotate left bits of $t2 by 4 positions:
1101 0010 0011 0100 0101 0110 0111 1000
0010 0011 0100 0101 0110 0111 1000 1101
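Because rol is a pseudoinstruction, it can be built from instructions already covered. The sequence below is one possible equivalent of rol $t2, $t2, 4, not necessarily the expansion a given assembler emits; the use of $t0 as scratch space is our own choice.

```mips
        .text
main:
        li   $t2, 0xD2345678    # 1101 0010 0011 0100 0101 0110 0111 1000

        srl  $t0, $t2, 28       # $t0 = 0x0000000D (the 4 bits that fall off the left)
        sll  $t2, $t2, 4        # $t2 = 0x23456780 (everything else moves left)
        or   $t2, $t2, $t0      # $t2 = 0x2345678D (wrapped bits re-enter on the right)

        li   $v0, 10            # exit
        syscall
```

For a 32-bit register, the two shift amounts always add up to 32.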
Applications of Bitwise Operators
The bitwise operations are useful in situations where we
have a set of Yes/No properties and using many Boolean
variables would waste memory
Example: file access flags in Unix/Linux
Network protocols: packets have very specific formats,
which may include many bits that need to be extracted to
determine how to process a packet
Compression algorithms sometimes work on a bit-by-bit
basis
Implementing a mathematical set of values: item i is
present in the set if bit i is 1; not present if the bit is 0
Bitwise Operator Examples
Suppose we want to isolate byte 0 (rightmost 8 bits) of a
word in $t0. Simply use this:
andi $t0, $t0, 0xFF
0001 0010 0011 0100 0101 0110 0111 1000
0000 0000 0000 0000 0000 0000 0111 1000
Bitwise Operator Examples
Suppose we want to isolate byte 1 (bits 8 to 15) of a word
in $t0. (Bits are numbered right-to-left.) We can use:
andi $t0,$t0,0xFF00
but then we still need to do a logical shift to the right by 8
bits. Why? To move the byte we have isolated into byte 0
and also to set bytes 1, 2 and 3 to all zeroes.
Could use instead:
sll $t0,$t0,16 *
srl $t0,$t0,24 **
0001 0010 0011 0100 0101 0110 0111 1000
0101 0110 0111 1000 0000 0000 0000 0000 *
0000 0000 0000 0000 0000 0000 0101 0110 **
Bitwise Operator Examples
In binary, multiplying by 2 is same as shifting left by 1 bit:
11 × 10 = 110
Multiplying by 4 is same as shifting left by 2 bits:
11 × 100 = 1100
Multiplying by 2^n is same as shifting left by n bits
Since shifting may be faster than multiplication, a good
compiler (e.g., C or Java) will recognize a multiplication by
a power of 2 and compile it as a shift:
a = a * 8; would compile as sll $s0, $s0, 3
Likewise, shift right to do integer division by powers of 2.
Remember to use sra, not srl. Why?
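The question above can be answered by trying both shifts on a negative number. This is a sketch for a MARS-style simulator, with -16 as an arbitrary test value.

```mips
        .text
main:
        li   $s0, -16           # 0xFFFFFFF0

        sra  $t0, $s0, 2        # $t0 = 0xFFFFFFFC = -4, which is -16 / 4
        srl  $t1, $s0, 2        # $t1 = 0x3FFFFFFC = 1073741820, sign lost

        li   $v0, 10            # exit
        syscall
```

One caveat: for negative values that are not exact multiples of the power of 2, sra rounds toward negative infinity (for example, -7 shifted right by 1 gives -4), while integer division in Java rounds toward zero (-7 / 2 is -3).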
MIPS Programming Tips
Initialize all your variables as needed (e.g., use li)
The MARS simulator fills the memory with zeroes, but
this is merely a convenience and luxury
When we test your homework programs, we may fill the
registers and main memory with garbage to make sure
you initialize registers with values!
Use the MARS debugger to fix problems with your code
The Registers view (on right) is especially useful
Use $s0-$s7 for local variables, and $t0-$t9 for
temporary values, such as intermediate results
We will see just how important this distinction is when
we study functions!