Lecture 6:
Arithmetic 2/3
John Owens
Introduction to Computer Architecture UC Davis EEC 170, Winter 2021
Cascading Adders
Cascade Full Adders to make multibit adder:
A+B=S
A B
Cin
A B
Cin
A B
Cin
A B
Cin
Cout Sum
Cout Sum
Cout Sum
Cout Sum
A3
Full Adder
B3
A2
S3
Full Adder
B2
S2
A1
Full Adder
B1
S1
A0
Full Adder
B0
S0
40
UC Davis EEC 170, Winter 2021 / John Owens
But What About Performance? Critical path of one bitslice is CP
Critical path of n-bit rippled-carry adder is n*CP
Thought experiment:
@ 7 nm: FO4 is ~2.5 ps
2 gate delays * 64b = 320 ps
4 GHz clock period is 250 ps
Design Trick:
Throw hardware at it
A0 B0
A1 B1
A2 B2
A3 B3
CarryIn0
1-bit ALU
CarryOut0
Result0
Result1
Result2
Result3
CarryIn1
1-bit ALU
CarryOut1
CarryIn2
1-bit ALU
CarryOut2 CarryIn3
CarryOut3
41
UC Davis EEC 170, Winter 2021 / John Owens
1-bit ALU
Truth Table for Adder Bit Slice 3 inputs (A, B, Cin); 2 outputs (Sum, Cout)
A
B
Cin
Sum
Cout
0
0
0
0
0
0
0
1
1
0
0
1
0
1
0=Cin
0
1
1
0
1=Cin
1
0
0
1
0=Cin
1
0
1
0
1=Cin
1
1
0
0
1
1
1
1
1
1
42 UC Davis EEC 170, Winter 2021 / John Owens
A0 B0
A1 B1
A2 B2
A3 B3
Carry Look Ahead (Design trick: peek)
C0 = Cin
S
S
S
S
C4 = . . .
G
P
G = A and B P = A xor B WHY are these interesting?
43
UC Davis EEC 170, Winter 2021 / John Owens
A B 0 0
0 1
1 0
1 1
Cout
0 kill
Cin propagate Cin propagate 1 generate
G P
G P
C2 = G1 + G0 P1 + C0 P0 P1
C1 = G0 + C0 P0
G P
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
G P
CLA vs. Ripple
C0 = Cin
S
G = A and B P = A xor B
A0 B0
CarryIn0 A0
Result0
Result1
Result2
Result3
G P
B0 A1 A1
B1
B1
A2 B2
1-bit ALU
C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
G
P
CarryOut0 CarryIn1
S
1-bit ALU
G P
A2
A3 B3
CarryOut1 CarryIn2
S
1-bit ALU
G P
B2
A3 B3
CarryOut2 CarryIn3
1-bit ALU
CarryOut3
G P
S
C4 = . . .
44
UC Davis EEC 170, Winter 2021 / John Owens
Cascaded Carry Look-ahead (16-bit)
C
L C0
Abstraction!
Good exercise to work
A
4-bit Adder
4-bit Adder
4-bit Adder
G0 P0
C1 = G0 + C0 P0
out what G and P are for a 4-bit adder
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
G P
C4 = . . .
45
UC Davis EEC 170, Winter 2021 / John Owens
Design Trick: Guess (or Precompute)
CP(2n) = 2*CP(n)
n-bit adder
n-bit adder
CP(2n) = CP(n) + CP(mux)
n-bit adder
1
n-bit adder
0
n-bit adder
Cout
Carry-select adder
46
UC Davis EEC 170, Winter 2021 / John Owens
Carry Skip Adder: reduce worst case delay
B A4 B A0
P3 S P3 S
4-bit Ripple Adder
4-bit Ripple Adder
P2
P1
P0
P2
P1
P0
Just speed up the slowest case for each block
Exercise: optimal design uses variable block sizes (why?)
47 UC Davis EEC 170, Winter 2021 / John Owens
Adder Lessons
Reuse hardware if possible
+/- reuse is compelling argument for 2s complement
For higher performance:
Look for critical path, optimize for it Reorganize equations [propagate/generate / carry lookahead]
Precompute [carry save]
Reduce worst-case delay [carry skip]
48 UC Davis EEC 170, Winter 2021 / John Owens
Multiply (unsigned)
Paper and pencil example (unsigned):
Multiplicand 1000 Multiplier x1001 1000
0000 0000
1000 Product 01001000
m bits x n bits = m+n bit product
Binary makes it easy:
0 => place 0 ( 0 x multiplicand )
1 => place a copy ( 1 x multiplicand )
4 versions of multiply hardware & algorithm (see book):
successive refinement
49 UC Davis EEC 170, Winter 2021 / John Owens
3.3 Multiplication
m bits x n bits = m+n bit product
How do we store a 128b result from a 64b x 64b multiply?
50 UC Davis EEC 170, Winter 2021 / John Owens
Unsigned Combinational Multiplier
0000
A3 A2 A1 A0
A3 A2
A3 A2
A3 A2
P7 P6 P5 P4 P3 P2 P1 P0
B0 B1
B2 B3
A1
A0
A1
A0
A1
A0
51 UC Davis EEC 170, Winter 2021 / John Owens
Unsigned Combinational Multiplier
At each stage shift A left (multiply it by 2)
Use next bit of B to determine whether to add in shifted multiplicand (Stage i A3 accumulates A * 2i if Bi == 1)
A3 A2
0000
A3 A2 A1 A0 A2
B0 B1
B2 B3
A1
A0
A1
A0
Accumulate 2n bit partial product at each stage
A3 A2
A1
A0
Q: How much
hardware for 32 bit multiplier? Critical path?
P7 P6 P5 P4 P3 P2 P1 P0
52 UC Davis EEC 170, Winter 2021 / John Owens
Unsigned shift-add multiplier (version 1)
64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit multiplier reg
Multiplicand
64 bits
Shift Left
Multiplier
32 bits
Control
Shift Right
64-bit ALU
Product
64 bits
Write
Multiplier = datapath + control
53
UC Davis EEC 170, Winter 2021 / John Owens
Multiply Algorithm V1
Product
0000 0000 0011
1:0000 0010 0011 2:0000 0010 0011 3:0000 0010 0001 1:0000 0110 0001 2:0000 0110 0001 3:0000 0110 0000
0000 0110 0000
Start
1. Test Multiplier0
Multiplier Multiplicand 0000 0010
Multiplier0 = 1
Multiplier0 = 0
0000 0010
0000 0100
0000 0100 0000 0100 0000 1000 0000 1000
0000 1000
1a. Add multiplicand to product & place the result in Product register
54
UC Davis EEC 170, Winter 2021 / John Owens
2. Shift the Multiplicand register left 1 bit.
3. Shift the Multiplier register right 1 bit.
32nd repetition?
No: < 32 repetitionsYes: 32 repetitions DoneHow can we make this more efficient? Remember original combinational multiplier:0000A3 A2 A1 A0A3 A2A3 A2A3 A2P7 P6 P5 P4 P3 P2 P1 P0B0 B1B2 B3 A1A0A1A0A1A055 UC Davis EEC 170, Winter 2021 / John Owens Simply warp to let product move right…0000A3 A2 A1 A0A3 A2 A1 A0A3 A2 A1 A0A3 A2 A1 A0B0B1B2B3P7 P6 P5 P4 P3 P2 P1 P0 Multiplicand stays still and product moves right56 UC Davis EEC 170, Winter 2021 / John Owens Multiply Hardware Version 3 32-bit Multiplicand reg, 32 -bit ALU, 64-bit Product reg, (0-bit Multiplier reg)- Your book has details on thisMultiplicand 32 bits32-bit ALUProduct (Multiplier) 64 bitsShift RightWriteControl57UC Davis EEC 170, Winter 2021 / John Owens Faster MultiplierUses multiple adders- Cost/performance tradeoff Can be pipelined- Several multiplications performed in parallel58 UC Davis EEC 170, Winter 2021 / John Owens Motivation for Booths Algorithm Example 2 x 6 = 0010 x 0110:0010 x 0110+ + + +0000 shift (0 in multiplier) 0010 add (1 in multiplier) 0010 add (1 in multiplier)0000 shift (0 in multiplier) 000110059 UC Davis EEC 170, Winter 2021 / John Owens Motivation for Booths Algorithm ALU with add or subtract can get same result in more than one way: 6 = 4 + 2 = -2 + 80110 = -00010 + 01000 = 11110 + 01000 For example:0010 (2 [multiplier]) x 0110 (6 [multiplicand])0000 – 00100000 + 001000001100shift (0 in multiplier)sub (first 1 in multpl.)shift (mid string of 1s)add (prior step had last 1)60 UC Davis EEC 170, Winter 2021 / John Owens Booths Algorithm end of run Current Bit1 0 1 1 0 1 0 0011110Bit to the Right ExplanationBegins run of 1sMiddle of run of 1sEnd of run of 1sExample Op 0001111000 sub 0001111000 none 0001111000 add 0001111000 noneMiddle of run of 0s Originally for speed (when shift was faster than add)middle of run Replace a string of 1s in multiplier with an initial subtract when we first see a one and then later add for the bit after the last one Handles twos complement!beginning of run61 UC Davis EEC 170, Winter 2021 / John Owens Booths Example (2 x 7)Big picture: Treat 7 as 8 – 1Operation0. initial value1a. P = P – m 1b.2.3.4a. 4b.Multiplicand 0010Product0000 0111 0 1110 0111 0 1111 0011 1 1111 1001 1 1111 1100 1 0001 1100 1 0000 1110 0next?10 -> sub
shift P (sign ext) 11 -> nop, shift 11 -> nop, shift 01 -> add
shift done
Blue + red = multiplier
(red is 2 bits to compare); Green = arithmetic ops; Black = product
62 UC Davis EEC 170, Winter 2021 / John Owens
0010
0010
0010
0010
0010
0010
+ 1110->
+ 0010->
Booths Example (2 x -3)
Operation
0. initial value
1a. P = P -m 1b.
2a.
2b.
3a.
3b. 4a.
4b.
Blue + red = multiplier (red is 2 bits to compare); Green = arithmetic ops; Black = product
Multiplicand 0010
1110 + 1110 0010
0010
+ 1110
0010 0010
Product
0000 1101 0 1110 1101 0
01
next?
10 -> sub
shift P (sign ext) 01 -> add + 0010
1111 011
0001 011 0000 1011 0
01 shift P
10 -> sub shift
1110 1011 0
1111 0101 1 11 -> nop
0010
1111 0101 1 1111 1010 1
shift done
63
UC Davis EEC 170, Winter 2021 / John Owens
Radix-4 Modified Booths Algorithm
Current Bits
00 01 10 11
00 01 10 11
Bit to the Right
0 0 0 0
1 1 1 1
Explanation Example Recode
Middle of zeros Single one Begins run of 1s Begins run of 1s
Ends run of 1s Ends run of 1s Isolated 0 Middle of run
00 00 00 00 00 00 00 00 01 00 00 01 11 10 00 00 01 11 11 00
00 00 11 11 00 00 01 11 11 00 00 11 10 11 00 00 11 11 11 00
0
1 -2 -1
1
2 -1 0
Same insight as one-bit
Allows multiplication 2 bits at a time.
Booths, simply adjust for alignment of 2 bits.
64
UC Davis EEC 170, Winter 2021 / John Owens
RISC-V Multiplication Support
Four multiply instructions: mul: multiply
Gives the lower 64 bits of the product mulh: multiply high
Gives the upper 64 bits of the product, assuming the operands are signed
mulhu: multiply high unsigned
Gives the upper 64 bits of the product, assuming the operands are unsigned
mulhsu: multiply high signed/unsigned
Gives the upper 64 bits of the product, assuming one operand is signed and the other unsigned
Use mulh result to check for 64-bit overflow
65 UC Davis EEC 170, Winter 2021 / John Owens
RISC-V Support for multiply
If both the high and low bits of the same product are required, then the recommended code sequence is: MULH[[S]U] rdh, rs1, rs2; MUL rdl, rs1, rs2 (source register specifiers must be in same order andrdhcannot be the same asrs1or
rs2). Microarchitectures can then fuse these into a single multiply operation instead of performing two separate multiplies.
66 UC Davis EEC 170, Winter 2021 / John Owens
Multiplication Summary
Iterative algorithm Design techniques:
Analyze hardwarewhats not in use?
Spend more hardware to get higher performance
Booths Algorithmmore general (2s complement)
Booths Algorithmrecoding is powerful technique to think about problem in a different way
Booths Algorithmmore bits at once gives higher performance
67 UC Davis EEC 170, Winter 2021 / John Owens
RISC-V Support for divide
4 instructions:
-{div, divu, rem, remu} rd, rs1, rs2
If both the quotient and remainder are required from the same division, the recommended code sequence is: DIV[U] rdq, rs1, rs2; REM[U] rdr,
rs1, rs2 (rdq cannot be the same as rs1 or rs2). Microarchitectures can then fuse these into a single divide operation instead of performing two separate divides.
Overflow and division-by-zero dont produce errors
Just return defined results
Faster for the common case of no error
div: rs1 / rs2, treat as signed
divu: rs1 / rs2, treat as unsigned
rem: rs1 mod rs2, treat as signed
remu: rs1 mod rs2, treat as unsigned
72 UC Davis EEC 170, Winter 2021 / John Owens
MIPS Support for multiply/divide
Rather than target the general-purpose registers:
hi
mul placed its output into two specialhiandloregisters div placed its divide output intoloand its rem output into
general-purpose register)
MIPS providedmfloandmfhiinstructions (destination:
73 UC Davis EEC 170, Winter 2021 / John Owens
Divide: Paper & Pencil
1001 Quotient Divisor 1000 1001010 Dividend
100100
101
1010
1000
10 Remainder
(or Modulo result)
See how big a number can be subtracted, creating quotient bit on each step
Binary => 1 * divisor or 0 * divisor Dividend = Quotient x Divisor + Remainder 3 versions of divide, successive refinement
74 UC Davis EEC 170, Winter 2021 / John Owens
Divide Hardware Version 1
64-bit Divisor reg, 64-bit ALU, 64-bit Remainder reg, 32-bit Quotient reg
Divisor
64 bits
Shift Right
64-bit ALU
Remainder
64 bits
Write
Quotient
32 bits
Control
Shift Left
75
UC Davis EEC 170, Winter 2021 / John Owens
Divide Algorithm V1
Takes n+1 steps for n-bit Quotient & Rem.
Start: Place Dividend in Remainder
Remainder Quotient 0000 0111 0000
Divisor 0010 0000
Remainder 0
1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.
Test Remainder
Remainder < 0 2a. Shift the Quotient register to the left setting the new rightmostbit to 1.2b. Restore the original value by adding the Divisor register to the Remainder register, & place the sum in the Remainder register. Also shift the Quotient register to the left, settingthe new least significant bit to 0.restoring division 3. Shift the Divisor register right1 bit.Yes: n+1 repetitions (n = 4 here)Donen+1 repetition?No: < n+1 repetitions76UC Davis EEC 170, Winter 2021 / John Owens Divide Algorithm I example (7 / 2)Remainder QuotientDivisor 0000 0111 1:1110 0111 000002:0000 0111 0000 3:0000 0111 0000 1:1111 0111 00002:0000 0111 000000000010 0000 0010 00000010 00000001 00000001 00000001 00000000 10000000 10000000 10000000 01000000 01000000 01000000 00100000 00100000 00100000 0001Answer: Quotient = 3 Remainder = 100 0 000 03:0000 0111 000 1:11111111 0002:0000 0111 0000 0003:0000 0111 00000 1:0000 0011 00000 0001 2:0000 0011 03:0000 0011 00001 0001 2:0000 0001 000113:0000 0001 000111:0000 0001 077UC Davis EEC 170, Winter 2021 / John Owens Divide Hardware Version 3 32-bit Divisor reg, 32 -bit ALU, 64-bit Remainder reg, (0-bit Quotient reg)Divisor32 bits32-bit ALUHI LOShift LeftWriteRemainder(Quotient)Control64 bits78UC Davis EEC 170, Winter 2021 / John Owens Final Multiply / Divide Hardware Multiplicand 32 bits 32-bit ALU64 bits32 bits32-bit ALUHIShift RightWriteControl Product (Multiplier) Divisor LOShift LeftWriteRemainder (Quotient) Control 64 bits 79UC Davis EEC 170, Winter 2021 / John Owens SRT DivisionD. Sweeney of IBM, J.E. Robertson of the University of Illinois, and T.D. Tocher of Imperial College, London CurrentRemainderP-D (Partial Divisor) Plot9|9 43211111 8|8 42211110 7|7 32111100 6|6 32111000 5 | 5 2 1 1 1 0 00 0 4 | 4 2 1 1 0 0 00 0 3|3 11000000 2|2 10000000 1|1 00000000 0 +——————0 1 2 3 4 5 6 78 9 80 UC Davis EEC 170, Winter 2021 / John Owens SRT Division Intel Pentium divide implementation: SRT division with 2 bits/iteration (radix 4) Allows negative entries 1066 entries in lookup table81UC Davis EEC 170, Winter 2021 / John Owens[http://members.cox.net/srice1/ pentbug/introduction.html] Faster Division Cant use parallel hardware as in multiplier- Subtraction is conditional on sign of remainder Faster dividers (e.g. SRT division) generate multiple quotient bits per step- Still require multiple steps82 UC Davis EEC 170, Winter 2021 / John Owens Division Lessons In practice, slower than multiplication- Also less frequent- But, in the simple case, can use same hardware! Generates quotient and remainder together Floating-point division faster than integer division (why?) Similar hardware lessons as multiplier:- Look for unused hardware- Can process multiple bits at once at cost of extra hardware83 UC Davis EEC 170, Winter 2021 / John Owens RISC-V logical instructions Instruction Meaning Pseudocode XORI rd,rs1,imm SRAI rd,rs1,immExclusive Or Immediaterd ux(rs1) ux(imm) ORI rd,rs1,imm Or Immediaterd ux(rs1) ux(imm) ANDI rd,rs1,imm SLLI rd,rs1,immAnd Immediaterd ux(rs1) ux(imm)Shift Left Logical Immediaterd ux(rs1) ux(imm) SRLI rd,rs1,imm Shift Right Logical Immediaterd ux(rs1) ux(imm)SLL rd,rs1,rs2SRL rd,rs1,rs2Shift Right Arithmetic Immediaterd sx(rs1) ux(imm)Shift Left Logicalrd ux(rs1) rs2 XOR rd,rs1,rs2 Exclusive Orrd ux(rs1) ux(rs2)Shift Right Logicalrd ux(rs1) rs2 SRA rd,rs1,rs2 OR rd,rs1,rs2Shift Right Arithmeticrd sx(rs1) rs2Orrd ux(rs1) ux(rs2) AND rd,rs1,rs2 Andrd ux(rs1) ux(rs2)84 UC Davis EEC 170, Winter 2021 / John Owens Manipulating Bits with Shift, Or, And Mask out exponent with andS EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM& 0 11111111 00000000000000000000000————————————0 EEEEEEEE 00000000000000000000000Right shift 23 bits to get000000000000000000000000 EEEEEEEEDo arithmetic manipulation000000000000000000000000 ENEWENEWLeft shift 23 bits to get0 ENEWENEW 00000000000000000000000Zero out old exponentS EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM& 1 00000000 11111111111111111111111————————————S 00000000 MMMMMMMMMMMMMMMMMMMMMMMOr in new exponentS 00000000 MMMMMMMMMMMMMMMMMMMMMMM| 0 ENEWENEW 00000000000000000000000————————————S ENEWENEW MMMMMMMMMMMMMMMMMMMMMMM85 UC Davis EEC 170, Winter 2021 / John Owens Shift Operations Arithmetic operation:- Example: 00011 << 2 [3 left shift 2]- 00011 << 2 = 01100 = 12 = 2 * 4- Each bit shifted left == multiply by two- Example: 01010 >> 1 [10 right shift 1]
01010 >> 1 = 00101 = 5 = 10/2
Each bit shifted right == divide by two
Why?
Compilers do thisstrength reduction
86 UC Davis EEC 170, Winter 2021 / John Owens
Shift Operations
With left shift, what do we shift in?
00011 << 2 = 01100 (arithmetic)- 0000XXXX << 4 = XXXX0000 (logical)- We shifted in zeroesHow about right shift?- XXXX0000 >> 4 = 0000XXXX (logical) Shifted in zero
00110 (= 6) >> 1 = 00011 (3) (arithmetic) Shifted in zero
11110 (= -2) >> 1 = 11111 (-1) (arithmetic) Shifted in one
87 UC Davis EEC 170, Winter 2021 / John Owens
Shift Operations
How about right shift?
XXXX0000 >> 4 = 0000XXXX: Logical shift
Shifted in zero
00110 (= 6) >> 1 = 00011 (3)
11110 (= -2) >> 1 = 11111 (-1): Arithmetic shift
Shifted in sign bit
RISC-V supports both logical and arithmetic:
slli, srai, srli: Shift amount taken from within instruction (imm)
6 bits 6 bits 5 bits 3 bits 5 bits
7 bits 5 bits 5 bits 3 bits 5 bits
sll, sra, srl: shift amount taken from register (variable)
How far can we shift with slli/srai/slli? With sll/sra/srl?
7 bits
7 bits
funct6
shamt
rs1
funct3
rd
opcode
funct7
rs2
rs1
funct3
rd
opcode
88
UC Davis EEC 170, Winter 2021 / John Owens
Combinational Shifter from MUXes
Basic Building Block
AB sel
D
8-bit right shifter
A7 A6 A5 A4 A3 A2 A1 A0
S2 S1 S0
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
What comes in the MSBs?
How many levels for 64-bit shifter? What if we use 4-1 Muxes ?
R7 R6 R5 R4 R3 R2 R1 R0
89 UC Davis EEC 170, Winter 2021 / John Owens
10
General Shift Right Scheme using 16 bit example
S0 (0,1)
S1 (0,2)
S2 (0,4)
S3 (0,8)
If we added right-to-left connections, we could supportROTATE (not in RISC-V but found in other ISAs)
90 UC Davis EEC 170, Winter 2021 / John Owens
Funnel Shifter Shift A by i bits
Problem: Set Y, X, sa Logical:
Arithmetic: Rotate:
Left shifts:
Extract 64 bits of 128 (sa selects which bits)
| sa |
Y
X
Y
64
X
64
R
Shift Right
91
UC Davis EEC 170, Winter 2021 / John Owens
64 R
Barrel Shifter
Technology-dependent solutions: transistor per switch
in6
in5
in4
out3
out2
out1
out0
SR3
SR2 SR1 SR0
in3
in2 in1 in0
92
UC Davis EEC 170, Winter 2021 / John Owens
Shifter Summary
Shifts common in logical ops, also in arithmetic RISC-V has:
2 flavors of shift: logical and arithmetic
2 directions of shift: right and left
2 sources for shift amount: immediate, variable
Lots of cool shift algorithms, but
Barrel shifter prevalent in todays hardware
93 UC Davis EEC 170, Winter 2021 / John Owens
Floating Point
Representation for non-integral numbers
Including very small and very large numbers
Like scientific notation
2.34 1056
+0.002 104
+987.02 109 In binary
1.xxxxxxx2 2yyyy
Typesfloatanddoublein C
normalized
not normalized
94
UC Davis EEC 170, Winter 2021 / John Owens
3.5 Floating Point
Floating Point Standard
Defined by IEEE Std 754-1985
Developed in response to divergence of representations
Portability issues for scientific code Now almost universally adopted
Two representations
Single precision (32-bit)
Double precision (64-bit)
95 UC Davis EEC 170, Winter 2021 / John Owens
96 UC Davis EEC 170, Winter 2021 / John Owens
Floating-point Formats
Single-precision (32 bits)
Double precision (64 bits)
97 UC Davis EEC 170, Winter 2021 / John Owens
IEEE Floating-Point Format
x = (1)S (1+Fraction)2(ExponentBias)
S: sign bit (0 non-negative, 1 negative)
Normalize significand: 1.0 |significand| < 2.0- Always has a leading pre-binary-point 1 bit, so no need torepresent it explicitly (hidden bit)- Significand is Fraction with the 1. restoredFraction: single: 23 bits double: 52 bits Exponent: excess representation: actual exponent + Bias- Ensures exponent is unsigned- Single: Bias = 127; Double: Bias = 102398 UC Davis EEC 170, Winter 2021 / John OwensExponent: single: 8 bits double: 11 bits Single-Precision Range Exponents 00000000 and 11111111 reserved Smallest value- Exponent: 00000001 actual exponent = 1 127 = 126- Fraction: 000…00 significand = 1.0- 1.0 2126 1.2 1038 Largest value- exponent: 11111110 actual exponent = 254 127 = +127- Fraction: 111…11 significand 2.0- 2.0 2+127 3.4 10+3899UC Davis EEC 170, Winter 2021 / John Owens Double-Precision Range Exponents 0000…00 and 1111…11 reserved Smallest value- Exponent: 00000000001 actual exponent = 1 1023 = 1022- Fraction: 000…00 significand = 1.0- 1.0 21022 2.2 10308 Largest value- Exponent: 11111111110 actual exponent = 2046 1023 = +1023- Fraction: 111…11 significand 2.0- 2.0 2+1023 1.8 10+308100UC Davis EEC 170, Winter 2021 / John Owens Floating-Point Precision Relative precision- all fraction bits are significant- Single: approx 223- Equivalent to 23 log102 23 0.3 6 decimal digits ofprecision- Double: approx 252- Equivalent to 52 log102 52 0.3 16 decimal digits of precision101UC Davis EEC 170, Winter 2021 / John Owens Floating-Point Example Represent 0.75 = (0.5 + 0.25) = (1.0 + 0.5) / 2- -0.75 = (1)1 1.12 21 – S =1Fraction = 1000…002 – Exponent = 1 + Bias- Single: 1 + 127 = 126 = 011111102- Double: 1 + 1023 = 1022 = 011111111102 Single: 1011111101000…00 Double: 1011111111101000…00102UC Davis EEC 170, Winter 2021 / John Owens Floating-Point Example What number is represented by the single-precision float 11000000101000…00- S =1–Fraction = 01000…002Fxponent = 100000012 = 129 x = (1)1 (1 + .012) 2(129 127)= (1) 1.25 22 = 5.0103UC Davis EEC 170, Winter 2021 / John Owens Denormal Numbers Exponent = 000…0 hidden bit is 0x=(1)S (0+Fraction)2Bias Smaller than normal numbers- allow for gradual underflow, with diminishing precision Denormal with fraction = 000…0x=(1)S (0+0)2Bias =0.0Two representations of 0.0!104UC Davis EEC 170, Winter 2021 / John Owens Infinities and NaNs Exponent = 111…1, Fraction = 000…0- Infinity- Can be used in subsequent calculations, avoiding need for overflow check Exponent = 111…1, Fraction = 000…0- Not-a-Number (NaN)- Indicates illegal or undefined result- e.g., 0.0 / 0.0- Can be used in subsequent calculations105UC Davis EEC 170, Winter 2021 / John Owens Half-precision Floating Pointhttps://en.wikipedia.org/wiki/Bfloat16_floating-point_format 106 UC Davis EEC 170, Winter 2021 / John Owens
Reviews
There are no reviews yet.