15-213 Recitation 7: Style, Valgrind, Blocking
Monday, October 11th, 2021
Logistics
Copyright By Assignmentchef assignmentchef
Code Reviews
Blocking
Valgrind / Intro to Git
Looking Ahead: Cache Lab
Cache lab is due tomorrow!
Come to office hours for help NO Midterm!
Code Reviews
Code Reviews
Why code reviews?
Used in industry Nearly all companies utilize code reviews
Systematic code reviews are highly effective at finding bugs
efficiently and effectively.
Code Reviews
Industry example from an embedded system machine critical pipeline flow device requiring high software quality
No code reviews / planned testing
The same team implemented testing and code reviews. This is a similar project done 5 years later.
Example from CMU 18-642 https://course.ece.cmu.edu/~ece642/ , Sourced from. 2005
Code Review Signup
All students in the course will receive an email with a link to signup for a code review timeslot.
All students will receive a final style score from 0-4 points
213 code reviews will be short (<= 15 minutes) and cover code styleand code quality.Code Style Properly document your code Function + File header comments, overall operation of large blocks, any tricky bits Write robust code check error and failure conditions Write modular code Use interfaces for data structures, e.g. create/insert/remove/free functions for a linked No magic numbers use #define or static const Formatting 80 characters per line (use Autolabs highlight feature to double-check) Consistent braces and whitespace No memory or file descriptor leaks Finding memory leaks – part of the style $ valgrind leak-resolution=high leak-check=fullshow-reachable=yes track-fds=yes ./myProgram arg1 arg Remember that Valgrind can be used for other things, like finding invalid reads and writes! Activity: ValgrindActivity Setup Split up into groups of 2-3 people One person needs a laptop Log in to a Shark machine, and type: $ wget https://www.cs.cmu.edu/~213/activities/rec7.tar $ tar -xvf rec7.tar 213_exam_answers.cExample: Matrix Multiplication /* multiply 4×4 matrices */void mm(int a[4][4], int b[4][4], int c[4][4]) {int i, j, k;for (i = 0; i < 4; i++)for (j = 0; j < 4; j++)for (k = 0; k < 4; k++)c[i][j] += a[i][k] * b[k][j]; Lets step through this to see whats actually happeningExample: Matrix Multiplication Assume a tiny cache with 4 lines of 8 bytes (2 ints) S = 1, E = 4, B = 8 Lets see what happens if we dont use blocking c[0][0] += a[0][0] * b[0][0]Key: Grey = accessedDark grey = currently accessing Red border = in cachec[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 0 2c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0]Key: Grey = accessedDark grey = currently accessing Red border = in cache00 10 20 300 0 0 1 0 2 0 3c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 0 2 0 3 1 0c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][0] * b[0][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 0 2 0 3 1 0 1 1c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] *b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 0 2 0 3 1 0 1 1 1 2c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] *b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] *b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] *b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1]Key: Grey = accessedDark grey = currently accessing Red border = in cacheWhat is the miss rate of a? 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] *b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1]Key: Grey = accessedDark grey = currently accessing Red border = in cacheWhat is the miss rate of a? What is the miss rate of b?Example: Matrix Multiplication (blocking) /* multiply 4×4 matrices using blocks of size 2 */void mm_blocking(int a[4][4], int b[4][4], int c[4][4]) {int i, j, k;int i_c, j_c, k_c;int B = 2;// control loopsfor (i_c = 0; i_c < 4; i_c += B)for (j_c = 0; j_c < 4; j_c += B)for (k_c = 0; k_c < 4; k_c += B)// block multiplicationsfor (i = i_c; i < i_c + B; i++)for (j = j_c; j < j_c + B; j++)for (k = k_c; k < k_c + B; k++)c[i][j] += a[i][k] * b[k][j]; Lets step through this to see whats actually happeningExample: Matrix Multiplication (blocking) Assume a tiny cache with 4 lines of 8 bytes (2 ints) S = 1, E = 4, B = 8 Lets see what happens if we now use blocking c[0][0] += a[0][0] * b[0][0]Key: Grey = accessedDark grey = currently accessing Red border = in cachec[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 1 0c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache00 10 20 300 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 1 0 1 1 0 0c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 1 0 1 1 0 0 0 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 1 0 1 1 0 0 0 1 1 0c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]Key: Grey = accessedDark grey = currently accessing Red border = in cache0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]c[0][0] += a[0][2] * b[2][0]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]8 0 9 0 10 0c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]80 90 100 110c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]8 0 9 0 10 0 11 0 12 1c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]iter i j k8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]iter i j k8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 14 1 1 2c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]iter i j k8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 14 1 1 2 15 1 1 3c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1] c[1][1] += a[1][3] * b[3][1]0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]8 0 9 0 10 0 11 0 12 1 13 1 14 1 15 10 2 0 3 1 2 1 3c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1]What is the miss rate of a?0 3 1 2 1 3 c[1][1] += a[1][3] * b[3][1] What is the miss rate of b? Introduction to GitVersion control is your friend What is Git? Most widely used version control system out there Version control: Help track changes to your source code over time Help teams manage changes on shared codeGit Commands Clone: git clone
Add: git add . or git add
Push / Pull: git push / git pull
Commit: git commit -m your-commit-message
Good commit messages are key!
Bad:commit, change, fixed
Good: Fixed buffer overflow potential in AttackLab
If you get stuck
Reread the writeup
Look at CS:APP Chapter 6
Review lecture notes (http://cs.cmu.edu/~213)
Come to Office Hours (Sunday to Friday, 6:00-10:00 PM Locations on Piazza)
Post private question on Piazza
man malloc, man valgrind, man gdb
Practice Problems
Class Question / Discussions
Well work through a series of questions
Write down your answer for each question You can discuss with your classmates
What Type of Locality?
The following function exhibits which type of
locality? Consider only array accesses.
void who(int *arr, int size) {
for (int i = 0; i < size-1; ++i)arr[i] = arr[i+1];A. B. C. D.Spatial TemporalBoth A and B Neither A nor BWhat Type of Locality? The following function exhibits which type oflocality? Consider only array accesses.void who(int *arr, int size) {for (int i = 0; i < size-1; ++i)arr[i] = arr[i+1];A. B. C. D.Spatial TemporalBoth A and B Neither A nor BWhat Type of Locality? The following function exhibits which type oflocality? Consider only array accesses.void coo(int *arr, int size) {for (int i = size-2; i >= 0; i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B
What Type of Locality?
The following function exhibits which type of
locality? Consider only array accesses.
void coo(int *arr, int size) {
for (int i = size-2; i >= 0; i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B
Calculating Cache Parameters
Given the following address partition, how many
int values will fit in a single data block?
# of int in block 0
Unknown: We need more info
B. index offset C. D.
Calculating Cache Parameters
Given the following address partition, how many
int values will fit in a single data block?
# of int in block 0
Unknown: We need more info
B. index offset C. D.
Direct-Mapped Cache Example
Assuming a 32-bit address (i.e. m=32), how many bits are used for tag (t), set index (s), and block offset (b).
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
tsb A.123 B. 27 2 3 C. 25 4 3 D. 1 4 8 E. 20 4 8
t s bits b
Tag Set index Block offset
Direct-Mapped Cache Example
Assuming a 32-bit address (i.e. m=32), how many bits are used for tag (t), set index (s), and block offset (b).
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
tsb A.123 B. 27 2 3 C. 25 4 3 D. 1 4 8 E. 20 4 8
t s bits b
Tag Set index Block offset
Which Set Is it?
Which set is the address 0xFA1C located in?
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
Set # for 0xFA1C 0
More than one of the above
Tag Set index Block offset
Which Set Is it?
Which set is the address 0xFA1C located in?
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
Set # for 0xFA1C 0
More than one of the above
Tag Set index Block offset
Cache Block Range
What range of addresses will be in the same block as
address 0xFA1C? 8 bytes
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
A. B. C. D. E.
Addr. Range
0xFA1C 0xFA23
0xFA1C 0xFA1F
0xFA18 0xFA1F
It depends on the access size (byte, word, etc)
Tag Set index Block offset
Cache Block Range
What range of addresses will be in the same block as
address 0xFA1C? 8 bytes
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
A. B. C. D. E.
Addr. Range
0xFA1C 0xFA23
0xFA1C 0xFA1F
0xFA18 0xFA1F
It depends on the access size (byte, word, etc)
Tag Set index Block offset
Cache Misses
If N = 16, how many bytes does the loop access of a?
int foo(int* a, int N)
int sum = 0;
for(i = 0; i < N; i++)sum += a[i]; }return sum; }Accessed BytesA4 B 16 C 64 D 256Cache MissesIf N = 16, how many bytes does the loop access of a?int foo(int* a, int N)int sum = 0;for(i = 0; i < N; i++)sum += a[i]; }return sum; }Accessed BytesA4 B 16 C 64 D 256 Cache MissesConsider a 32 KB cache in CS: assignmentchef QQ: 1823890830 Email: [email protected]
Reviews
There are no reviews yet.