CAA Homework 5


1 Programming

The programming part is for practice only; you do not need to hand in this part of the homework.

However, if you are interested in it, feel free to email the TAs to discuss it.

In this homework, we examine cache effects. The tool we'll use is rocket-chip. You can either build rocket-chip yourself or use the provided Docker image:

docker pull ntuca2020/hw5    # size ~ 8.28G
docker run --name=test -it ntuca2020/hw5
cd /root
ls

Folder structure for this homework:

emulator/                      // link to rocket-chip emulator
| benchmarks/                  // link to riscv-tests benchmark
| | Makefile                   // compile all benchmarks
| | qsort/                     // qsort benchmark folder
| | | qsort.riscv              // riscv executable
| | | qsort.riscv.dump         // objdump of riscv executable
| | mt-matmul/                 // mt-matmul benchmark
| | | mt-matmul.riscv          // riscv executable
| | | mt-matmul.riscv.dump     // objdump of riscv executable
| | mt-matmul_4/               // for part 2
| | | matmul.c                 // <- needs to be handed in
| | | mt-matmul_4.riscv        // riscv executable
| | | mt-matmul_4.riscv.dump   // objdump of riscv executable
| | ...                        // other benchmarks
| | common/
| | | crt.S                    // specify number of cores available
| system/                      // link to rocket-chip system
| | test.scala                 // first part SoC settings
| | HW5.scala                  // <- used for matrix multiplication; needs to be handed in
| | *.scala                    // other default scala settings
| build.sh                     // build all settings
| test.sh                      // test all settings
| spike_test.sh                // can test on spike first
| Config1                      // Configuration 1
| generated-src_Config1        // layout, RTL, mappings, dts, etc. for Config1
| Makefile                     // build the configuration

Part 1: Observing cache behavior

Run test.sh and fill in the cycle count for each benchmark under each setting in Table 1.

Answer the following questions (answers should be based on your observations of the cache configurations and the program behavior):

  • Why are the entries marked (1) the same or different?
  • Why are the entries marked (2) the same or different?
  • Why are the entries marked (3) the same or different?
  • Why are the entries marked (4) the same or different?
  • Why are the entries marked (5) the same or different?
  • Look at pmp.c in /root/emulator/benchmarks/pmp. What is this program trying to do, and how does it achieve it?
  • Change the number of cores available in the crt.S file (line 125) in /root/emulator/benchmarks/common and recompile the mt-matmul program (for this question, the matrix size is 32×32).
    • Report the cycle count of configuration17 on 1 core, configuration19 on 2 cores, and configuration20 on 4 cores (1%).
    • Describe whether the cycle count decreases linearly with the number of cores, and explain why or why not.
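One way to make the "decreases linearly" question concrete is to turn the measured cycle counts into speedup and parallel efficiency. The following is a minimal sketch; the function names are hypothetical helpers, not part of the provided benchmarks, and any numbers passed in are your own measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Speedup of an n-core run relative to the 1-core run: T1 / Tn. */
double speedup(uint64_t cycles_1core, uint64_t cycles_ncore)
{
    assert(cycles_ncore > 0);
    return (double)cycles_1core / (double)cycles_ncore;
}

/* Parallel efficiency: speedup divided by the core count.
   A value of 1.0 corresponds to perfectly linear scaling. */
double parallel_efficiency(uint64_t cycles_1core, uint64_t cycles_ncore,
                           unsigned ncores)
{
    assert(ncores > 0);
    return speedup(cycles_1core, cycles_ncore) / (double)ncores;
}
```

If the efficiency stays close to 1.0 as the core count grows, the cycle count is decreasing roughly linearly; a falling efficiency suggests overheads such as synchronization or contention in the memory system.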
dhrystone median multiply qsort rsort towers vvadd
Configuration 1 (4) (3) (1)
Configuration 2 (1)
Configuration 3 (2),(3)
Configuration 4 (2)
Configuration 5
Configuration 6 (4)
Configuration 7 (4)
Configuration 8
Configuration 9
Configuration 10
Configuration 11
Configuration 12 (5)
Configuration 13 (5)

Table 1: Benchmarks on different configurations

Part 2: Cache and matrix multiplication

In this part, we revisit matrix multiplication. You are asked to implement 64×64 matrix multiplication on a 4-core system with a 128-B L1-D$ and a 128-B L1-I$ (no L2). The cache size is fixed, so you can only change the way/set configuration of the L1 caches.
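Because the capacity is fixed, the way/set choice is constrained by the usual relation capacity = sets × ways × block size. The sketch below computes the number of sets for a candidate associativity and the set an address maps to. The helper names are hypothetical, and the block size is a parameter you must take from your actual configuration; nothing here is read from HW5.scala.

```c
#include <assert.h>
#include <stddef.h>

/* Number of sets for a cache of total_bytes with the given associativity
   and block size. Returns 0 if the parameters do not divide evenly. */
size_t cache_num_sets(size_t total_bytes, size_t ways, size_t block_bytes)
{
    if (ways == 0 || block_bytes == 0)
        return 0;
    size_t bytes_per_set = ways * block_bytes;
    if (total_bytes % bytes_per_set != 0)
        return 0;
    return total_bytes / bytes_per_set;
}

/* Set index that a byte address maps to. */
size_t cache_set_index(size_t addr, size_t num_sets, size_t block_bytes)
{
    assert(num_sets > 0 && block_bytes > 0);
    return (addr / block_bytes) % num_sets;
}
```

For example, with an assumed 16-byte block, a 128-byte cache can be organized as 8 sets × 1 way, 4 sets × 2 ways, and so on down to 1 set × 8 ways (fully associative). Addresses that share a set index compete for the same ways, which is what the matrix access pattern should be checked against.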

Change the dataset in /root/emulator/benchmarks/mt-matmul/mt-matmul.c to the 64×64 one (dataset2.h). The cache setting is specified in /root/emulator/system/HW5.scala, and you can build the simulator using

make -j8 CONFIG=freechips.rocketchip.system.HW5Config

in /root/emulator.

The matrix multiplication program is located at /root/emulator/benchmarks/mt-matmul/matmul.c. Each thread will enter this function with its thread id and local storage (128KB) and exit once the task is finished. You may want to see the files under mt-matmul/ and common/.

Consider both the distribution of the workload across cores and the cache behavior when you implement the matrix multiplication. Scoring is based on the cycle count produced by your HW5.scala and matmul.c.
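As a starting point for thinking about workload distribution, the following sketch splits the rows of the result across cores and orders the loops so that the inner loop walks memory with unit stride. It is an illustration only: the function name, the parameter list, and the `data_t` typedef are assumptions, and the real entry point in matmul.c may differ.

```c
#include <stddef.h>

typedef int data_t; /* assumption: element type; the benchmark may define its own */

/* Each core computes rows coreid, coreid + ncores, ... of C = A * B,
   where all matrices are n x n and stored row-major.
   The i-k-j loop order keeps the inner loop on consecutive elements
   of B and C instead of striding down a column of B. */
void matmul_sketch(size_t coreid, size_t ncores, size_t n,
                   const data_t A[], const data_t B[], data_t C[])
{
    for (size_t i = coreid; i < n; i += ncores) {
        /* Clear this core's output row before accumulating into it. */
        for (size_t j = 0; j < n; j++)
            C[i * n + j] = 0;

        for (size_t k = 0; k < n; k++) {
            data_t a = A[i * n + k]; /* reused across the whole j loop */
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += a * B[k * n + j];
        }
    }
}
```

Every core writes a disjoint set of rows, so no synchronization on C is needed in this scheme. Whether interleaved rows, contiguous row blocks, or a tiled variant performs best depends on the way/set configuration you choose, so it is worth measuring each rather than assuming.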

Grading:

  • Correctness
  • Based on cycle count
    • Ranking: Top 5
    • Ranking: 6-20
    • Ranking: 21-40
    • Ranking: 41-80
    • Ranking: > 80
  • A report on how you implemented your matrix multiplication, optionally with cache miss rate statistics collected using spike

Architecture and Security (0%)

Although it is important to design a high-performance architecture, it is also crucial to design a secure one. Read "Spectre Attacks: Exploiting Speculative Execution" (you may also want to refer to the original paper) and answer the following questions.

  • How is an attack that exploits conditional branch misprediction performed?
  • How is an attack that poisons indirect branches performed?
  • How can Spectre attacks be mitigated? (Give at least 3 methods.)
