[Solved] IOC5009-Lab 2


1 Benchmarking Deep Neural Networks

People often estimate the number of parameters and multiply-accumulate (MAC) operations to gain insight into a neural network model. The results of this benchmarking help optimize both the computation of neural networks and neural network hardware designs.

This lab requires you to implement each layer of the VGG 16 model in a high-level programming language such as C/C++ or Python (NumPy and SciPy). You are not allowed to use any external DNN libraries (e.g., cuDNN, MKL-DNN) or DNN frameworks (e.g., PyTorch, TensorFlow, Keras) in your implementation. Follow the VGG 16 model architecture (see Table 1) to complete your forward-pass implementation; you only need to run it on the CPU. Furthermore, you also need to calculate the memory size of the inputs, the number of parameters, and the number of MAC operations in each layer. The batch size of this VGG 16 model is 1, the activation function in each CONV layer is ReLU, and the initial input is 224 x 224 x 3 with randomly generated values. Finally, fill your results into the VGG 16 Benchmark Table (see Table 2) and turn in your code and the completed table.
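As a starting point, a CONV layer with ReLU and a MAXPOOL layer can be written directly with NumPy loops. The sketch below assumes same-padding 3 x 3 convolutions with stride 1 (as in VGG 16) and an H x W x C tensor layout; the function names (`conv2d`, `maxpool2d`) are illustrative, not prescribed by the lab.

```python
import numpy as np

def conv2d(x, w, b, pad=1, stride=1):
    """Naive convolution on a single image x (H x W x C_in).
    w: kH x kW x C_in x C_out, b: C_out. Returns ReLU(conv(x, w) + b)."""
    kh, kw, cin, cout = w.shape
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h_out = (x.shape[0] + 2 * pad - kh) // stride + 1
    w_out = (x.shape[1] + 2 * pad - kw) // stride + 1
    y = np.zeros((h_out, w_out, cout))
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[i*stride:i*stride+kh, j*stride:j*stride+kw, :]
            # contract the kH, kW, C_in axes of the patch against the filters
            y[i, j, :] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2])) + b
    return np.maximum(y, 0)  # ReLU

def maxpool2d(x, size=2, stride=2):
    """2x2 max pooling with stride 2 over an H x W x C tensor."""
    h_out = x.shape[0] // stride
    w_out = x.shape[1] // stride
    y = np.zeros((h_out, w_out, x.shape[2]))
    for i in range(h_out):
        for j in range(w_out):
            y[i, j, :] = x[i*stride:i*stride+size,
                           j*stride:j*stride+size, :].max(axis=(0, 1))
    return y
```

Chaining these two functions layer by layer, following Table 1, yields the full forward pass; an FC layer is just a matrix-vector product plus bias on the flattened input.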

Table 1: VGG 16 Model Architecture

Layer     Input      Filter Size  # of Channels  # of Filters  Pool Size  Stride  Activation
INPUT     224 x 224               3              3
CONV      224 x 224  3 x 3        64             64                               ReLU
CONV      224 x 224  3 x 3        64             64                               ReLU
MAXPOOL                                                        2 x 2      2
CONV      112 x 112  3 x 3        128            128                              ReLU
CONV      112 x 112  3 x 3        128            128                              ReLU
MAXPOOL                                                        2 x 2      2
CONV      56 x 56    3 x 3        256            256                              ReLU
CONV      56 x 56    3 x 3        256            256                              ReLU
CONV      56 x 56    3 x 3        256            256                              ReLU
MAXPOOL                                                        2 x 2      2
CONV      28 x 28    3 x 3        512            512                              ReLU
CONV      28 x 28    3 x 3        512            512                              ReLU
CONV      28 x 28    3 x 3        512            512                              ReLU
MAXPOOL                                                        2 x 2      2
CONV      14 x 14    3 x 3        512            512                              ReLU
CONV      14 x 14    3 x 3        512            512                              ReLU
CONV      14 x 14    3 x 3        512            512                              ReLU
MAXPOOL                                                        2 x 2      2
FC-4096
FC-4096
FC-1000

Table 2: VGG 16 Benchmarking Table

 
Layer     Memory Size             # of Parameters  # of MAC Operations
INPUT     224 x 224 x 3 = 150K    0                0
CONV
CONV
MAXPOOL   112 x 112 x 64 = 800K   0
CONV
CONV
MAXPOOL                           0
CONV
CONV
CONV
MAXPOOL                           0
CONV
CONV
CONV
MAXPOOL                           0
CONV
CONV
CONV
MAXPOOL                           0
FC-4096
FC-4096
FC-1000
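The per-layer counts in the table follow standard formulas: a CONV layer with a k x k kernel over C_in input channels and C_out filters has k·k·C_in·C_out weights (plus C_out biases), and each of its H x W output positions costs k·k·C_in MACs per filter; an FC layer with N_in inputs and N_out outputs has N_in·N_out weights plus N_out biases and N_in·N_out MACs. A small helper is sketched below; whether biases count as parameters is a convention, so check the course's expectation.

```python
def conv_stats(h, w, k, cin, cout):
    """Parameter and MAC counts for a CONV layer with h x w output,
    k x k kernel, cin input channels and cout filters."""
    params = k * k * cin * cout + cout       # weights plus one bias per filter
    macs = h * w * k * k * cin * cout        # per output pixel, per filter
    return params, macs

def fc_stats(nin, nout):
    """Parameter and MAC counts for a fully-connected layer."""
    return nin * nout + nout, nin * nout

# First VGG 16 CONV layer: 224 x 224 output, 3 x 3 kernel, 3 -> 64 channels
p, m = conv_stats(224, 224, 3, 3, 64)
```

For that first CONV layer this gives 1,792 parameters and about 86.7 M MACs; summing over all rows of Table 2 reproduces the well-known ~138 M parameter total for VGG 16.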

Figure 1: The prototype of a systolic array accelerator

2 Systolic Array Architecture

The systolic array accelerator is tailored to the operations of DNN models. The accelerator contains a 2D array of processing elements (PEs) that computes matrix multiplications and convolutions, a unified multi-bank buffer decomposed into input, weight, and output buffers, and a SIMD vector unit for POOL, ACT, normalization, etc.

This lab requires you to implement a systolic array accelerator in Verilog. Figure 1 presents the prototype of the systolic array accelerator. The specification of the systolic array accelerator is as follows:

  1. The size of the PE array is 16 x 16.
  2. Each processing element (PE) takes one cycle to compute one MAC operation.
  3. The PE array can perform convolution and fully-connected operations in a systolic execution manner.
  4. There is no limit on buffer size or DRAM.
  5. There are 16 SIMD vector lanes. Each SIMD lane can complete a POOL or ACT operation in one cycle.

Table 3: TinyML Model Architecture

Layer     Input    Filter Size  # of Channels  # of Filters  Pool Size  Stride  Activation
INPUT     16 x 16               3              3
CONV      16 x 16  2 x 2        4              16                               ReLU
MAXPOOL                                                      2 x 2      2
CONV      8 x 8    3 x 3        1              8                                ReLU
FC-8

Table 4: TinyML Benchmarking Table

Layer     Cycles  Max PE Utilization
INPUT     0       0
CONV
MAXPOOL           0
CONV
FC-8

You need to run the TinyML model shown in Table 3 on this systolic array accelerator and complete Table 4. Note that max PE utilization indicates the maximum number of PEs used by each layer.
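For the PE-utilization column, one plausible estimate (an assumption, not part of the spec) maps a CONV layer onto the 16 x 16 array as a matrix multiplication: the k·k·C_in unrolled kernel elements span one array dimension and the C_out filters span the other, so utilization is capped by the array size in each dimension. Under that mapping:

```python
def max_pe_util(k, cin, cout, rows=16, cols=16):
    """Estimated max PEs used by a CONV layer on a rows x cols systolic array,
    assuming k*k*cin unrolled weights map to PE rows and cout filters map to
    PE columns (a weight-stationary-style mapping; other dataflows differ)."""
    return min(rows, k * k * cin) * min(cols, cout)

# TinyML CONV layers from Table 3, under the assumed mapping:
# first CONV:  2x2 kernel, 4 channels, 16 filters -> min(16,16)*min(16,16) = 256 PEs
# second CONV: 3x3 kernel, 1 channel,  8 filters  -> min(16,9)*min(16,8)   = 72 PEs
```

The cycle counts depend on the dataflow your Verilog design actually implements (fill and drain latency of the array, SIMD pooling cycles), so derive them from your own implementation rather than from this sketch.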

Finally, fill your results into the TinyML Benchmark Table (see Table 4) and turn in your code and the completed table.
