- Navigate to the folder of the respective model
- Change the desired batch size in utils.h
- A. Run using
$ bash run.bash
- B. Alternatively, set the path to the CUDA Runtime Library of your system in the Makefile
- currently commented out: the path on my personal PC on line 5, and the path on the server on line 7
- and run using
$ make
$ ./fashion_prof.o <OR ./cifar.o>
L1 Conv → L2 Maxpool → L3 Step → L4 Conv → L5 Maxpool → L6 Step → L7 Flattening → L8 Gemm → L9 Step → L10 Gemm
- Layers 1, 2, 4, 5, 8, 10 on GPU using PROFILE XYZ
- Layers 3, 6, 7, 9 on CPU
L1 Conv → L2 Step → L3 Conv → L4 Maxpool → L5 Step → L6 Conv → L7 Step → L8 Conv → L9 Maxpool → L10 Step → L11 Conv → L12 Step → L13 Conv → L14 Maxpool → L15 Step → L16 Flattening → L17 Gemm → L18 Step → L19 Gemm
- Layers 1, 3, 4, 6, 8, 9, 11, 13, 14, 17, 19 on GPU using PROFILE XYZ
- Layers 2, 5, 7, 10, 12, 15, 16, 18 on CPU
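The default placement in both layer lists follows one pattern: Conv, Maxpool and Gemm layers run on the GPU, while Step and Flattening layers run on the CPU. The following is a minimal C++ sketch that encodes the two layer sequences and recomputes the GPU layer numbers. All function names are hypothetical and are not taken from net.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rule read off the layer lists in this README.
// Conv, Maxpool and Gemm default to the GPU; Step and Flattening to the CPU.
inline bool runs_on_gpu_by_default(const std::string& kind) {
    return kind == "Conv" || kind == "Maxpool" || kind == "Gemm";
}

// Returns the 1-based numbers of the layers that run on the GPU by default.
inline std::vector<int> gpu_layers(const std::vector<std::string>& net) {
    std::vector<int> out;
    for (std::size_t i = 0; i < net.size(); ++i) {
        if (runs_on_gpu_by_default(net[i])) {
            out.push_back(static_cast<int>(i) + 1);
        }
    }
    return out;
}

// The 10-layer sequence listed in this README.
inline std::vector<std::string> ten_layer_net() {
    return {"Conv", "Maxpool", "Step", "Conv", "Maxpool", "Step",
            "Flattening", "Gemm", "Step", "Gemm"};
}

// The 19-layer sequence listed in this README.
inline std::vector<std::string> nineteen_layer_net() {
    return {"Conv", "Step", "Conv", "Maxpool", "Step", "Conv", "Step",
            "Conv", "Maxpool", "Step", "Conv", "Step", "Conv", "Maxpool",
            "Step", "Flattening", "Gemm", "Step", "Gemm"};
}
```

Calling gpu_layers on either sequence should reproduce the GPU layer numbers given in the corresponding list; every remaining layer is a CPU layer.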
in net.cpp (X = layer number):
- comment (disable) the /* Layer X GPU */ lines
- uncomment (enable) the immediately following /* Layer X CPU */ lines
- Revert previous steps in case they were performed
- Do NOT comment out the STEP layers (i.e. the layers running on CPU by default)
- PROFILE X – data-images
- PROFILE Y – windows
- PROFILE Z – neurons
- PROFILE XY – data-images + windows
- PROFILE XZ – data-images + neurons
- PROFILE YZ – windows + neurons
- PROFILE XYZ – data-images + windows + neurons
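As a rough illustration of how the profile letters combine, the sketch below treats X, Y and Z as bit flags. It assumes that each letter in a profile marks a dimension that is parallelized on the GPU, and that the remaining dimensions are looped over sequentially. This is an assumption: the list above only names the dimensions. All names (parse_profile, split_work, WorkSplit) are hypothetical and are not taken from cuda_kernel.cu.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical bit flags, one per dimension named in the profile list.
constexpr unsigned kImages  = 1u;  // X: data-images
constexpr unsigned kWindows = 2u;  // Y: windows
constexpr unsigned kNeurons = 4u;  // Z: neurons

// Parse a profile name such as "XZ" into a bitmask.
// Returns 0 for an empty name, an unknown letter, or a repeated letter.
inline unsigned parse_profile(const std::string& name) {
    unsigned mask = 0;
    for (char c : name) {
        unsigned bit = 0;
        if (c == 'X') bit = kImages;
        else if (c == 'Y') bit = kWindows;
        else if (c == 'Z') bit = kNeurons;
        if (bit == 0 || (mask & bit) != 0) return 0;
        mask |= bit;
    }
    return mask;
}

struct WorkSplit {
    std::uint64_t parallel_items;     // independent work items (e.g. GPU threads)
    std::uint64_t serial_iterations;  // loop iterations left inside each item
};

// ASSUMPTION: a dimension named in the profile is parallelized;
// every other dimension is iterated sequentially inside each work item.
inline WorkSplit split_work(unsigned mask, std::uint64_t images,
                            std::uint64_t windows, std::uint64_t neurons) {
    WorkSplit w{1, 1};
    const std::uint64_t sizes[3] = {images, windows, neurons};
    const unsigned bits[3] = {kImages, kWindows, kNeurons};
    for (int i = 0; i < 3; ++i) {
        if ((mask & bits[i]) != 0) w.parallel_items *= sizes[i];
        else w.serial_iterations *= sizes[i];
    }
    return w;
}
```

For example, with made-up sizes of 64 images, 100 windows and 10 neurons, profile XZ would yield 640 parallel items that each loop over 100 windows under this assumption.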
in cuda_kernel.cu:
- Each layer is implemented separately for every profile, one profile after another
- Each profile can be found between delimiting lines:
=============
- Search for the desired profile using CTRL+F and uncomment (enable) all lines of code between its delimiting lines
- Don't forget to comment (disable) the previously used profile again (e.g. the default profile XYZ at the bottom of the file)