Basic CUDA program demonstrating parallel vector addition.
This program adds two vectors element by element on the GPU. Each thread computes one element of the result vector, which illustrates the basic CUDA pattern of mapping one thread to one data element.
cd vector_add
nvcc main.cpp vector_add.cu -o vector_add
./vector_add

- The program generates two random vectors
- Launches a CUDA kernel where each thread computes one element:
  C[i] = A[i] + B[i]
- Uses 256 threads per block with appropriate grid sizing
- Demonstrates basic CUDA memory management (allocation, copying, freeing)
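
The steps listed above can be sketched roughly as follows. This is a minimal illustration of what a file like vector_add.cu might contain, not the repository's actual code: the names vectorAdd, blocksFor, check, and addOnGpu are hypothetical, and the sketch assumes float elements and input vectors of equal length. The grid size is assumed to be computed by ceiling division, so the last block may contain threads past the end of the data, which the bounds check in the kernel skips.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical kernel name: each thread adds one pair of elements.
__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global element index
    if (i < n) {  // the last block may have threads past the end
        c[i] = a[i] + b[i];
    }
}

// Number of blocks needed to cover n elements (ceiling division).
int blocksFor(int n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical host wrapper: allocate, copy in, launch, copy out, free.
// Assumes a and b have the same length.
std::vector<float> addOnGpu(const std::vector<float>& a,
                            const std::vector<float>& b) {
    const int n = static_cast<int>(a.size());
    std::vector<float> c(n);
    if (n == 0) return c;  // a launch with zero blocks is not valid

    const size_t bytes = n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;

    // Allocation on the device.
    check(cudaMalloc(&dA, bytes), "cudaMalloc A");
    check(cudaMalloc(&dB, bytes), "cudaMalloc B");
    check(cudaMalloc(&dC, bytes), "cudaMalloc C");

    // Copy the inputs from host to device.
    check(cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice), "copy A");
    check(cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice), "copy B");

    // 256 threads per block, with enough blocks to cover all n elements.
    const int threadsPerBlock = 256;
    vectorAdd<<<blocksFor(n, threadsPerBlock), threadsPerBlock>>>(dA, dB, dC, n);
    check(cudaGetLastError(), "kernel launch");

    // Copy the result back; this cudaMemcpy waits for the kernel to finish.
    check(cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost), "copy C");

    // Free device memory.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return c;
}
```

With this sketch, a host program could call addOnGpu(a, b) on two equally sized std::vector<float> inputs and compare the result against a plain CPU loop. For 1000 elements and 256 threads per block, blocksFor returns 4, so 1024 threads are launched and the last 24 do nothing.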