Sharing My Journey Learning CUDA
Introduction
Part 2 of 2
In this post, I will cover the differences between the CPU and the GPU to explain why GPUs were developed in the first place, and introduce some CUDA vocabulary needed to follow along.
CPU vs GPU
The Central Processing Unit (CPU) and Graphics Processing Unit (GPU) are well-known computing engines that are present in the majority of laptops, especially those used by gamers and AI enthusiasts. While they have a lot in common as they both include cores, memory, and control units, they serve different purposes: the CPU handles the main processing functions of a system and is designed for serial operations, while the GPU excels at parallel computing.
Here are some key differences between them:
| | CPU | GPU |
|---|---|---|
| Cores | Fewer but powerful cores | Thousands of smaller, efficient cores |
| Goal | Serial operations, low latency | Highly parallel tasks, throughput-intensive workloads |
| Memory | Large cache hierarchy (L1, L2, L3) | High-bandwidth memory, shared memory |
| Programming | Traditional sequential programming | Parallel programming (CUDA, ...) |
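To make the "serial vs parallel" row of the table concrete, here is a minimal sketch (my own illustration, with made-up function names) of the same vector addition written twice: once as a traditional sequential CPU loop, and once as a CUDA kernel where each GPU thread handles a single element.

```cuda
#define N 1024

// CPU style: one core walks through all N elements, one at a time.
void add_serial(const float *a, const float *b, float *c) {
    for (int i = 0; i < N; ++i)
        c[i] = a[i] + b[i];
}

// GPU style: launch N threads; each thread computes exactly one element.
__global__ void add_parallel(const float *a, const float *b, float *c) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's global index
    if (i < N)                                      // guard against extra threads
        c[i] = a[i] + b[i];
}
```

The GPU version would be launched with CUDA's triple-bracket syntax, e.g. `add_parallel<<<4, 256>>>(a, b, c);` for 4 blocks of 256 threads, so all 1024 additions can run concurrently instead of sequentially.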
Why GPUs were developed
The GPU was originally developed to handle graphics rendering, a task that is inherently parallel. Think about a screen with millions of pixels: each pixel's color can be computed independently. Doing this one by one on a CPU would be far too slow. So, GPU architects built processors with thousands of cores that could compute all those pixels at the same time.
CUDA Vocabulary
In CUDA programming, the CPU is called the host and the GPU is called the device.
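Here is a minimal sketch of how the host/device vocabulary shows up in real CUDA code (variable names are my own): data starts in host memory, is copied to device memory, processed by a kernel running on the device, and copied back to the host.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: runs on the device (GPU); each thread scales one element.
__global__ void scale(float *data, float factor) {
    data[threadIdx.x] *= factor;
}

int main() {
    const int n = 8;
    float host_buf[n] = {0, 1, 2, 3, 4, 5, 6, 7};  // lives in host (CPU) RAM
    float *dev_buf;                                 // will point into device (GPU) memory

    cudaMalloc(&dev_buf, n * sizeof(float));        // allocate on the device
    cudaMemcpy(dev_buf, host_buf, n * sizeof(float),
               cudaMemcpyHostToDevice);             // host -> device copy
    scale<<<1, n>>>(dev_buf, 2.0f);                 // launch kernel on the device
    cudaMemcpy(host_buf, dev_buf, n * sizeof(float),
               cudaMemcpyDeviceToHost);             // device -> host copy
    cudaFree(dev_buf);                              // release device memory

    for (int i = 0; i < n; ++i)
        printf("%.0f ", host_buf[i]);               // host prints the doubled values
    return 0;
}
```

Note how every buffer explicitly belongs to one side: the host cannot dereference `dev_buf` directly, and the device cannot touch `host_buf`, which is why the two `cudaMemcpy` calls are needed.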