In Python, as in most programming languages, code executes sequentially: each statement waits for the previous one to complete before it runs. That sounds fine in theory, but in practice it can be a bottleneck in many real-world scenarios.
For example, let’s consider a web app that displays images of dogs from multiple sources. The user can view and then select as many images as they want to download. In terms of code, it would look something like this:
```python
for image in images:
    download(image)
```
This seems pretty straightforward, right? The user chooses a list of images, and the program downloads them in sequential order. If each image takes about 2 seconds to download and there are 5 images, the wait time is approximately 10 seconds. While an image is downloading, that is all your program does: wait for it to finish.
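To make that cost concrete, here is a minimal sketch of the sequential loop. The `download` function and the image names are assumptions: a real network request is simulated here with `time.sleep`:

```python
import time

def download(image: str) -> str:
    # Stand-in for a real network request; assume each
    # download takes about 2 seconds (simulated with sleep).
    time.sleep(2)
    return f"{image} downloaded"

images = ["dog1.jpg", "dog2.jpg", "dog3.jpg", "dog4.jpg", "dog5.jpg"]

start = time.perf_counter()
results = [download(image) for image in images]
elapsed = time.perf_counter() - start

print(f"Downloaded {len(results)} images in {elapsed:.1f} s")  # roughly 10 s
```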
But it’s unlikely that’s the only thing the user’s computer is doing. It might also be playing music, editing a picture, or running a game. All of this appears to happen simultaneously, but in fact the computer is switching rapidly between tasks. Every process a computer executes is split into chunks of work that the CPU queues and schedules. A CPU can use different scheduling strategies (let’s leave those for another article) to process each chunk so quickly that it looks like the computer is performing multiple tasks simultaneously, when in fact they are running concurrently.
Let’s get back to our dog images example. Now that we know the user’s computer can handle multiple tasks at once, how can we speed up the download? Well, we can tell the CPU that each download of an image can happen concurrently, and that one image does not have to wait for another to complete. This allows each image to be downloaded in a separate “thread.”
A thread is simply a separate flow of execution. Threading is the process of splitting up the main program into multiple threads that a processor can execute concurrently.
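As a sketch, the same downloads can be run in separate threads with the standard library’s `threading` module. The simulated `download` function below is an assumption (a `time.sleep` stand-in for a real network call):

```python
import threading
import time

def download(image: str) -> None:
    time.sleep(2)  # simulate a 2-second network transfer
    print(f"{image} downloaded")

images = ["dog1.jpg", "dog2.jpg", "dog3.jpg", "dog4.jpg", "dog5.jpg"]

start = time.perf_counter()
threads = [threading.Thread(target=download, args=(image,)) for image in images]
for t in threads:
    t.start()            # begin each download in its own thread
for t in threads:
    t.join()             # wait for every download to finish
elapsed = time.perf_counter() - start

print(f"All downloads finished in {elapsed:.1f} s")  # roughly 2 s, not 10
```

Because each thread spends its time waiting on (simulated) I/O, the five waits overlap, and the total time is close to that of a single download.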
Multithreading vs. Multiprocessing
By design, Python is a linear language. It does not take advantage of multiple CPU cores or the GPU by default, but it can be tweaked to do so. The first step is to understand the difference between multithreading and multiprocessing. A simple way to do this is to associate I/O-bound tasks with multithreading (e.g., tasks like disk read/writes, API calls and networking that are limited by the I/O subsystem), and associate CPU-bound tasks with multiprocessing (e.g., tasks like image processing or data analytics that are limited by the CPU’s speed).
While I/O tasks are in flight, the CPU sits idle. Threading makes use of this idle time to process other tasks. Keep in mind that threads created within a single process share the same memory, so access to shared data must be coordinated with locks.
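Because threads share memory, updates to shared state need a lock to avoid lost updates. A minimal sketch with `threading.Lock`:

```python
import threading

counter = 0
lock = threading.Lock()

def bump(n: int) -> None:
    global counter
    for _ in range(n):
        # The lock ensures only one thread mutates the counter at a time.
        with lock:
            counter += 1

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 -- no updates lost
```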
Python (specifically CPython) is not fully thread-safe on its own; it was designed with something called the GIL, or Global Interpreter Lock, which ensures that only one thread executes Python bytecode at a time. On the surface, this might suggest that Python programs cannot run work in parallel at all. However, since Python was invented, CPUs (and GPUs) with multiple cores have become standard. Today’s parallel programs take advantage of these multiple cores to run multiple tasks at once:
- With multithreading, all threads live inside a single process and share its memory; in CPython, the GIL means only one thread executes Python bytecode at any instant, though threads can overlap while waiting on I/O.
- With multiprocessing, each process gets its own interpreter and memory space; processes do not share memory by default, so they communicate by passing messages (e.g., through pipes or queues).
Now back to the GIL. By spawning separate processes on a modern multi-core machine, programs written in Python can achieve true parallelism with multiprocessing, and they can use multithreading to overlap I/O. Within any single process, however, the GIL still ensures that only one Python thread runs at a time. So, in summary, when programming in Python:
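One way to see the GIL in action is to time a CPU-bound task run serially versus in two threads. On a standard (GIL-enabled) CPython build, the threaded version is typically no faster, because only one thread executes bytecode at a time; the exact timings depend on your machine, so this is a sketch rather than a benchmark:

```python
import threading
import time

def count_down(n: int) -> None:
    # A purely CPU-bound loop: no I/O, so there is no idle
    # time for another thread to exploit.
    while n > 0:
        n -= 1

N = 5_000_000

# Serial: one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
serial = time.perf_counter() - start

# Threaded: two threads, but the GIL lets only one run bytecode at a time.
start = time.perf_counter()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print(f"serial: {serial:.2f} s, threaded: {threaded:.2f} s")
```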
- Use multithreading when you know the program will be waiting around for some external event (i.e., for I/O-bound tasks).
- Use multiprocessing when your code can safely use multiple cores and manage memory (i.e., for CPU-bound tasks).