
Understanding ‘fork’ and ‘spawn’ in Python Multiprocessing

Last Updated : 06 Dec, 2023

Python’s multiprocessing module lets a program use multiple processor cores by running work in separate processes, which can speed up computationally intensive tasks. New processes can be created using several different start methods. The start method determines how a child process is created and how much of the parent’s state it begins with.

Start Methods

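A start method is the mechanism Python uses to create a child process. There are three start methods:

  • Fork: the child starts as a copy of the parent process, so it sees the parent’s state (including any changes the parent has made) as of the moment of the fork.
  • Spawn: a fresh Python interpreter process is started. The child does not inherit the parent’s in-memory state and receives only what it needs to run its target.
  • Forkserver: a separate server process is started once, and each new child is forked from that server rather than from the parent, which provides isolation together with efficient process creation.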
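
Which of these methods exist depends on the platform. As a minimal sketch (the variable name available_methods is only illustrative), the standard-library function multiprocessing.get_all_start_methods() reports the methods supported on the current system:

```python
import multiprocessing

# Start methods supported on this platform; the first entry is the default.
available_methods = multiprocessing.get_all_start_methods()

for method in available_methods:
    print(method)
```

The printed list varies by operating system and Python version, so no fixed output is shown here.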

You can check which start method the multiprocessing module is currently using with the following snippet:

Python3




import multiprocessing
print(multiprocessing.get_start_method())


Output (on a Linux system where fork is the default)

fork





You can select a start method explicitly with set_start_method(). It should be called at most once per program, before any child process is created:

Python3




import multiprocessing
multiprocessing.set_start_method('spawn')
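
One caveat is worth illustrating: set_start_method() raises RuntimeError if a start method has already been fixed for the program. The sketch below wraps the call in a hypothetical helper, set_method_once (not part of the library), to show that behaviour. Passing force=True to set_start_method() is the library’s way to override an earlier choice.

```python
import multiprocessing


def set_method_once(method):
    """Try to set the start method; return False if one was already fixed."""
    try:
        multiprocessing.set_start_method(method)
        return True
    except RuntimeError:
        # Raised when the start method has already been set for this program.
        return False


if __name__ == "__main__":
    print(set_method_once("spawn"))  # True in a fresh program
    print(set_method_once("spawn"))  # False: the method is already fixed
    # force=True replaces the earlier choice instead of raising.
    multiprocessing.set_start_method("spawn", force=True)
    print(multiprocessing.get_start_method())
```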


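The default start method depends on the operating system, and not every start method is available on every platform (for example, fork and forkserver are not available on Windows).

The default start methods on the major platforms are:

  • Windows: spawn
  • macOS: spawn (since Python 3.8)
  • Linux: fork (through Python 3.13; Python 3.14 changed the default to forkserver)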
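
Because availability differs by platform, code that prefers a particular start method may want a fallback. The following is a minimal sketch, assuming a hypothetical helper pick_context (not a library function), that uses multiprocessing.get_context() to return a context for the first supported method without changing the program-wide setting:

```python
import multiprocessing


def pick_context(preferred):
    """Return a context for the first method in `preferred` that is supported."""
    supported = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in supported:
            # get_context() returns an object with the same API as the
            # multiprocessing module, bound to the given start method.
            return multiprocessing.get_context(method)
    raise ValueError(f"none of {preferred} is supported here: {supported}")


if __name__ == "__main__":
    ctx = pick_context(["fork", "spawn"])
    print(ctx.get_start_method())
```

Processes are then created with ctx.Process(...) instead of multiprocessing.Process(...), which leaves the global start method untouched.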

The rest of this article focuses on the fork and spawn methods.

Fork

The fork method creates the child as a copy of the parent process. Any state the parent has set up or changed before the fork is therefore visible in the child.


For Example:

Suppose you declare a global variable num with the value 10 and then change it in the parent process before creating a child process. A forked child inherits the updated value of num. In other words, the child receives a copy of the parent process as it exists at the moment of the fork, including any changes already made to its state.

Explanation

The script initializes num to 10 at module level. The main process updates it to 20 and then starts the child. Because the ‘fork’ method gives the child a snapshot of the parent’s memory, including variable values, the child prints 20 and then increments its own copy to 21. That increment does not affect the parent, which still prints 20 at the end: after the fork, each process works on its own copy of the variable.

Python3




import multiprocessing

# initialize the value
num = 10


def childprocess():
    # refer to the global variable
    global num
    print(f"In child process before update: {num}")

    # update the child's copy of num
    num += 1
    print(f"In child process after update: {num}")


def mainprocess():
    # refer to the global variable
    global num
    print(f"In parent process before update: {num}")

    # update num before the child is created
    num = 20
    print(f"In parent process after update: {num}")

    # start the child and wait for it to finish
    process = multiprocessing.Process(target=childprocess)
    process.start()
    process.join()
    print(f"At the end the value is: {num}")


if __name__ == '__main__':
    # set the start method to fork (not available on Windows)
    multiprocessing.set_start_method('fork')
    print(multiprocessing.get_start_method())
    mainprocess()


Output

fork
In parent process before update: 10
In parent process after update: 20
In child process before update: 20
In child process after update: 21
At the end the value is: 20
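
The same behaviour can be checked programmatically. The sketch below is an illustration rather than part of the script above: a hypothetical function child_view_with_fork changes num in the parent, forks a child through get_context('fork'), and has the child send back the value it sees over a pipe. It returns None on platforms where fork is unavailable, such as Windows.

```python
import multiprocessing

num = 10


def _send_num(conn):
    # Runs in the child: report the value of num that the child sees.
    conn.send(num)
    conn.close()


def child_view_with_fork():
    """Return num as seen by a forked child, or None if fork is unsupported."""
    global num
    if "fork" not in multiprocessing.get_all_start_methods():
        return None
    ctx = multiprocessing.get_context("fork")
    num = 20  # changed in the parent before the child exists
    receiver, sender = ctx.Pipe(duplex=False)
    process = ctx.Process(target=_send_num, args=(sender,))
    process.start()
    sender.close()  # the parent only reads from the pipe
    seen = receiver.recv()
    process.join()
    return seen


if __name__ == "__main__":
    print(child_view_with_fork())
```

Where fork is available, the function should return 20, matching the "In child process before update" line in the output above.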


Spawn

The spawn method starts a brand-new Python interpreter process. Unlike with fork, state that the parent has set up or changed in memory is not carried over to the child.

[Figure: spawn start method]

For Example:

Suppose you declare a global variable num with the value 10 and then change it to 20 in the parent process before creating a child process. A spawned child sees num as 10, not the parent’s updated value of 20. This is because spawn starts a fresh process that does not inherit the parent’s changes.

Explanation

The script initializes num to 10 at module level. The main process updates it to 20 and then starts the child. With the ‘spawn’ method the child is a fresh interpreter that imports the script again, so the module-level assignment runs anew and num starts at 10 in the child. The parent’s update is made in code reached only through the if __name__ == '__main__' block, so it is not repeated there. The child prints 10 and increments its own copy to 11. As with fork, that change stays in the child and does not affect the parent, which still prints 20 at the end.

Python3




import multiprocessing

# initialize the value
num = 10


def childprocess():
    # refer to the global variable
    global num
    print(f"In child process before update: {num}")

    # update the child's copy of num
    num += 1
    print(f"In child process after update: {num}")


def mainprocess():
    # refer to the global variable
    global num
    print(f"In parent process before update: {num}")

    # update num before the child is created
    num = 20
    print(f"In parent process after update: {num}")

    # start the child and wait for it to finish
    process = multiprocessing.Process(target=childprocess)
    process.start()
    process.join()
    print(f"At the end the value is: {num}")


if __name__ == '__main__':
    # set the start method to spawn
    multiprocessing.set_start_method('spawn')
    print(multiprocessing.get_start_method())
    mainprocess()


Output

spawn
In parent process before update: 10
In parent process after update: 20
In child process before update: 10
In child process after update: 11
At the end the value is: 20
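
Since a spawned child does not see the parent’s in-memory changes, any value the child needs has to be handed over explicitly, for example through the args parameter of Process. The sketch below illustrates this with hypothetical helpers (describe, _child and run_in_spawned_child are names chosen for this example). Arguments must be picklable, because spawn sends them to the new interpreter.

```python
import multiprocessing

num = 10


def describe(value):
    """Build the message the child reports for the value it was given."""
    return f"child received num = {value}"


def _child(value, conn):
    # Runs in the spawned child: `value` arrives as an argument, not as
    # inherited global state.
    conn.send(describe(value))
    conn.close()


def run_in_spawned_child(value, timeout=30):
    """Start a spawned child with `value`; return its message or None."""
    ctx = multiprocessing.get_context("spawn")
    receiver, sender = ctx.Pipe(duplex=False)
    process = ctx.Process(target=_child, args=(value, sender))
    process.start()
    sender.close()  # the parent only reads from the pipe
    try:
        # poll() waits up to `timeout` seconds for data from the child.
        return receiver.recv() if receiver.poll(timeout) else None
    except EOFError:
        return None  # the child exited without sending anything
    finally:
        process.join()


if __name__ == "__main__":
    num = 20  # updated in the parent only
    print(run_in_spawned_child(num))
```

Run as a script, this should report the value 20 from the child, because the value travels as an argument instead of relying on inherited globals.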





Difference between Fork and Spawn methods

Fork

  • It is available only on Unix-like systems and is the default on Linux through Python 3.13.
  • It creates child processes that inherit the entire memory space of the parent process. Startup is fast, but memory use can grow as the copied pages are modified.
  • Best suited for situations where the child needs a lot of data or state that the parent has already set up, since nothing has to be re-created or sent to the child.
  • Use cases include parallelizing tasks over data already loaded in the parent on Unix-based systems.

Spawn

  • It is the default method on Windows and macOS and is also available on other Unix-like systems.
  • It creates child processes with a clean memory space, which avoids inheriting unwanted state from the parent and provides more isolation, at the cost of slower startup.
  • Suitable for scenarios where you want to avoid interference between parent and child processes and favor safety and stability.
  • Use cases include ensuring process isolation, writing code that behaves the same on every platform, and starting processes from a multithreaded parent.

Summary

  • Fork and spawn are two different mechanisms for creating a new process.
  • A forked child inherits the changes the parent made before the fork.
  • A spawned child does not inherit the changes made by the parent.
  • With both fork and spawn, changes made by the child process are not reflected in the parent, because parent and child are separate processes with their own memory once the child has been created.
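
When the parent does need to observe a change made by the child, the change has to travel through an explicit channel. As one possible sketch (the function names are illustrative), multiprocessing.Value places an integer in shared memory that both processes can update. It is passed to the child as an argument, so the example does not depend on a particular start method.

```python
import multiprocessing


def increment(shared):
    # get_lock() guards the read-modify-write so concurrent updates are safe.
    with shared.get_lock():
        shared.value += 1


def run_child_increment(start):
    """Let a child process increment a shared integer; return the result."""
    shared = multiprocessing.Value("i", start)  # "i" = signed C int
    process = multiprocessing.Process(target=increment, args=(shared,))
    process.start()
    process.join()
    return shared.value


if __name__ == "__main__":
    print(run_child_increment(20))
```

Unlike the plain global in the earlier scripts, the shared value seen by the parent after join() reflects the child’s increment.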

