Running Queries in Python Using Multiprocessing
Last Updated: 21 Nov, 2022
Before diving into running queries using multiprocessing, let's understand what multiprocessing is in Python. Multiprocessing lets a program use multiple CPU cores to run tasks/processes in parallel, which can significantly speed up computation-heavy work. In this article, we use this feature to run database queries in parallel.
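As a quick, standalone illustration (not part of the database example that follows), the `multiprocessing.Pool` class distributes calls to a function across worker processes:

```python
from multiprocessing import Pool

def square(x):
    # CPU-bound work to be distributed across worker processes
    return x * x

if __name__ == "__main__":
    # Pool() defaults to one worker per CPU core
    with Pool() as pool:
        # map() splits the iterable among the workers and
        # gathers the results back in order
        print(pool.map(square, range(10)))
        # → [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The `if __name__ == "__main__":` guard is required so that worker processes do not re-execute the pool setup when they import the module.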
Psycopg2 is the most popular PostgreSQL adapter for Python. It fully implements the Python DB API 2.0 specification and is thread-safe: the same connection can be shared by multiple threads.
Running queries in Python using multiprocessing involves two steps:
- First, establish a database connection using Psycopg2.
- Then use the multiprocessing module to run the queries in parallel, completing the task in less time.
Database used: root
Establishing Database connection using Python
Python3
import psycopg2

def run():
    try:
        conn = psycopg2.connect(database="root", user="root",
                                password="root", host="127.0.0.1",
                                port="5432")
        cursor = conn.cursor()
        # Placeholder query; replace with your own
        cursor.execute("SELECT 1;")
        print("Connection established to the database root")
        conn.close()
    except psycopg2.Error:
        print("Connection not established to the database")

run()
Output:
Connection established to the database root
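One caveat before parallelizing: a database connection must not be shared across processes, so each worker has to open its own (the article's second example does exactly that, by connecting inside `run()`). The sketch below demonstrates the same per-process-connection pattern using Python's built-in `sqlite3` module as a server-free stand-in for psycopg2; the table name `demo` and file name `demo.db` are illustrative:

```python
import os
import sqlite3
from multiprocessing import Pool

DB_FILE = "demo.db"  # illustrative file name

def query(n):
    # Each worker opens its own connection; connections
    # must never be shared between processes.
    conn = sqlite3.connect(DB_FILE)
    try:
        cur = conn.cursor()
        cur.execute("SELECT val FROM demo WHERE id = ?", (n,))
        return cur.fetchone()[0]
    finally:
        conn.close()

if __name__ == "__main__":
    # Set up a tiny table to query against
    conn = sqlite3.connect(DB_FILE)
    conn.execute("CREATE TABLE IF NOT EXISTS demo (id INTEGER, val TEXT)")
    conn.executemany("INSERT INTO demo VALUES (?, ?)",
                     [(i, f"row-{i}") for i in range(4)])
    conn.commit()
    conn.close()

    # Each worker process runs query(), opening its own connection
    with Pool(processes=2) as pool:
        results = pool.map(query, range(4))
    print(results)  # → ['row-0', 'row-1', 'row-2', 'row-3']
    os.remove(DB_FILE)
```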
Running Queries using multiprocessing in Python
Python3
import os
from multiprocessing import Pool, freeze_support

import psycopg2

def run():
    try:
        conn = psycopg2.connect(database="root", user="root",
                                password="root", host="127.0.0.1",
                                port="5432")
        cursor = conn.cursor()
        # Placeholder query; replace with your own
        cursor.execute("SELECT 1;")
        records = cursor.fetchall()
        conn.close()
        return records
    except psycopg2.Error:
        print("Connection not established to the database")
        return -1

if __name__ == "__main__":
    freeze_support()
    print("Enter the number of times to run the above query")
    n = int(input())
    results = []
    # Leave one core free for the operating system
    with Pool(processes=os.cpu_count() - 1) as pool:
        for _ in range(n):
            # apply_async() schedules run() on a worker without blocking
            res = pool.apply_async(run)
            results.append(res)
        # get() blocks until each result is ready
        res = [result.get() for result in results]
        print(res)
        pool.close()
        pool.join()
Output:

A list containing the records fetched by each of the n query runs.
Conclusion
Sometimes you can speed things up by parallelizing them: break the problem into small units and distribute those units among different workers (processes) so they are solved concurrently. The achievable speedup depends on the number of CPU cores, which varies by machine; most laptops have at least 4 and many have 8, while desktop PCs can have as many as 32. By default a Python process runs on one core at a time, but with the help of the multiprocessing package we can put as many cores to work as we want.
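For instance, a CPU-bound function can be spread over all available cores with `Pool` and `os.cpu_count()`. This is a minimal sketch; the workload here is just busy arithmetic standing in for real computation:

```python
import os
from multiprocessing import Pool

def busy_sum(n):
    # Deliberately CPU-bound: sum the first n integers in a loop
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    workloads = [100_000] * 8
    # Leave one core free for the OS, as in the article's example
    with Pool(processes=max(1, os.cpu_count() - 1)) as pool:
        results = pool.map(busy_sum, workloads)
    print(results[0])  # → 4999950000
```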