Open In App

Running Queries in Python Using Multiprocessing

Before diving into running queries using multiprocessing let’s understand what multiprocessing is in Python. Multiprocessing enables the computer to utilize multiple cores of a CPU to run tasks/processes in parallel. This parallelization leads to significant speedup in tasks that involve a lot of computation. Some of you might be wondering why don’t we use this feature to our greater advantage.

Psycopg2 is the most popular PostgreSQL adapter used in  Python.  It works on the principle of the whole implementation of Python DB API 2.0 along with thread safety (the same connection is shared by multiple threads).



Running queries on Python using multiprocessing involves two important steps.

Used Database:

root

Establishing Database connection using Python




import psycopg2
  
def run():
    
    try:
        # establishing the connection
        conn = psycopg2.connect(database="root", user='root',
                                password='root', host='127.0.0.1'
                                port='5432')
          
        # Creating a cursor object using the
        # cursor() method
        cursor = conn.cursor()
          
        # Executing an query using the execute() method
        cursor.execute('''SELECT * FROM root''')
        print("Connection established to the database root")
          
        # Closing the connection
        conn.close()
    except:
        print("Connection not established to the database")
  
# calling the function
run()

Output:



Connection established to the database root

Running Queries using multiprocessing in Python




from multiprocessing.connection import Connection
import time,os
from multiprocessing import Pool, freeze_support
import psycopg2
  
def run():
    
    try:
        conn = psycopg2.connect(database="root", user='root',
                                password='root', host='127.0.0.1'
                                port='5432')
        cursor = conn.cursor()
        cursor.execute('''SELECT * FROM root''')
        records = cursor.fetchall()
        return records
    except:
        print("Connection not established to the database")
        return -1
  
if __name__=="__main__":
    
    freeze_support()
    print("Enter the number of times to run the above query")
    n=int(input())
    results = []
      
    with Pool(processes=os.cpu_count() - 1) as pool:
        
        for _ in range(n):
            res=pool.apply_async(run)
            results.append(res)
            res = [result.get() for result in results]
              
    print(res)
    pool.close()
    pool.join()

Output:

Running Queries using multiprocessing in Python

Conclusion

Sometimes you can speedup things by parallelizing them. It’s simple to practice of breaking the problem into small units so that it can be solved much faster. Similarly, these small units are distributed among different workers(processors) in order to solve them. This varies from laptop to PC because it depends on the number of cores mostly laptops have at least 4 and many have 8 whereas PCs have as many as 32. Here comes the problem Python is single-threaded by default it runs one core at a time but with the help of a multiprocessing package we can run as many cores as we want.


Article Tags :