How to write Pandas DataFrame to PostgreSQL table?

In this article, we will be looking at some methods to write Pandas dataframes to PostgreSQL tables in the Python.

Method 1: Using to_sql() function

to_sql function is used to write the given dataframe to a SQL database.

Syntax

df.to_sql(‘data’, con=conn, if_exists=’replace’, index=False)

Parameters :

data: name of the table.
con: connection to the database.
if_exists: if table exists or not. “replace” or “append”.
index: True or False.

Example:

In the example demonstrated below, we import the required packages and modules, establish a connection to the PostgreSQL database and convert the dataframe to PostgreSQL table by using the to_sql() method. Finally, all rows are fetched using fetchall() method.

To access the CSV file used click here.

Python3

# import packages 

import psycopg2 

import pandas as pd 

from sqlalchemy import create_engine 

# establish connections 

conn_string = 'postgres://postgres:pass@127.0.0.1/Airlines_Database'

db = create_engine(conn_string) 

conn = db.connect() 

conn1 = psycopg2.connect( 

    database="Airlines_Database", 

  user='postgres',  

  password='pass',  

  host='127.0.0.1',  

  port= '5432'
) 

conn1.autocommit = True

cursor = conn1.cursor() 

# drop table if it already exists 

cursor.execute('drop table if exists airlines_final') 

sql = '''CREATE TABLE airlines_final(id int , 
day int ,airline char(20),destination char(20));'''

cursor.execute(sql) 

# import the csv file to create a dataframe 

data = pd.read_csv("airlines_final.csv") 

data = data[["id","day","airline","destination"]] 
# Create DataFrame 

print(data) 

# converting data to sql 

data.to_sql('airlines_final', conn, if_exists= 'replace') 

# fetching all rows 

sql1='''select * from airlines_final;'''
cursor.execute(sql1) 

for i in cursor.fetchall(): 

    print(i) 

conn1.commit() 
conn1.close()

Output:

Method 2: Using execute_values() function

The execute_values() function from the psycopg2 library is used to get the postgres table of the given data frame.

Syntax:

psycopg2.extras.execute_values(cur, sql, argslist, template=None, page_size=100, fetch=False)

Parameters:

cur – the cursor that will be used to run the query.
sql – the query that will be run. A single percent s placeholder must be present, which will be substituted by a VALUES list.
argslist — a list of series or dictionaries containing the query’s arguments. The template must be followed in terms of font and content.
template – the snippet that will be merged into each item in the argslist to form the query
page size – the maximum amount of argslist items that each statement can have.
fetch – it’s similar to fetchall. the values can be “True” or “False”

Example:

In this example, the same CSV file is used in this method. code begins with importing packages, then we form a custom function execute_values, where the given dataframe, connection, and table name are given as arguments. The dataframe rows and values are updated into the PostgreSQL table using the execute_values() method. The defined method contains an exception handling block, if there’s no exception “execute_values() done” is printed.

Python3

# import packages 

import psycopg2 

import psycopg2.extras as extras 

import pandas as pd 

def execute_values(conn, df, table): 

    tuples = [tuple(x) for x in df.to_numpy()] 

    cols = ','.join(list(df.columns)) 

    # SQL query to execute 

    query = "INSERT INTO %s(%s) VALUES %%s" % (table, cols) 

    cursor = conn.cursor() 

    try: 

        extras.execute_values(cursor, query, tuples) 

        conn.commit() 

    except (Exception, psycopg2.DatabaseError) as error: 

        print("Error: %s" % error) 

        conn.rollback() 

        cursor.close() 

        return 1

    print("execute_values() done") 

    cursor.close() 

# establishing connection 

conn = psycopg2.connect( 

    database="Airlines_Database", 

    user='postgres', 

    password='sherlockedisi', 

    host='127.0.0.1', 

    port='5432'
) 

sql = '''CREATE TABLE airlines_final1(id int ,day 
char(20) ,airline char(20),destination char(20));'''

# creating a cursor 

cursor = conn.cursor() 
cursor.execute(sql) 

data = pd.read_csv("airlines_final.csv") 

data = data[["id", "day", "airline", "destination"]] 

# using the function defined 

execute_values(conn, data, 'airlines_final1')

Output:

execute_values() done

Article Tags :

Python

Python pandas-dataFrame

Python PostgreSQL

Python-pandas