Open In App

SQLAlchemy: How to group by two fields and filter by date

Last Updated : 28 Mar, 2022
Improve
Improve
Like Article
Like
Save
Share
Report

In this article, we will see how to group records by two fields and filter by date using SQLAlchemy in Python.

Since we are going to use MySQL in this article, we will also install a SQL connector for MySQL in Python. However, none of the code implementations changes with change in the database except for the SQL connectors.

pip install pymysql

We will use the sample sakila database from MySQL. In this, we will extract a few fields from the payment table within the sakila database by grouping them using `customer_id` and `rental_id` and filtering the query on the  `payment_date` field. If you do not have the sakila database and want to follow along with this article without installing it then use the SQL script present in the link mentioned below to create the required schema and payment table along with the records.

Sakila Payment Table Script

For reference, the first ten records in the payment table look like –

The SQL query which we are looking at in the below two examples is –

SELECT 
    customer_id,
    rental_id,
    amount,
    payment_date
FROM payment 
WHERE payment_date > '2005-05-25'
GROUP BY customer_id, rental_id;

Method 1: Using SQLAlchemy Core to group by two fields and filter by date:

SQLAlchemy Core is a schema-centric model that means everything is treated as a part of the database i.e., rows, columns, tables, etc. In the below example, we have created the metadata object to access the database objects like the table. Using this object we get the metadata of the `payment` table. This metadata information is then used to query to the table using the SQLAlchemy syntax mentioned below.

Syntax: sqlalchemy.engine.base.Engine(sqlalchemy.sql.selectable.Select(*entities).filter(*criterion).group_by(*clauses)).all()

Python3




import sqlalchemy as db
  
# DEFINE THE ENGINE (CONNECTION OBJECT)
engine = db.create_engine("mysql+pymysql://root:password@localhost/sakila")
  
# CREATE THE METADATA OBJECT TO ACCESS THE TABLE
meta_data = db.MetaData(bind=engine)
db.MetaData.reflect(meta_data)
  
# GET THE `payment` TABLE FROM THE METADATA OBJECT
payment = meta_data.tables['payment']
  
# PREPARING QUERY
query = db.select(
    payment.c.customer_id,
    payment.c.rental_id,
    payment.c.amount,
    payment.c.payment_date
).filter(payment.c.payment_date > '2005-05-25') \
    .group_by(payment.c.customer_id, payment.c.rental_id)
  
# EXTRACT THE RECORDS USING THE QUERY AND ENGINE
with engine.connect() as conn:
    result = conn.execute(query).all()
  
# PRINT FIRST 10 RECORDS
for i in range(10):
    r = result[i]
    print(r.customer_id, "|", r.rental_id, "|", r.amount, "|", r.payment_date)


Output:

Method 2: Using SQLAlchemy ORM to group by two fields and filter by date:

SQLAlchemy Core uses an object-centric view that encapsulates the schema with business objects. It is a more pythonic implementation as we can see the tables represented in the class format. We have used this class object to query the payment table using the SQLAlchemy syntax mentioned below.

Syntax: sqlalchemy.orm.Query(*entities).filter(*criterion).group_by(*clauses).all()

Python3




from sqlalchemy.orm import sessionmaker
import sqlalchemy as db
from sqlalchemy.ext.declarative import declarative_base
  
Base = declarative_base()
  
# DEFINE THE ENGINE (CONNECTIO OBJECT)
engine = db.create_engine("mysql+pymysql://root:password@localhost/sakila")
  
# AUTOMATICALLY MAP THE TABLE MODEL TO USE IT FOR QUERYING
class Payment(Base):
  
    __table__ = db.Table("payment", Base.metadata, autoload_with=engine)
  
# CREATE A SESSION OBJECT TO INITIATE QUERY IN DATABASE
Session = sessionmaker(bind=engine)
session = Session()
  
# PREPARING QUERY USING SQLALCHEMY
result = session.query(
    Payment.customer_id,
    Payment.rental_id,
    Payment.amount,
    Payment.payment_date
).filter(Payment.payment_date > '2005-05-25') \
    .group_by(Payment.customer_id, Payment.rental_id) \
    .all()
  
# PRINT FIRST 10 RECORDS
for i in range(10):
    r = result[i]
    print(r.customer_id, "|", r.rental_id, "|", r.amount, "|", r.payment_date)


Output:



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads