Create custom datatypes using Pydantic module in Python
Last Updated: 16 Aug, 2021
Functions often need a long list of arguments, and spelling all of them out in the function signature gets messy. Validation adds to the problem: checking every argument inside the main function body is tedious and is not good practice. A cleaner approach is to group related variables into separate classes. This article demonstrates how to use pydantic to create such models along with your own custom validations. First, let's discuss the use case.
Suppose we receive some data from an API call and need to analyze it. An API typically sends its response as JSON, so our models should be able to serialize and deserialize JSON (1).
We also assume types for certain variables. For example, when handling an address, we expect the pincode to be an integer. This is type checking (2).
To perform the analysis, we make some assumptions about the data, for example that the pincode matches the district name provided. This is validation (3).
We might also require that certain fields, such as the state, take a value from a fixed list (say, the states of India) rather than any arbitrary value. This falls under cleaning (4).
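Before building the full model, requirements (1), (2) and (4) can be sketched on a much smaller scale. The example below is only an illustration: the names State and MiniAddress are hypothetical and do not appear in the article's code. It goes through the standard json module for the JSON step, because pydantic's own JSON helper methods are named differently in versions 1 and 2.

```python
import json
from enum import Enum

from pydantic import BaseModel, ValidationError


class State(str, Enum):  # hypothetical, shortened list of allowed states
    UTTAR_PRADESH = "UP"
    WEST_BENGAL = "WB"


class MiniAddress(BaseModel):  # hypothetical model, for illustration only
    pincode: int
    state: State


# (1) Deserialize: parse the JSON text, then let the model validate the dict.
raw = '{"pincode": "201305", "state": "UP"}'
address = MiniAddress(**json.loads(raw))

# (2) Type checking: the numeric string was converted to an int.
print(type(address.pincode).__name__, address.pincode)

# (4) Cleaning: a state outside the enum is rejected.
try:
    MiniAddress(pincode=201305, state="XYZ")
    rejected = False
except ValidationError:
    rejected = True
print("unknown state rejected:", rejected)

# (1) Serialize: write the validated values back out as JSON text.
as_json = json.dumps({"pincode": address.pincode, "state": address.state.value})
print(as_json)
```

Requirement (3), checking one field against another, needs a custom validator and is covered by the full model.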
With these four requirements in mind, let's start coding our model. We assume that Python is already installed on your system. To install pydantic, simply run:
pip install pydantic
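With that set, create a file called models.py and paste the code below into it. The inline comments in the code explain each step. Note that the code uses the validator and root_validator decorators of Pydantic 1.x; Pydantic 2 still accepts them but marks them as deprecated in favor of field_validator and model_validator.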
Python3
from enum import Enum
from typing import Optional

from pydantic import BaseModel, PositiveInt, validator, root_validator, constr


# Allowed state codes; any other value is rejected during validation.
class StateTypes(str, Enum):
    DELHI = "DLH"
    UTTAR_PRADESH = "UP"
    BENGALURU = "BLR"
    WEST_BENGAL = "WB"


class PersonalDetails(BaseModel):
    id: int
    # constr restricts the length of the string.
    name: constr(min_length=2, max_length=15)
    phone: PositiveInt

    # Field-level validator: the phone number must have exactly ten digits.
    @validator("phone")
    def phone_length(cls, v):
        if len(str(v)) != 10:
            raise ValueError("Phone number must be of ten digits")
        return v


class Address(BaseModel):
    id: int
    address_line_1: constr(max_length=50)
    # Optional field; defaults to None when it is not supplied.
    address_line_2: Optional[constr(max_length=50)] = None
    pincode: PositiveInt
    city: constr(max_length=30)
    # Must be one of the values defined in StateTypes.
    state: StateTypes

    # Field-level validator: the pincode must have exactly six digits.
    @validator("pincode")
    def pincode_length(cls, v):
        if len(str(v)) != 6:
            raise ValueError("Pincode must be of six digits")
        return v


# Nested model that combines the two models above.
class User(BaseModel):
    personal_details: PersonalDetails
    address: Address

    # Model-level validator: runs only if every field validated successfully
    # and checks that both nested models carry the same id.
    @root_validator(skip_on_failure=True)
    def check_id(cls, values):
        personal_details: PersonalDetails = values.get("personal_details")
        address: Address = values.get("address")
        if personal_details.id != address.id:
            raise ValueError(
                "ID field of both personal_details as well as address should match"
            )
        return values


if __name__ == "__main__":
    # Both ids are 1, so this data passes every check.
    validated_data = {
        "personal_details": {
            "id": 1,
            "name": "GeeksforGeeks",
            "phone": 9999999999,
        },
        "address": {
            "id": 1,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    user = User(**validated_data)
    print(user)

    # The ids differ (1 and 2), so check_id rejects this data.
    unvalidated_data = {
        "personal_details": {
            "id": 1,
            "name": "GeeksforGeeks",
            "phone": 9999999999,
        },
        "address": {
            "id": 2,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    user = User(**unvalidated_data)  # raises ValidationError
    print(user)
Output:
When you run this, the first User is created and printed successfully. The second initialization of the User model raises a ValidationError (a subclass of ValueError) because the IDs in personal_details and address do not match, so the second print statement is never reached.
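In practice, the failing call is usually wrapped in a try/except block so the program can report the problem instead of stopping. The sketch below shows one way to do that. The Item model is hypothetical and kept deliberately small; the exact wording of the messages depends on the installed pydantic version, so none is shown here.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    id: int
    quantity: int


try:
    # Neither value can be converted to an int, so both fields fail.
    Item(id="not-a-number", quantity="also-bad")
except ValidationError as exc:
    # ValidationError can also be caught as a plain ValueError.
    is_value_error = isinstance(exc, ValueError)
    # errors() returns one dict per failed check; "loc" names the field.
    problems = exc.errors()
    for problem in problems:
        print(problem["loc"], problem["msg"])
```

All failing fields are reported together in a single exception, which is convenient when validating a whole API response at once.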