Open In App

Prefetch_related and select_related functions in django

 In Django, select_related and prefetch_related are designed to stop the deluge of database queries that are caused by accessing related objects. In this article, we will see how it reduces the number of queries and make the program much faster. 

One uses select_related when the object that you’re going to be selecting is a single object, so OneToOneField or a ForeignKey. You use prefetch_related when you’re going to get a “set” of things, so ManyToManyFields as you stated or reverse ForeignKeys. Just to clarify what I mean by “reverse ForeignKeys”. 



Example to illustrate the concept of Prefetch_related and select_related – 

The above classification might be not so clear let’s see an example:






class ModelA(models.Model):
   pass
 
class ModelB(models.Model):
   a = models.ForeignKey(ModelA, on_delete=models.CASCADE)
 
 
# Forward ForeignKey relationship
ModelB.objects.select_related('a').all()
 
# Reverse ForeignKey relationship
ModelA.objects.prefetch_related('modelb_set').all()

select_related obtains all data at one time through multi-table join Association query and improves performance by reducing the number of database queries. It uses JOIN statements of SQL to optimize and improve performance by reducing the number of SQL queries. The latter is to solve the problem in the SQL query through a JOIN statement. However, for many-to-many relationships, it is not wise to use SQL statements to solve them, because the tables obtained by JOIN will be very long, which will lead to an increase in running time and memory occupation of SQL statements. The solution to prefetch_related() is to query each table separately and then use Python to handle their relationship!

Here are some examples :

Models.py reads as follows: 




from django.db import models
 
class Province(models.Model):
   name = models.CharField(max_length = 10)
   def __unicode__(self):
       return self.name
 
class City(models.Model):
   name = models.CharField(max_length = 5)
   province = models.ForeignKey(Province)
   def __unicode__(self):
       return self.name
 
class Person(models.Model):
   firstname = models.CharField(max_length = 10)
   lastname = models.CharField(max_length = 10)
   visitation = models.ManyToManyField(City, related_name = "visitor")
   hometown = models.ForeignKey(City, related_name = "birth")
   living = models.ForeignKey(City, related_name = "citizen")
   def __unicode__(self):
       return self.firstname + self.lastname

select_related – 

 we use the select_related() function:

>>> citys = City.objects.select_related().all()
>>> for c in citys:
...   print c.province
...

There is only one SQL query, which obviously greatly reduces the number of SQL queries:

SELECT `Optimize_city`.`id`, `Optimize_city`.`name`,
`Optimize_city`.`province_id`, `Optimize_province`.`id`, `Optimize_province`.`name`
FROM`Optimize_city`
INNER JOIN `Optimize_province` ON
(`Optimize_city`.`province_id`=`Optimize_province`.`id`);

prefetch_related – 

Here we can see that Django uses INNER JOIN. I would like to clear one thing that Optimize is a name of our app. If we want to get all the city names of Hubei, we can do this:

> hb=Province.objects.prefetch_related('city_set').get(name__iexact=u"Hubei Province")
>>> for city in hb.city_set.all():
...   city.name
...

Triggered SQL queries:

SELECT `Optimize_province`.`id`, `Optimize_province`.`name`
FROM `Optimize_province`
WHERE `Optimize_province', `name `LIKE'Hubei Province';

SELECT `Optimize_city`.`id`, `Optimize_city`.`name`, `Optimize_city`.`province_id`
FROM `Optimize_city`
WHERE `Optimize_city`.`province_id` IN (1);

As we can see, prefetch is implemented using the IN statement. In this way, when there are too many objects in QuerySet, performance problems may arise depending on the characteristics of the database.


Article Tags :