Django (SQL/“Model”) Performance Optimization

Django makes it easy to write code that performs badly, because it executes SQL statements automatically whenever you work with models. If you don’t know what happens behind the scenes, you can easily end up executing thousands of queries.

from collections import defaultdict

def get_books_by_author():
    books = Book.objects.all()  # 1 query for all books
    result = defaultdict(list)
    for book in books:
        author = book.author  # triggers 1 extra query per book
        title_and_author = '{} by {}'.format(
            book.title,
            author.name
        )
        result[book.library_id].append(title_and_author)
    return result

This executes a separate author query for each book, which means 1 query for the books plus n queries for n books! Adding select_related() avoids that:

books = Book.objects.all().select_related('author')

This adds “author” to the Book.objects.all() query through a SQL JOIN, so only a single query is executed!
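To make the difference in query counts visible without a Django project, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the ORM. The tables, the sample rows and the run() helper are made up for illustration, and the SQL only approximates what Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'First', 1), (2, 'Second', 1), (3, 'Third', 2);
""")

query_log = []


def run(sql, params=()):
    """Hypothetical helper: execute one statement, log it, return all rows."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the books, then one author query per book.
lazy_titles = []
for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
    (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
    lazy_titles.append(f"{title} by {name}")
lazy_query_count = len(query_log)

# JOIN pattern: the author columns arrive together with the books.
query_log.clear()
joined_titles = [
    f"{title} by {name}"
    for title, name in run(
        "SELECT book.title, author.name FROM book "
        "INNER JOIN author ON author.id = book.author_id ORDER BY book.id"
    )
]
joined_query_count = len(query_log)

print(lazy_query_count, joined_query_count)
```

Both variants produce the same strings, but the first one needs one statement per book on top of the initial query, while the second needs a single statement no matter how many books exist.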

However, select_related() only works for single-valued relations: foreign keys (many-to-one) and one-to-one. Many-to-many and reverse foreign key (one-to-many) relations can be optimized with prefetch_related().
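prefetch_related() does not use a JOIN. It runs one additional query per relationship and matches the rows in Python. The following sketch imitates that idea with plain sqlite3 for "all books of each author". The schema and data are assumptions for illustration, not Django's actual generated SQL.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO book VALUES (1, 'First', 1), (2, 'Second', 1), (3, 'Third', 2);
""")

# Query 1: the parent rows.
authors = conn.execute("SELECT id, name FROM author ORDER BY id").fetchall()
author_ids = [author_id for author_id, _ in authors]

# Query 2: every related row at once, filtered with IN (...).
placeholders = ", ".join("?" for _ in author_ids)
books = conn.execute(
    f"SELECT title, author_id FROM book "
    f"WHERE author_id IN ({placeholders}) ORDER BY id",
    author_ids,
).fetchall()

# The "join" happens in Python: group the books by their author id.
books_by_author = defaultdict(list)
for title, author_id in books:
    books_by_author[author_id].append(title)

titles_per_author = {
    name: books_by_author[author_id] for author_id, name in authors
}
print(titles_per_author)
```

The number of statements stays at two regardless of how many authors there are, which is the property that makes this approach attractive for multi-valued relations.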

You can also limit the fields that are loaded for each object (like the column list of a SELECT in SQL) with .values() or .values_list(). The disadvantage is that you then work with dictionaries (.values()) or tuples (.values_list()) instead of the model instances defined in your source code.

bulk_create() inserts all the models in a list with (usually) a single query, instead of running a separate query for each model in a loop.
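At the SQL level, the idea is roughly the difference between many single-row INSERT statements and one multi-row INSERT. Here is a small sketch with sqlite3. The table and the rows are invented for illustration, and Django may additionally split large lists into batches (see the batch_size argument of bulk_create()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)"
)

new_books = [("First", 100), ("Second", 200), ("Third", 50)]

# Loop variant: one INSERT statement per row.
loop_statements = 0
for title, pages in new_books:
    conn.execute(
        "INSERT INTO book (title, pages) VALUES (?, ?)", (title, pages)
    )
    loop_statements += 1

# Bulk variant: a single INSERT statement that carries every row.
row_placeholders = ", ".join(["(?, ?)"] * len(new_books))
flat_params = [value for row in new_books for value in row]
conn.execute(
    f"INSERT INTO book (title, pages) VALUES {row_placeholders}", flat_params
)
bulk_statements = 1

total_rows = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(loop_statements, bulk_statements, total_rows)
```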

Do the work in SQL instead of Python where you can; it is generally faster. .annotate() will help you with that.

from django.db.models import Sum

result = {}
libraries = (
    Library.objects
    .all()
    .annotate(page_count=Sum('books__pages'))  # summed by the database
    .values_list('id', 'page_count')
)
for library_id, page_count in libraries:
    result[library_id] = page_count
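For orientation, an annotate() with Sum over a related model corresponds roughly to a JOIN with GROUP BY in SQL. The sketch below runs such a statement directly with sqlite3. The schema and sample data are assumptions, and the exact SQL that Django emits may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE library (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY, title TEXT, pages INTEGER, library_id INTEGER
    );
    INSERT INTO library VALUES (1, 'Central'), (2, 'East'), (3, 'Empty');
    INSERT INTO book VALUES
        (1, 'First', 100, 1), (2, 'Second', 200, 1), (3, 'Third', 50, 2);
""")

# The database sums the pages per library; Python only receives the totals.
rows = conn.execute(
    "SELECT library.id, SUM(book.pages) AS page_count "
    "FROM library LEFT OUTER JOIN book ON book.library_id = library.id "
    "GROUP BY library.id ORDER BY library.id"
).fetchall()

result = dict(rows)
print(result)
```

A library without any books gets None as its sum here, so code that consumes the totals should be prepared for a missing value.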

I recommend taking a look at the Django documentation for QuerySets.

Source: https://medium.com/@ryleysill93/basic-performance-optimization-in-django-ebd19089a33f