Working Around Memory Leaks in Your Django Application

ShahNilay
6 min read · Apr 19, 2022

What is a memory leak?

A memory leak is like a virtual oil leak in your computer. It slowly drains the available memory, reducing the amount of free memory the system can use. Most memory leaks are caused by a program that unintentionally uses up increasing amounts of memory while it is running. This is typically a gradual process that gets worse as the program remains open. If the leak is bad enough, it can cause the program to crash or even make the whole computer freeze.
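To make this concrete, here is a minimal sketch (the names are hypothetical) of a pattern that is easy to write by accident and leaks in a long-running process:

# A module-level list that is appended to on every call but never trimmed.
# In a long-running process (like a web server) it grows without bound.
_request_log = []

def handle_request(payload):
    _request_log.append(payload)  # leak: entries are never evicted
    return len(payload)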

Django is built with the Python programming language, and we have the upper hand because Python ships with a built-in garbage collector (unlike C and C++, where a developer must manage memory manually with malloc and free). Still, some edge cases exist that need to be handled separately!

Basic understanding of a few concepts before jumping into memory leaks:

Tools used to research memory leaks in the application: memory_profiler (which provides the @profile decorator and the mprof command), matplotlib (for plotting memory-usage graphs), and python3-tk (the Tk backend matplotlib uses to display the plots).

These 3 packages are the main ones used for our use case; some other useful alternatives exist as well.

Local setup installation (Ubuntu system):

Install the libraries/packages below on your local machine:

sudo apt-get install python3-tk
pip install -U memory_profiler
pip install matplotlib

Add the @profile decorator (from memory_profiler import profile) above the function you want to analyze, and then run:

mprof run --multiprocess --python python manage.py runserver --noreload

To plot the graph, run:

mprof plot
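For example, a view you suspect of leaking can be decorated directly. heavy_view below is a hypothetical example, not from the original project:

# views.py -- hypothetical view decorated for memory profiling
from django.http import JsonResponse
from memory_profiler import profile

@profile
def heavy_view(request):
    # Allocate a large list so the per-line memory usage is clearly visible
    data = [i * i for i in range(10 ** 6)]
    return JsonResponse({"count": len(data)})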

Research:

First, let’s dive into some basics of OS memory-management concepts. During the runtime execution of a program, memory for variables is handled mainly using the heap and the stack. As mentioned in some of the Python documentation:

Work of Stack Memory

The allocation happens on contiguous blocks of memory. We call it stack memory allocation because the allocation happens in the function call stack. The size of the memory to be allocated is known to the compiler, and whenever a function is called, its variables get memory allocated on the stack.

It is memory that is only needed inside a particular function or method call. When a function is called, it is added to the program’s call stack. Any local memory assignments, such as variable initializations inside that function, are stored temporarily on the function call stack and deleted once the function returns, after which the call stack moves on to the next task. This allocation onto a contiguous block of memory is handled by the compiler using predefined routines, so developers do not need to worry about it.

def func():
    # All these variables get memory
    # allocated on the stack
    a = 20
    b = []
    c = ""

Work of Heap Memory

Variables that are needed outside of a method or function call, or that are shared globally across multiple functions, are stored in heap memory.

We will analyze each API using the @profile decorator and try to find whether any variables are left behind inside a function that are never used afterwards.

# This memory for 10 integers
# is allocated on the heap.
a = [0] * 10

So, during object creation, variable references are stored on the stack and the actual variable values are stored in heap memory. Reference allocation is handled by Python itself, so programmers do not need to worry about allocating and de-allocating variable references, but they sometimes need to take care of heap memory; if they don’t, it may lead to memory leak issues down the road.
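A minimal sketch of this reference-versus-value split, using only the standard library:

import sys

a = [0] * 10   # the list object itself lives on the heap
b = a          # b is a second reference to the same heap object
print(a is b)  # True: both names point at one object
# getrefcount reports one extra reference for its own argument,
# so this prints 3 in CPython: a, b, and the temporary argument
print(sys.getrefcount(a))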

An example of the default garbage collection handled by Python:

from memory_profiler import profile
import gc

@profile
def parent():
    @profile
    def my_func():
        a = [1] * (10 ** 6)
        b = [2] * (2 * 10 ** 7)
        return a

    my_func()

if __name__ == "__main__":
    parent()

Profiler Results:

Line  Mem usage   Increment  Occurrences  Line Contents
   4   17.6 MiB    17.6 MiB            1  @profile
   5                                      def parent():
   6   17.6 MiB     0.0 MiB            2      @profile
   7   17.6 MiB     0.0 MiB            1      def my_func():
   8   25.1 MiB     7.5 MiB            1          a = [1] * (10 ** 6)
   9  177.7 MiB   152.6 MiB            1          b = [2] * (2 * 10 ** 7)
  10  177.7 MiB     0.0 MiB            1          return a
  11
  12   17.6 MiB  -160.1 MiB            1      my_func()

Understanding what happened here: inside the function, two list objects are created that consume a large amount of memory, but after the function returns, all local variables are removed by the Python runtime; the memory of both a and b (7.5 + 152.6 = 160.1 MiB) is de-allocated.

So far we have seen some examples and couldn’t find anything suspicious. The next question is: where’s the issue? Why do much larger applications face memory leak problems if Python has a built-in garbage collector (of course, unlike C and C++, it’s clear we don’t need to worry about manual malloc and calloc calls)?

How can memory leaks happen?

Short answer: reference cycles. Let’s try to understand via a simple example.

from memory_profiler import profile

@profile
def parent():
    @profile
    def my_func():
        a = [1] * (10 ** 6)
        b = [2] * (2 * 10 ** 7)
        a.append(a)  # additional code change
        b.append(b)  # additional code change
        return a

    my_func()

if __name__ == "__main__":
    parent()

Profiler Results:

Line  Mem usage   Increment  Occurrences  Line Contents
   4   17.6 MiB    17.6 MiB            1  @profile
   5                                      def parent():
   6   17.6 MiB     0.0 MiB            2      @profile
   7   17.6 MiB     0.0 MiB            1      def my_func():
   8   25.1 MiB     7.5 MiB            1          a = [1] * (10 ** 6)
   9  177.7 MiB   152.6 MiB            1          b = [2] * (2 * 10 ** 7)
  10  177.8 MiB     0.1 MiB            1          a.append(a)
  11  177.8 MiB     0.0 MiB            1          b.append(b)
  12  177.8 MiB     0.0 MiB            1          return a
  13
  14  177.8 MiB     0.0 MiB            1      my_func()

Because my_func() creates objects a and b that refer to themselves, they are not automatically freed when the function returns: their reference counts never drop to zero. The memory they use is held onto until the Python cyclic garbage collector is invoked.
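Outside Django, a minimal sketch of how a self-reference keeps objects alive until the cyclic collector runs (gc.disable() here is only to make the effect observable):

import gc

def make_cycle():
    a = []
    a.append(a)  # a refers to itself, so its refcount never reaches zero

gc.disable()          # pause the cyclic collector to observe the leak
for _ in range(100):
    make_cycle()      # each call strands one self-referencing list
print(gc.collect())   # reports the number of unreachable objects it found
gc.enable()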

What’s the solution for this?

Invoke the garbage collector manually.

from memory_profiler import profile
import gc

@profile
def parent():
    @profile
    def my_func():
        a = [1] * (10 ** 6)
        b = [2] * (2 * 10 ** 7)
        a.append(a)
        b.append(b)
        return a

    @profile  # new changes
    def cleaner():  # new changes
        print(gc.collect())  # new changes

    my_func()
    cleaner()  # new changes

if __name__ == "__main__":
    parent()
    gc.collect()

Profiler Results:

Line  Mem usage   Increment  Occurrences  Line Contents
   4   17.6 MiB    17.6 MiB            1  @profile
   5                                      def parent():
   6   17.6 MiB     0.0 MiB            2      @profile
   7   17.6 MiB     0.0 MiB            1      def my_func():
   8   25.1 MiB     7.5 MiB            1          a = [1] * (10 ** 6)
   9  177.7 MiB   152.6 MiB            1          b = [2] * (2 * 10 ** 7)
  10  177.8 MiB     0.1 MiB            1          a.append(a)
  11  177.8 MiB     0.0 MiB            1          b.append(b)
  12  177.8 MiB     0.0 MiB            1          return a
  13
  14  177.8 MiB     0.0 MiB            2      @profile
  15   17.6 MiB     0.0 MiB            1      def cleaner():
  16   17.6 MiB  -160.2 MiB            1          print(gc.collect())
  17
  18  177.8 MiB     0.0 MiB            1      my_func()
  19   17.6 MiB     0.0 MiB            1      cleaner()

As you can see, the manual garbage collector call frees up the space that was supposed to be reclaimed when my_func() returned but, due to the reference cycles, had never been released.

How to apply these changes to your Django project:

Go to the WSGIHandler class (django.core.handlers.wsgi) and add a new method:

@profile
def get_custom_response(self, request):
    gc.collect()
    response = self.get_response(request)
    gc.collect()
    return response

and inside the __call__ method, replace the existing response = self.get_response(request) line with:

response = self.get_custom_response(request)

Now we will be able to analyze memory usage for any API in this project.
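Editing Django’s own source is fragile across upgrades; a sketch of an alternative is to put the same hook in a custom middleware (MemoryProfilerMiddleware is a name introduced here, not a Django built-in):

# middleware.py -- hypothetical middleware with the same gc + profile hook
import gc

from memory_profiler import profile

class MemoryProfilerMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    @profile
    def __call__(self, request):
        gc.collect()                           # collect cycles before the view runs
        response = self.get_response(request)
        gc.collect()                           # and again after the response is built
        return response

Adding this class’s dotted path to MIDDLEWARE in settings.py gives the same per-request profiling without touching Django internals.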

Conclusion

Avoid reference cycles and run the garbage collector periodically to prevent memory leaks.
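One common way to avoid a cycle in the first place is a weak reference; a minimal sketch with the standard weakref module:

import weakref

class Parent:
    def __init__(self):
        self.children = []

class Child:
    def __init__(self, parent):
        # A weak back-reference: the child does not keep the parent alive,
        # so no reference cycle is formed.
        self.parent = weakref.ref(parent)

p = Parent()
c = Child(p)
p.children.append(c)
print(c.parent() is p)  # True while p is alive; returns None once p is gone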
