How to hide sensitive credentials using Python
Have you ever worked on a Python project where you needed to share your code with someone, or hosted your code in a public repository, but didn’t want to share the sensitive credentials so that they couldn’t be exploited by a random user?
For example, suppose you are making a web app in Django, which has the concept of a ‘SECRET_KEY’: a randomly generated unique key used for cryptographic signing of data. As the name suggests, it should not be shared publicly, because exposing it defeats many of Django’s security protections. Or maybe you are using cloud storage, say AWS S3: you need to store the access token where the code can use it while also preventing unauthorized users from misusing the credentials. How can we do both? In such cases, we need to avoid hardcoding the ‘key’ (essentially the variables holding our credentials) into our code, so that it is not exposed in our public repository.
Method 1: Import from another Python file
One of the simplest methods is to save the credentials in a separate Python file, say secrets.py, and import them into the file that needs them. The secrets.py file must be added to .gitignore so that it is never committed. There are essentially two ways to store the credentials: as plain Python variables or, preferably, in a dictionary. A dictionary is preferred because accessing a non-existent variable raises an error, whereas a dictionary lookup through get() can return a default value instead. Note that a local file named secrets.py shadows the standard library’s secrets module (available since Python 3.6), so a different file name may be safer if the project or its dependencies use that module.
We have saved the following credentials in a dictionary in secrets.py:
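A minimal sketch of what such a file could look like. The key names and values here are placeholders chosen for illustration, not real credentials:

```python
# secrets.py  (this file must be listed in .gitignore)
# All keys and values below are hypothetical placeholders.
secrets = {
    "SECRET_KEY": "replace-with-your-secret-key",
    "DEBUG": True,       # stays a real boolean, no conversion needed
    "EMAIL_PORT": 587,   # stays a real integer
}
```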
Now import the credentials in the required file, main.py.
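The import could look roughly like the sketch below. The key names are the same hypothetical placeholders, and the fallback branch exists only so the snippet runs on its own when no local secrets.py is present (in that case the import resolves to the standard library module, which has no `secrets` attribute):

```python
# main.py
try:
    # Picks up the dictionary from the local secrets.py next to this file.
    from secrets import secrets
except ImportError:
    # Fallback for running this sketch standalone; placeholder values only.
    secrets = {
        "SECRET_KEY": "replace-with-your-secret-key",
        "DEBUG": True,
        "EMAIL_PORT": 587,
    }

secret_key = secrets.get("SECRET_KEY")
debug = secrets.get("DEBUG", False)

# A missing key does not raise; get() returns the supplied default instead.
api_token = secrets.get("API_TOKEN", "not-set")
```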
This works, and we don’t need to worry about converting boolean and integer values from strings (the later methods show why this matters). It isn’t the recommended approach, however, because the file name and dictionary name can vary between projects, so it doesn’t form a standard solution. More importantly, the approach is restricted to Python: in a more realistic scenario we could be working with multiple languages that need the same credentials, and storing them in a way that only one language can read isn’t ideal. A better approach is to use environment variables.
Method 2: Using Environment variables
We can store the credentials as environment variables. Environment variables are key-value pairs set through the operating system, and any programming language can read them because they belong to the environment rather than to the code. Since the credentials live in the environment, they are not exposed in our code: someone who has access to the code will not have the credentials set in their own environment. We can also set different values for production and local environments, such as a different mailing service during development, without changing the code.
Many hosting providers, such as Heroku and Netlify, provide an easy way to set environment variables. In Python we can access them using os.environ, which works much like a normal Python dictionary. os.environ returns only string values, so we need to cast every non-string value manually. Assuming the same credentials as above are set as environment variables, each one is read as a string and converted where needed.
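A minimal sketch of reading and casting the values. The variable names are hypothetical, and they are assigned inside the script only so the example runs on its own; normally they would be set outside the program, in the shell or the hosting provider's dashboard:

```python
import os

# Assigned here only for demonstration; placeholder values.
os.environ["SECRET_KEY"] = "replace-with-your-secret-key"
os.environ["DEBUG"] = "True"
os.environ["EMAIL_PORT"] = "587"

# Indexing raises KeyError if the variable is not set.
secret_key = os.environ["SECRET_KEY"]

# Every value is a string, so booleans need explicit parsing.
# Note that bool("False") is True, so bool() alone is not enough.
debug = os.environ.get("DEBUG", "False").lower() in ("true", "1", "yes")

# Integers need an explicit int() conversion.
email_port = int(os.environ.get("EMAIL_PORT", "25"))
```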
Also, environment variables set in a shell during development persist only for that session, so we need to set them again every time before running the project. To make the process simpler, there is a handy package called python-decouple. It loads the credentials either from the environment or from an external .env or .ini file.
We store the credentials in a .env or settings.ini file and add it to .gitignore. python-decouple searches for options in this order:
- Environment variables.
- settings.ini or .env file.
- Default value passed during the call.
If the environment variable is already set, its value is returned; otherwise the package tries to read the value from the file, and if it is not found there either, it falls back to the default. So it can read from the environment in production and from the file in development.
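The search order above can be illustrated with a small standard-library sketch. This is not python-decouple's actual implementation; the `lookup` function, its parameters, and the key names are hypothetical and exist only to make the precedence concrete:

```python
import os

_MISSING = object()  # sentinel so that None can be a valid default


def lookup(name, file_values, default=_MISSING):
    """Return a setting using the order: environment, file, default."""
    if name in os.environ:          # 1. environment variable wins
        return os.environ[name]
    if name in file_values:         # 2. then the value read from the file
        return file_values[name]
    if default is not _MISSING:     # 3. then the default, if one was given
        return default
    raise KeyError(f"{name} not found and no default given")


# Stand-in for values parsed from a .env file (hypothetical contents).
file_values = {"DEMO_MAIL_HOST": "localhost", "DEMO_REGION": "from-file"}

os.environ["DEMO_REGION"] = "from-env"

region = lookup("DEMO_REGION", file_values)                  # environment value
mail_host = lookup("DEMO_MAIL_HOST", file_values)            # file value
timeout = lookup("DEMO_TIMEOUT", file_values, default="30")  # default value
```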
Follow the simple three-step process below:
Step 1: Install python-decouple using pip.
pip install python-decouple
Step 2: Store your credentials separately.
First, we need to ‘decouple’ our credentials from our code repository into a separate file. If you are using a version control system such as Git, make sure to add this file to .gitignore. The file should be in one of the following forms and saved at the repository’s root directory:
- settings.ini
- .env (note that the file name consists only of the extension)
These are popular file formats for saving a project’s configuration. The files follow the syntax:
KEY=YOUR_KEY (without any quotes)
By convention, key names are written in all caps. In a settings.ini file, the entries are placed under a [settings] section header.
Here we are using .env format with the following credentials saved:
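A sketch of what the .env file could contain. The key names and values are hypothetical placeholders, not real credentials:

```text
SECRET_KEY=replace-with-your-secret-key
DEBUG=True
EMAIL_PORT=587
```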
Now add this file to .gitignore so that it is never committed to the repository.
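Assuming the credentials file sits at the repository root under one of the two names mentioned earlier, the relevant .gitignore entries might look like this:

```text
.env
settings.ini
```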
Step 3: Load your credentials securely.
In the file that needs the credentials, import the config object from decouple and look up each key by name. We can cast values by passing the corresponding type as the “cast” parameter. If we do not specify a default value and the config object cannot find the given key, it raises an “UndefinedValueError”.