Plotting Geospatial Data using GeoPandas
GeoPandas is an open-source tool that adds support for geographic data to Pandas objects. In this article, we are going to use GeoPandas and Matplotlib to plot geospatial data.
We are going to install GeoPandas, Matplotlib, NumPy and Pandas.
pip install geopandas
pip install matplotlib
pip install numpy
pip install pandas
Note: If you don’t want to install these modules locally on your computer, use a hosted notebook environment such as Google Colab.
Importing modules and dataset
We are going to import Pandas for the dataframe data structure, NumPy for some mathematical functions, GeoPandas for supporting and handling geospatial data and Matplotlib for actually plotting the maps.
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
GeoPandas gives us some sample datasets along with its installation to play around with (in releases before 1.0, through the gpd.datasets module, which was removed in 1.0). Let’s read one of them, ‘naturalearth_lowres’, into a GeoDataFrame called world.
Some of the other datasets to play with are ‘naturalearth_cities’ and ‘nybb’. Feel free to experiment with them later. We can now take world and plot it using Matplotlib.
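As a rough sketch of this step: on older GeoPandas releases the dataset is read with the commented-out line below. Because gpd.datasets is not available in GeoPandas 1.0 and later, the sketch builds a tiny stand-in GeoDataFrame with the same column names instead. The country names, rectangles and numbers are made up purely for illustration and are not real data.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# GeoPandas < 1.0 only (gpd.datasets was removed in 1.0):
# world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))

# Hypothetical stand-in with the same column names; all values are invented.
world = gpd.GeoDataFrame(
    {
        "name": ["Country A", "Country B", "Country C", "Country D"],
        "continent": ["Asia", "Asia", "Europe", "Africa"],
        "pop_est": [900_000_000, 120_000_000, 60_000_000, 200_000_000],
        "gdp_md_est": [5_000_000, 4_000_000, 3_000_000, 1_000_000],
    },
    geometry=[
        box(70, 10, 100, 40),   # box(minx, miny, maxx, maxy)
        box(105, 20, 130, 45),
        box(0, 40, 20, 55),
        box(10, -20, 40, 10),
    ],
)

print(world.head())   # a GeoDataFrame prints like a regular DataFrame
ax = world.plot()     # draws every geometry in a single default color
# Outside a notebook, call plt.show() to display the figure.
```

With the real dataset, world holds one row per country and the same call draws the world map.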
Analyse the datasets
Now, if we look at world, we see that it has a lot of fields. One of them is the GDP estimate (gdp_md_est). Before plotting it, to show how easily data can be filtered in or out in pandas, let’s filter out all continents except Asia.
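The filtering and plotting step might look like the sketch below. It uses a made-up stand-in for world (invented names, shapes and values), and the variable name asia and the "Reds" colormap are choices made for this illustration.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical stand-in for the world dataset; all values are invented.
world = gpd.GeoDataFrame(
    {
        "name": ["Country A", "Country B", "Country C", "Country D"],
        "continent": ["Asia", "Asia", "Europe", "Africa"],
        "pop_est": [900_000_000, 120_000_000, 60_000_000, 200_000_000],
        "gdp_md_est": [5_000_000, 4_000_000, 3_000_000, 1_000_000],
    },
    geometry=[box(70, 10, 100, 40), box(105, 20, 130, 45),
              box(0, 40, 20, 55), box(10, -20, 40, 10)],
)

# Ordinary pandas boolean indexing works on a GeoDataFrame.
asia = world[world["continent"] == "Asia"]

# Color each remaining country by its GDP estimate.
ax = asia.plot(column="gdp_md_est", cmap="Reds")
```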
The cmap property is used to plot the data in the specified shade. Darker shades mean higher values, while lighter shades mean lower values. Now, let’s analyse the data for the population estimate (pop_est).
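Plotting the population estimate only requires a different column name. The sketch below again runs on a made-up stand-in for world, with invented values, and relies on the default colormap.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical stand-in for the world dataset; all values are invented.
world = gpd.GeoDataFrame(
    {
        "name": ["Country A", "Country B", "Country C", "Country D"],
        "continent": ["Asia", "Asia", "Europe", "Africa"],
        "pop_est": [900_000_000, 120_000_000, 60_000_000, 200_000_000],
    },
    geometry=[box(70, 10, 100, 40), box(105, 20, 130, 45),
              box(0, 40, 20, 55), box(10, -20, 40, 10)],
)

# No cmap given, so Matplotlib's default colormap is used.
ax = world.plot(column="pop_est")
```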
The above image does not convey the data very well, so let’s change some properties to make it more comprehensible. First, let’s increase the size of the figure and create an axis for it. We then plot the world map without any data on that axis and overlay it with a plot of the data in shades of red. This way the map is clearer and darker, which makes the data easier to understand. However, this map is still a little vague and won’t tell us what the shades mean.
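A minimal sketch of this layering, assuming a 16 by 8 inch figure and a black base layer (both are illustrative choices), and using a made-up stand-in for world:

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical stand-in for the world dataset; all values are invented.
world = gpd.GeoDataFrame(
    {
        "name": ["Country A", "Country B", "Country C", "Country D"],
        "pop_est": [900_000_000, 120_000_000, 60_000_000, 200_000_000],
    },
    geometry=[box(70, 10, 100, 40), box(105, 20, 130, 45),
              box(0, 40, 20, 55), box(10, -20, 40, 10)],
)

# A larger figure with one explicit axis to draw on.
fig, ax = plt.subplots(1, figsize=(16, 8))

# Base layer: the map itself, with no data attached.
world.plot(ax=ax, color="black")

# Data layer on the same axis, shaded in red by population.
world.plot(ax=ax, column="pop_est", cmap="Reds")
```

Passing the same ax to both calls is what stacks the two layers in one figure.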
Let’s import the toolkit that allows us to make dividers within the plot. After this, we are going to plot the graph as we did before, but this time we are going to add a facecolor. The facecolor property changes the background to the color it is set to (in this case, light blue). Now we need to create a divider that holds the color box within the graph, much like dividers in HTML. We create the divider and set its properties such as size and justification.
Then we need to create the color box in the divider we created. The highest value in the color box is going to be the highest population in the dataset, and the lowest value is going to be zero.
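One way to put the divider and the color box together is sketched below, using make_axes_locatable from mpl_toolkits.axes_grid1. The stand-in data is made up, and the 2% width, 0.1 padding and right-hand placement are assumptions chosen for illustration.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
from shapely.geometry import box

# Hypothetical stand-in for the world dataset; all values are invented.
world = gpd.GeoDataFrame(
    {
        "name": ["Country A", "Country B", "Country C", "Country D"],
        "pop_est": [900_000_000, 120_000_000, 60_000_000, 200_000_000],
    },
    geometry=[box(70, 10, 100, 40), box(105, 20, 130, 45),
              box(0, 40, 20, 55), box(10, -20, 40, 10)],
)

# facecolor sets the figure background.
fig, ax = plt.subplots(1, figsize=(16, 8), facecolor="lightblue")

# Base layer without data.
world.plot(ax=ax, color="black")

# Reserve a narrow strip to the right of the map for the color box.
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="2%", pad=0.1)

# Scale runs from zero to the largest population in the data.
vmax = world["pop_est"].max()
world.plot(ax=ax, column="pop_est", cmap="Reds",
           legend=True, cax=cax, vmin=0, vmax=vmax)
```

Here legend=True asks GeoPandas to draw the color box, and cax tells it to draw it in the strip created by the divider.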
Thus, in this article we have seen how we can use GeoPandas to get geospatial data and plot it using Matplotlib. Custom datasets can be used to analyse specific data, and city-wise data can also be used. GeoPandas can also be used with OpenStreetMap, which provides very specific geospatial data (for example, the streets and hospitals in a city). The same knowledge can be extended further and used for specific statistical and data analysis.