Data Science Apps Using Streamlit
  • Last Updated : 05 Sep, 2020

Data visualization is one of the most important steps of data analysis. It is the way to convey your research and findings about a dataset through interactive plots and charts. Many libraries are available for data visualization, such as matplotlib and seaborn, which allow us to draw a large variety of charts and plots, but these libraries do not offer any functionality to deploy them in the form of a web page or web app.

Streamlit is an open-source Python library that makes it easy to build beautiful custom web apps for machine learning and data science. In this post, we will build a small demo application in Streamlit, but first we need to get an idea of some important functions that we are going to use.

Important functions:

  • streamlit.title(): This function adds the title of the app.
  • streamlit.header() / streamlit.subheader(): These functions set the header/sub-header of a section. Markdown is also supported in these functions.
  • streamlit.write(): This function adds almost anything to a web app, from formatted strings to matplotlib figures, Altair charts, plotly figures, data frames, Keras models, etc.
  • streamlit.map(): This function displays maps in the web app. However, it requires latitude and longitude values, and these values must not be null/NA.
  • Common UI widgets are also available in the streamlit library, such as streamlit.button(), streamlit.checkbox(), etc.

Caching and Performance: 

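When you mark a function with the streamlit.cache() decorator, it tells Streamlit that whenever the function is called it should check three things:

  • The name of the function.
  • The actual code that makes up the body of the function.
  • The input parameters that you called the function with.

If Streamlit has seen this exact combination before, it skips running the function and returns the stored result; otherwise it runs the function and stores the result in the cache.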

streamlit.cache() has the following important arguments:

  • func: The function that we are going to cache. Streamlit hashes the function and its dependent code.
  • persist: If True, the cache is persisted on disk rather than kept only in memory.
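
To make the three checks described above concrete, here is a toy memoizer written with the standard library only. It is a simplified model for illustration, not Streamlit's actual implementation: the names simple_cache and load_rows are hypothetical, and the function body is approximated by its compiled bytecode and constants.

```python
import functools
import hashlib
import pickle


def simple_cache(func):
    """Toy cache keyed on function name, compiled body, and call arguments."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        code = func.__code__
        key_material = (
            func.__qualname__,              # 1. the name of the function
            code.co_code,                   # 2. the body (bytecode) ...
            repr(code.co_consts),           #    ... and the constants it uses
            args,                           # 3. the input parameters
            tuple(sorted(kwargs.items())),
        )
        key = hashlib.sha256(pickle.dumps(key_material)).hexdigest()
        if key not in store:
            # Cache miss: run the function once and remember the result.
            store[key] = func(*args, **kwargs)
        return store[key]

    wrapper.store = store
    return wrapper


calls = []  # records every real execution of load_rows


@simple_cache
def load_rows(nrows):
    calls.append(nrows)
    return list(range(nrows))


first = load_rows(3)   # miss: the function body runs
second = load_rows(3)  # hit: same name, body, and arguments
third = load_rows(5)   # miss: the argument changed
```

In this sketch the body of load_rows runs only twice for three calls, which is the effect the cache decorator aims for when a script is rerun with unchanged inputs.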


  • First, we need to install streamlit into our environment, which we can do using pip. For this post, we will be using UK accident data, which can be downloaded from here
pip install streamlit
  • Next, we need to import the modules and load the data. We will use the streamlit cache function to cache our data, so we don’t need to load it again when the app reruns. After that, we use the data to draw different plots and maps. Below is the full code for the file.






# import the required modules
import streamlit as st
import pandas as pd
import numpy as np
import pydeck as pdk
import plotly.express as px

# Dataset we need to import.
# Placeholder: set this to the location of the downloaded UK accident CSV.
DATA_URL = "<path to the downloaded UK accident CSV>"

# Add title and subtitle of the map.
st.title("Accidents in United Kingdom")
st.markdown("This app analyzes accident data in United Kingdom from 2012-2014")
# Here, we define the load_data function. To avoid reloading the data
# every time the script reruns, we use Streamlit's cache decorator.
# (Newer Streamlit releases replace st.cache with st.cache_data.)
@st.cache(persist=True)
def load_data(nrows):
    # parse the date and time columns into a single datetime column
    data = pd.read_csv(DATA_URL, nrows=nrows, parse_dates=[['Date', 'Time']])
    # drop N/A values in latitude and longitude so the maps do not fail
    data.dropna(subset=['Latitude', 'Longitude'], inplace=True)
    # lowercase all column names
    lowercase = lambda x: str(x).lower()
    data.rename(lowercase, axis="columns", inplace=True)
    data.rename(columns={"date_time": "date / time"}, inplace=True)
    return data

# load first 10000 rows
data = load_data(10000)
# Plot : 1
# plot a streamlit map for accident locations.
st.header("Where are the most casualties in accidents in the UK?")
# plot the slider that selects the minimum number of casualties
casualties = st.slider("Number of persons died", 1, int(data["number_of_casualties"].max()))
# show only accidents with at least that many casualties
st.map(data.query("number_of_casualties >= @casualties")[["latitude", "longitude"]].dropna(how="any"))
# Plot : 2
# plot a pydeck 3D map of the number of accidents that happen within a given hour
st.header("How many accidents occur during a given time of day?")
hour = st.slider("Hour to look at", 0, 23)
original_data = data
data = data[data['date / time'].dt.hour == hour]
st.markdown("Vehicle collisions between %i:00 and %i:00" % (hour, (hour + 1) % 24))
midpoint = (np.average(data["latitude"]), np.average(data["longitude"]))
# Layer type assumed: HexagonLayer fits the radius/extruded/elevation arguments below
st.write(pdk.Deck(
    map_style="mapbox://styles/mapbox/light-v9",
    initial_view_state={
        "latitude": midpoint[0],
        "longitude": midpoint[1],
        "zoom": 11,
        "pitch": 50,
    },
    layers=[
        pdk.Layer(
            "HexagonLayer",
            data=data[['date / time', 'latitude', 'longitude']],
            get_position=["longitude", "latitude"],
            auto_highlight=True,
            radius=100,
            extruded=True,
            pickable=True,
            elevation_scale=4,
            elevation_range=[0, 1000],
        ),
    ],
))
# Plot : 3
# plot a histogram of the minute of the hour at which accidents happen
st.subheader("Breakdown by minute between %i:00 and %i:00" % (hour, (hour + 1) % 24))
filtered = data[
    (data['date / time'].dt.hour >= hour) & (data['date / time'].dt.hour < (hour + 1))
]
hist = np.histogram(filtered['date / time'].dt.minute, bins=60, range=(0, 60))[0]
chart_data = pd.DataFrame({"minute": range(60), "Accidents": hist})
# px.bar is assumed here: the counts are already binned, so a bar chart draws the histogram
fig = px.bar(chart_data, x='minute', y='Accidents', hover_data=['minute', 'Accidents'], height=400)
st.write(fig)
# The code below uses a select box to filter accidents by road surface condition.
# The options must match the values in the road_surface_conditions column exactly.
st.header("Condition of Road at the time of Accidents")
select = st.selectbox('Weather ', ['Dry', 'Wet / Damp', 'Frost / ice', 'Snow', 'Flood (Over 3cm of water)'])
columns = ["weather_conditions", "light_conditions", "speed_limit", "number_of_casualties"]
st.write(
    original_data[original_data['road_surface_conditions'] == select][columns]
    .sort_values(by=['number_of_casualties'], ascending=False)
    .dropna(how="any")
)

# The code below uses a checkbox to show raw data
if st.checkbox("Show Raw Data", False):
    st.subheader('Raw Data')
    st.write(data)


  • Now, save this code in a Python file and run it with the streamlit run command followed by the file name. This will open a browser window and load the app. Let’s look at some screenshots from the app.



