
Storytelling in Data Science

Last Updated : 21 Dec, 2023

Data science primarily revolves around extracting meaningful insights from vast datasets. Data storytelling takes data analysis and adds a storytelling touch to it.

In this article, we will learn how data storytelling works in data science, how it helps to visualize data, how to make a good data story, and what the future of data storytelling looks like.

What is Storytelling?

Stories that you and I can relate to are interesting, and the same applies to data. Telling a story about data means presenting the facts in the data as a narrative, the way a story is told to someone, so that it is interesting and holds the attention of the audience. What kind of story holds an audience's attention? We discuss that in the next section.

Characteristics of a Good Story

There are three essential characteristics of a good story.

  • The story should have a flow
  • It should be interesting
  • It should give a message in the end

To create a good story, we need to keep a few points in mind. Let us look at them in detail in the next section.

How to Create Data Stories?

Here are some tips you can follow to create engaging and informative data stories that effectively communicate insights, trends, and findings from your data.

  • Know your Objective: Begin by clearly stating the goal of your data story. What do you want to convey or achieve with your data? Is it to inform, persuade, or educate your audience? Understanding your objective is crucial for crafting a compelling narrative.
  • Know your Audience: Understand who your audience is and what they are interested in. Tailor your data story to their level of expertise, interests, and needs. This will help you create a more relevant and engaging story.
  • Data Gathering and Preparation: Collect and prepare the necessary data. Make sure that the data is accurate, current, and applicable to your goal. It may be necessary to clean, transform, and analyze the data in order to derive insights from the dataset.
  • Identify Key Insights: Discover meaningful insights, trends, patterns, or anomalies within your data. These findings will form the foundation of your data narrative. Use data visualization tools and statistical analysis techniques to reveal them.
  • Set a Narrative: Structure your data story in a proper flow, with a clear beginning, middle, and end. Begin by introducing the topic and setting the stage. Then, in the middle, present the main insights and findings. Finally, end with a summary and discuss the implications. Engage your audience by using storytelling techniques.
  • Focus on Data Visualization: Select the appropriate data visualization techniques to convey your insights efficiently. Use charts, graphs, maps, and infographics to make your data visually appealing and easy to understand. Ensure that each visualization aligns with your narrative.
  • Share with Others: Consider distributing your data story through suitable channels such as reports, presentations, websites, or social media platforms. Choose the most suitable format for your audience, whether it is a PDF report, a live presentation, or an interactive online dashboard.
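The insight and narrative steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the `monthly_sales` figures and the `build_data_story` helper are hypothetical names and values invented for this example, not part of any library or real dataset.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented purely for illustration.
monthly_sales = {"Jan": 120, "Feb": 135, "Mar": 150, "Apr": 110, "May": 170}


def build_data_story(title, series):
    """Return a three-part story (beginning, middle, end) for a labelled series."""
    labels = list(series)
    values = list(series.values())

    # Identify key insights: the highest point, the lowest point, and the overall change.
    best = max(series, key=series.get)
    worst = min(series, key=series.get)
    change = values[-1] - values[0]
    direction = "rose" if change > 0 else "fell" if change < 0 else "stayed flat"

    # Set a narrative: introduce the topic, present the findings, then summarize.
    beginning = f"{title}: {len(values)} periods from {labels[0]} to {labels[-1]}."
    middle = (
        f"The average was {mean(values):.1f}; the peak was {best} ({series[best]}) "
        f"and the low was {worst} ({series[worst]})."
    )
    end = f"Overall, the value {direction} by {abs(change)} between {labels[0]} and {labels[-1]}."
    return [beginning, middle, end]


story = build_data_story("Monthly sales", monthly_sales)
for line in story:
    print(line)
```

A real data story would add visuals and audience-specific context, but the same beginning, middle, and end structure applies.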

Components of Data Storytelling

Data storytelling consists of three primary elements: data, visuals, and narrative. Let’s delve deeper into each of them below.

  1. Data: Data storytellers collect and preprocess the data needed to tell a story. They conduct statistical analysis and visually represent significant trends and patterns for comprehensive data examination.
  2. Narrative: The narrative is the storyline, spoken or written, that explains what the data shows, why it matters, and what the audience should take away from it.
  3. Visuals: A picture carries significant meaning. Adding visuals enhances the storytelling and makes the data more impactful. Visuals can include graphs, images, or videos.

Types of Data

In today’s world, tons of data are generated and consumed every second, which makes the data difficult to categorize. However, it can broadly be classified as follows.

Data is mainly classified into two categories:

  • Qualitative Data
  • Quantitative Data

1. Quantitative Data

This type of data answers questions like how much, how many, or how many times. It is the representation of data through numerical figures. For a better understanding, have a look at the following examples.

  1. The Burj Khalifa is the tallest building in the world, with a total height of 2,717 ft.
  2. Mukesh drives his car at a speed of 90 km/hr.
  3. Today’s temperature in Shimla was recorded as very low (3°C) by the weather department.
  4. Raghav scored the highest marks (97/100) in his chemistry unit test.
  5. Sunil was declared overweight by the doctor, as his weight turned out to be 95 kg.

These are all real-life examples. Here height, speed, temperature, marks, and weight each represent the numerical value of a quantity.

2. Qualitative Data

This type of data cannot be measured through numbers. In other words, any data that falls outside the bracket of quantitative data is termed qualitative data. Here are some examples:

  1. Aman was feeling sad today.
  2. Riya's hair is brown in colour.
  3. The children went to the zoo and took photos with the animals.
  4. Researchers documented the results of the experiment in the manual.
  5. They went to watch the movie ‘Section 375’. The genre of the movie was thriller.

Look closely at the examples above and you will not find any data that is expressed numerically. That is what qualitative data is: a collection of images, videos, case studies, emotions, genres, and so on. Basically, anything that can't be represented as a number.
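As a rough sketch of the distinction, the snippet below labels a few of the values from the examples above as quantitative or qualitative. The `data_kind` helper and the dictionary keys are hypothetical names chosen for illustration. The rule it uses (numeric means quantitative) is a simplification: a numeric code such as a postal code would be mislabelled.

```python
from numbers import Number

# Observations taken from the examples above.
observations = {
    "height_ft": 2717,
    "speed_kmph": 90,
    "temperature_c": 3,
    "mood": "sad",
    "hair_colour": "brown",
    "genre": "thriller",
}


def data_kind(value):
    """Label a value as quantitative (numeric) or qualitative (everything else)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, Number) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"


kinds = {name: data_kind(value) for name, value in observations.items()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```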


Types of Data Storytelling

Quantitative data and qualitative data are each further divided into subtypes. They are as follows:

Quantitative Data

  1. Discrete Data
  2. Continuous Data

Qualitative Data

  1. Nominal Data
  2. Ordinal Data

1) Discrete Data

This type of data represents information that can be counted and takes a limited number of separate values. The values are usually whole numbers and come from counting or categorizing. This differs from continuous data, which can hold an unlimited number of values within a specified range. Examples:

  • Number of students present in a class: You can count the exact number of students in a class, and the count is a whole number.
  • Number of bikes present in a parking lot: You can get the exact number of bikes in the parking lot, and again it is a whole number.
  • Number of cards present in a deck: The total number of cards in a deck is 52, which is a whole number.
  • Number of emails received in a day: If you are a student or a working professional, you receive emails daily. They can be counted, and the count is a whole number.

Example: The graph below shows the number of cars parked at an office at different times of day. At 8:00 am, before office hours begin, the count is 0. As the day progresses, the count rises to 5 around 10:00 am, when office hours start. Around 2:00 pm the number of cars is at its maximum, because it includes the people already present along with those whose shifts start after 12 pm. After 4:00 pm the number of cars decreases, as most people start leaving the office.
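The parking example can be sketched as a plain-text bar chart. Only the 8:00 am and 10:00 am counts are stated in the description; the remaining counts are assumed values chosen to match the described shape, and `text_bar_chart` is a hypothetical helper written for this illustration.

```python
# Cars parked at the office by time of day.
# Only the 08:00 and 10:00 counts come from the text; the rest are assumed values
# chosen to match the described shape (peak around 14:00, decline after 16:00).
cars_parked = {
    "08:00": 0,
    "10:00": 5,
    "12:00": 8,
    "14:00": 12,
    "16:00": 9,
    "18:00": 3,
}


def text_bar_chart(counts, symbol="#"):
    """Render one row per label, with one symbol per counted item."""
    rows = []
    for label, count in counts.items():
        rows.append(f"{label} | {symbol * count} {count}")
    return rows


chart = text_bar_chart(cars_parked)
print("\n".join(chart))

# Discrete data is counted in whole numbers, so the peak is an exact count.
peak_time = max(cars_parked, key=cars_parked.get)
print("Peak:", peak_time)
```

Because every count is a whole number, each bar is made of a whole number of symbols; there is no such thing as 5.5 parked cars.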

Figure: Discrete Data

2) Continuous Data

This type of data holds information that can take any value within a specific range. Unlike discrete data, which has separate and distinct values, continuous data can be divided into smaller and more precise values. Examples:

  • Height of a person: A person's height can vary within a specific range and can be accurately measured.
  • Weight of an object: The weight of an object can change continuously within a certain range, and it can be measured.
  • Temperature: Temperature can have any value within a specific range, and it can be measured with different levels of accuracy.
  • Time: Time is commonly seen as continuous because it can be divided into increasingly smaller units without any noticeable breaks between them.

Example: Consider a graph that shows the temperature of the top metro cities (Mumbai, Delhi, Ahmedabad, Bengaluru) over 10 hours. The temperature stays within the range of 20°C to 30°C, and because the line is nearly flat, there has not been much drop or increase in temperature.
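A minimal sketch of the temperature example follows. The hourly readings are hypothetical values for a single city, invented to stay within the 20°C to 30°C range mentioned above, and the 2.0°C stability threshold is an arbitrary assumption.

```python
from statistics import mean

# Hypothetical hourly temperature readings in °C for one city over 10 hours.
readings_c = [24.3, 24.6, 24.8, 25.1, 25.0, 24.9, 25.2, 25.4, 25.1, 24.7]

low, high = min(readings_c), max(readings_c)
average = mean(readings_c)
spread = high - low

# Continuous values can be reported at whatever precision the instrument allows.
print(f"Range: {low:.1f} to {high:.1f} °C (spread {spread:.1f} °C)")
print(f"Average: {average:.2f} °C")

# A small spread corresponds to the nearly flat line described in the example.
is_stable = spread < 2.0
print("Stable:", is_stable)
```

Unlike the car counts, these values are measured rather than counted, so fractional readings such as 24.3 are meaningful.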

1) Nominal Data

Nominal data is a form of categorical data that represents categories or labels without any inherent order or ranking. In nominal data, the categories are separate and do not have a specific order or numerical value assigned to them. This kind of data is qualitative in nature, and the categories are used to classify items or observations into groups. Examples:

  • Colors: Categories such as red, blue, yellow, etc., are considered nominal since they do not have any inherent order.
  • Gender: Male and female are nominal categories that represent separate groups, but there is no inherent ranking or order between them.
  • Types of fruits: Categories like watermelon, orange, banana, etc., are nominal as they represent various types of fruits without any specific order.
  • Marital Status: Marital statuses such as married, single, divorced, and widowed are considered nominal categories because they do not have any specific order.
  • Types of Car: Car brands like Hyundai, Suzuki, Honda, and others are considered nominal categories as they group cars based on their brand without any specific ranking.

Example: Consider a graph that shows the number of electronic devices of each type owned by people in a particular company. The largest number of people own a smartphone, whereas the smart TV has the lowest count.
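The device example can be sketched with `collections.Counter` from the Python standard library. The survey responses below are hypothetical, invented so that the smartphone is the most common device and the smart TV the least common, as in the description.

```python
from collections import Counter

# Hypothetical survey responses: the main device each employee owns.
responses = [
    "smartphone", "laptop", "smartphone", "tablet", "smartphone",
    "laptop", "smart TV", "smartphone", "tablet", "laptop",
]

counts = Counter(responses)

# For nominal data, counting and finding the most common category are meaningful;
# ranking the categories themselves or averaging them is not.
most_common, most_count = counts.most_common(1)[0]
least_common = min(counts, key=counts.get)

for device, count in counts.most_common():
    print(f"{device}: {count}")
print("Most common:", most_common)
print("Least common:", least_common)
```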

2) Ordinal Data

Ordinal data is a special kind of categorical data that shows categories with a natural order or ranking. Unlike nominal data, where categories have no specific order, ordinal data allows for a meaningful comparison of values based on their relative position or rank. Examples:

  • Education Levels: High School Program, Associate Program, Bachelors Program, Masters Program
  • Movie Ratings: 1 star, 2 star, 3 star, 4 star, 5 star
  • Exercise Frequency: Rarely, Occasionally, Regularly, Frequently
  • Temperature Levels: Very Cold, Cold, Moderate, Warm, Hot
  • Educational Grades: A, B, C, D, F

Example:

The graph below shows the number of people who visited a shopping store, grouped by the feedback they gave. Because satisfaction is ordinal data, a type of categorical data with a natural order, the satisfaction levels range from very dissatisfied to very satisfied.

Figure: Ordinal Data
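The following sketch shows what the natural order of ordinal data makes possible. The text only names the two ends of the scale (very dissatisfied and very satisfied); the three intermediate levels and the feedback list are assumptions made for illustration.

```python
# Satisfaction levels in their natural order, lowest to highest.
# The intermediate levels are assumed; the text only names the two extremes.
LEVELS = ["very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical customer feedback collected at the store.
feedback = [
    "satisfied", "neutral", "very satisfied", "dissatisfied",
    "satisfied", "very dissatisfied", "satisfied",
]

# Sorting by rank is meaningful for ordinal data, so a median category exists.
ordered = sorted(feedback, key=RANK.get)
median_level = ordered[len(ordered) // 2]

print(ordered)
print("Median level:", median_level)
```

Note that the ranks only express order. The gap between "neutral" and "satisfied" is not necessarily the same size as the gap between "satisfied" and "very satisfied", so averaging the ranks would be misleading.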

What Makes a Good Data Story?

Most marketers today have access to a wealth of data, allowing them to elevate storytelling. By understanding their audiences’ interests, behaviors, and motivations, they can create genuine messages that resonate with individuals on a personal level. Brands are now prioritizing the understanding of narratives that truly connect with their target audience, as many consumer purchases are driven by emotions. Consumers expect content that is not only personalized but also relevant and meaningful to them. Brands can take advantage of this by speaking their language and establishing a genuine connection.

Types of Charts for Data Visualization

Charts are crucial when working with data as they help simplify large amounts of information into a clear format. Visualizing data can reveal insights to newcomers and effectively communicate findings to those who don’t have access to the raw data. With numerous chart types available, the challenge lies in determining the most suitable one for the specific task.

  • Bar Chart: In a bar chart, values are indicated by the length of bars, each of which corresponds with a measured group. Bar charts can be oriented vertically or horizontally; vertical bar charts are sometimes called column charts. Horizontal bar charts are a good option when you have a lot of bars to plot, or when the labels on them require additional space to be legible. For example, a bar chart could show the percentage of people who are concerned about losing different kinds of data, such as credit/debit card details, passwords, IDs, personal email, and browser history.
  • Pie Chart: A circular diagram, known as a pie chart, visually represents the distribution of a categorical variable by dividing it into radial slices. Each slice corresponds to a unique categorical value, and its size, measured in both area and arc length, reflects the proportion of the total amount that each category level represents. For example, in this image, the pie chart shows the percentage of people who prefer each mode of transportation: Cycle, Two wheeler, Walk, and Bus. Of the total, 38% of people prefer going by bicycle, and 12%, the minimum, prefer going by bus.
    Figure: Types of Transportation

  • Line Chart: Line charts display variations in value over continuous measurements, like those taken over time. The upward or downward movement of the line highlights positive or negative changes, respectively. Additionally, it reveals general patterns, aiding the reader in making forecasts or projections for future results. For example, in this case the line chart shows the beverages consumed in a particular week. Three beverages are given (tea, coffee, and beer), and the line chart shows which beverage was consumed the most or the least during the week.
    Figure: Line Chart

  • Pyramid Chart: A pyramid chart is a visual way to display data in the shape of a pyramid. It is commonly used to compare hierarchical data that decreases in size as it moves up the pyramid. Pyramid charts are useful for illustrating the distribution of data across various categories or levels, with the widest part of the pyramid representing the largest value or category and the narrowest part representing the smallest value or category. They are used for showing population distribution, sales or revenue distribution, or organizational hierarchy. In the picture given below, the pyramid chart represents data on people who are active music listeners. The widest part shows the gym, whereas the narrowest part shows cooking, which means that people listen to music more often during a workout than while cooking.
    Figure: Pyramid Chart

  • Gauge Chart: A gauge chart is a chart that uses a radial scale to show data as a dial. It has a strong visual impact, and the needle on the dial clearly indicates where the value falls within the predefined scale. The scale in a gauge chart is divided into ranges to provide a more detailed representation of the data. These ranges can be color-coded to quickly determine if the value is below or above the expected or standard range.

Apart from the charts discussed above, there are various other types of charts and graphs, such as area charts, funnel charts, cone charts, histograms, and maps.
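To make the pie chart description concrete, the sketch below converts percentage shares into slice angles. The bicycle (38%) and bus (12%) shares come from the transportation example; the two-wheeler and walking shares are assumed values chosen so that the total is 100%.

```python
# Share of people preferring each mode of transport, in percent.
# The bicycle (38) and bus (12) shares come from the example above;
# the other two values are assumed so that the total is 100.
shares = {"Cycle": 38, "Two wheeler": 30, "Walk": 20, "Bus": 12}

total = sum(shares.values())
assert total == 100, "pie chart shares should cover the whole"

# Each slice's angle is its share of the full 360-degree circle.
angles = {mode: share * 360 / total for mode, share in shares.items()}

for mode, angle in angles.items():
    print(f"{mode}: {shares[mode]}% -> {angle:.1f} degrees")
```

The same proportions drive both the arc length and the area of each slice, which is why a pie chart only makes sense when the categories add up to a meaningful whole.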

Future of Data Storytelling

Several key trends and developments are expected to shape the future of data storytelling.

  • Data visualisation tools are evolving with the integration of augmented and virtual reality technologies, resulting in augmented data visualisation. This allows users to engage with data in immersive ways, making data storytelling more compelling and accessible.
  • As AI and machine learning advance, they will become increasingly important in analyzing and interpreting data. This could result in automated data storytelling, where AI generates insights and narratives from raw data, enabling quicker and more data-focused decision-making.
  • Users will be able to explore data stories in a nonlinear and personalized way as interactive storytelling tools become more advanced. This may include dynamic dashboards, interactive infographics, and adaptable storytelling that responds to user input and preferences.
  • In light of increasing concerns regarding data privacy and ethics, data storytelling must tackle these matters. Future data storytellers must be cautious of data privacy regulations and ethical considerations while presenting data to the public.

Conclusion

Data storytelling is a powerful method of effectively sharing insights, trends, and findings from data. It involves presenting data in a narrative format that is both informative and engaging for the audience. A successful data story should have a clear flow, be interesting, and convey a meaningful message. To create a compelling data story, one must first define the objective, understand the audience, gather and prepare the data, identify key insights, and structure the narrative with a clear beginning, middle, and end.
