Project Idea | Analysis of Emergency 911 calls using Association Rule Mining

Analysing a dataset of emergency calls and discovering hidden trends and patterns can help ensure that emergency response teams are better equipped to deal with emergencies.
For incidents such as road accidents and fires, high numbers in specific areas indicate a high demand for ambulance services in those areas. Frequent road accidents in an area might point to road conditions that need to be improved. A high frequency of emergencies due to respiratory problems might point to harmful pollutants in the air in that area. Association rule mining can help in discovering such patterns.

Proposed System
Pre-processing the dataset -> Association Rule Mining -> Extracting interesting patterns from the rules obtained -> Validation of the rules

Dataset Used
The dataset for analysis has been obtained from Kaggle. It contains Emergency 911 calls in Montgomery County, located in the Commonwealth of Pennsylvania. The attributes chosen are the type of emergency, the time stamp, and the township where the emergency occurred.

Rows with missing values are eliminated. Numerical values have to be converted to categorical values for Association Rule Mining. Therefore, the time stamp is converted to day of the week, month and time of day (Morning, Afternoon, Evening, Night).
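The pre-processing step above can be sketched in a few lines. This is only an illustration in Python, not the R code used for the project. The column names (type, timestamp, township), the timestamp format and the hour boundaries for each time of day are assumptions, since the article does not state them.

```python
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def time_of_day(hour):
    """Map an hour (0-23) to a category. The boundaries are assumed."""
    if 6 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    if 17 <= hour < 21:
        return "Evening"
    return "Night"

def timestamp_to_items(timestamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a timestamp string into day, month and time-of-day items."""
    moment = datetime.strptime(timestamp, fmt)
    return [DAYS[moment.weekday()],
            MONTHS[moment.month - 1],
            time_of_day(moment.hour)]

def to_transaction(row):
    """Build one transaction; return None when any field is missing."""
    required = ("type", "timestamp", "township")  # hypothetical column names
    if any(not row.get(key) for key in required):
        return None
    return [row["type"], row["township"]] + timestamp_to_items(row["timestamp"])

# Two made-up records: the second has a missing time stamp and is dropped.
rows = [
    {"type": "Traffic: VEHICLE ACCIDENT",
     "timestamp": "2015-12-10 17:40:00",
     "township": "NORRISTOWN"},
    {"type": "EMS: FEVER", "timestamp": "", "township": "ABINGTON"},
]
transactions = [t for t in map(to_transaction, rows) if t is not None]
print(transactions)
```

Each surviving row becomes a list of categorical items, which is the form an association rule miner expects.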

Association Rule Mining
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules in databases using measures of interestingness such as support and confidence.
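As a small illustration of the two measures, support is the fraction of transactions that contain an item set, and the confidence of a rule A => B is the support of A and B together divided by the support of A. The sketch below is plain Python rather than the arules package, and the four transactions are made up for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(transactions, set(antecedent) | set(consequent)) / base

# Made-up transactions, only to show the arithmetic.
transactions = [
    {"December", "Morning", "Traffic: VEHICLE ACCIDENT"},
    {"December", "Evening", "Traffic: VEHICLE ACCIDENT"},
    {"December", "Night", "EMS: FEVER"},
    {"July", "Afternoon", "EMS: HEAT EXHAUSTION"},
]

rule_support = support(transactions, {"December", "Traffic: VEHICLE ACCIDENT"})
rule_confidence = confidence(transactions, {"December"},
                             {"Traffic: VEHICLE ACCIDENT"})
print(rule_support, rule_confidence)
```

Here the rule {December} => {Traffic: VEHICLE ACCIDENT} holds in 2 of 4 transactions (support 0.5) and in 2 of the 3 December transactions (confidence 2/3).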

Choosing a suitable threshold for support and confidence:
Set the minimum support higher for a very small database and lower for a very large database. A higher minimum support on a small database ensures that the item sets found are significant. A lower minimum support on a large database ensures that enough item sets are found.
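The effect of the support threshold can be seen with a minimal level-wise (Apriori-style) search for frequent item sets. This is a sketch in plain Python under simplifying assumptions, not the implementation inside arules, and the transactions are made up.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all item sets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / total

    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    size = 1
    while candidates:
        # Keep only the candidates that are frequent at this size.
        kept = {}
        for candidate in candidates:
            value = support(candidate)
            if value >= min_support:
                kept[candidate] = value
        result.update(kept)
        size += 1
        # Join frequent sets into candidates that are one item larger.
        joined = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
        # Prune candidates that have an infrequent subset.
        candidates = [
            c for c in joined
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        ]
    return result

# Made-up transactions, only to compare two thresholds.
transactions = [
    {"December", "Evening", "Traffic: VEHICLE ACCIDENT"},
    {"December", "Morning", "Traffic: VEHICLE ACCIDENT"},
    {"December", "Night", "EMS: FEVER"},
    {"July", "Afternoon", "EMS: HEAT EXHAUSTION"},
]

strict = frequent_itemsets(transactions, min_support=0.5)
loose = frequent_itemsets(transactions, min_support=0.25)
print(len(strict), len(loose))
```

With the higher threshold only a few item sets survive, while the lower threshold admits every combination that occurs even once, most of which are noise on a data set this small.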

Lift: The lift of a rule is its confidence divided by the support of the consequent. A lift of 1 implies that the antecedent and the consequent occur independently of each other, and when two events are independent, no useful rule can be drawn involving them. A lift greater than 1 indicates the degree to which the two occurrences are dependent on one another, and makes the rule potentially useful for predicting the consequent in future data sets. A rule with high confidence but low lift may intuitively seem valuable, because its high confidence makes it look accurate. That accuracy can be misleading: when the consequent is common in the whole data set, confidence is high whatever the antecedent is. The value of lift is that it considers both the confidence of the rule and the overall data set.
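A small worked example shows how a rule can have high confidence and still have a lift below 1. The sketch is plain Python, and the items A and B and their counts are invented purely to show the arithmetic.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

def lift(transactions, antecedent, consequent):
    """confidence(antecedent => consequent) / support(consequent)."""
    base = support(transactions, consequent)
    if base == 0:
        return 0.0
    return confidence(transactions, antecedent, consequent) / base

# Invented counts: B is very common (9 of 10 transactions),
# A appears in 5 transactions, and A and B appear together in 4.
transactions = [{"A", "B"}] * 4 + [{"A"}] + [{"B"}] * 5

rule_confidence = confidence(transactions, {"A"}, {"B"})
rule_lift = lift(transactions, {"A"}, {"B"})
print(rule_confidence, rule_lift)
```

The rule {A} => {B} has a confidence of 0.8, yet B already occurs in 90% of all transactions, so the lift is 0.8 / 0.9, which is below 1. Knowing A makes B slightly less likely, not more.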

Results and Validation of Rules Obtained
Set 1:
{Afternoon,December} => {Traffic: VEHICLE ACCIDENT}
{December,Evening} => {Traffic: VEHICLE ACCIDENT}
{December,Morning} => {Traffic: VEHICLE ACCIDENT}
{December} => {Traffic: VEHICLE ACCIDENT}

These rules indicate that a lot of accidents are likely to occur in December.
In winter (December, January, February), temperatures typically range between 28 °F (-2 °C) and 44 °F (7 °C). Sheets of ice form in spots throughout the county during winter, so the number of vehicle accidents is high. According to the National Highway Traffic Safety Administration (NHTSA), many fatalities involving an alcohol-impaired driver occur during the Christmas period. Such accidents might be reduced if breathalyzer use were more widespread.

Set 2:

These rules indicate that Norristown and Pottstown may not be safe at night and in the evening.
According to reports, although overall crime rates have been slowly and steadily declining for more than a decade, mirroring a national trend, Norristown remains stubbornly ahead of neighbouring Montgomery County townships in violent crimes. Possible reasons include a denser and less affluent population, the scourge of drugs, and the challenges presented by rapidly changing demographics.

Set 3:
{EMS: OVERDOSE,Sunday} => {Night}
{EMS: OVERDOSE,Saturday} => {Night}
{EMS: OVERDOSE,Friday} => {Night}

These rules indicate that drug overdoses may be more frequent on weekend nights, probably because people are more likely to party and take drugs at that time. Another possible cause of overdose: when a physician stops prescribing opioids or those drugs become too expensive, a patient may switch to heroin, which is relatively cheap and easy to obtain. Heroin is an opioid, a substance that reduces the intensity of pain signals in the body.
Rules indicate that Lower Merion, Cheltenham, Norristown and Pottstown (towns in Pennsylvania) get a significant number of drug overdose emergency calls.
The 2015 state-wide drug overdose death rate in Pennsylvania was 26 per 100,000 people, an increase from the reported 2014 rate of 21 per 100,000 people. According to the CDC, the national drug overdose death rate in 2014 (most recent available) was 14.7 per 100,000 people. According to reports, overdoses have drastically increased across Montgomery County over the past year. The county is undergoing the worst overdose epidemic in its history, according to information provided by the District Attorney’s Office.

Set 4:
{EMS: CVA/STROKE} => {Morning}

This rule indicates that strokes are likely to occur in the morning.
Research suggests that you are more likely to suffer a stroke in the early morning than at any other time, and that this increased risk is linked to the body’s natural rhythms.
Circadian rhythms seem to play a part in blood pressure, body temperature, and other body functions. (The circadian rhythm, present in humans and most other animals, is generated by an internal clock that is synchronized to light-dark cycles and other cues in an organism’s environment.) During the early morning, when blood pressure is higher, the risk of stroke appears to increase.

Set 5:
{EMS: FEVER} => {Night}
{EMS: FEVER} => {Morning}
{EMS: FEVER} => {Evening}

The first rule has higher confidence than the other two.
A possible reason for this pattern: body temperature usually follows a built-in 24-hour cycle. Its lowest point is between 3 and 6 a.m., followed by a peak between 4 and 11 p.m.
Two major factors regulate this cycle:
• The hypothalamus has its own 24-hour hormone-secretion pattern.
• The things the body does during the day (heartbeat, muscle movements, breathing) release heat energy, causing the core body temperature to rise as the day progresses.
This explains why your temperature increases toward the end of the day under normal conditions. The same cycle continues when you have a fever. The difference is that the elevation is more noticeable, since you are already starting from a higher temperature than normal.

Set 6:
{EMS: HEAT EXHAUSTION,Evening,Thursday} => {August}
{Afternoon,EMS: HEAT EXHAUSTION,Monday} => {July}
{EMS: HEAT EXHAUSTION,Evening,Monday} => {June}

Rules indicate that heat exhaustion is high in Abington in the months of June, July and August during the day.
Temperatures soar during the day in summer (June, July, August), leading to heat exhaustion. Educating people to practice heat safety measures will help reduce these emergencies: do not spend too much time in the hot sun, do outside work during early morning or late evening hours, wear lightweight, light coloured, loose fitting clothes, take frequent breaks in the shade, and keep well hydrated.

Tools Used
R and RStudio
• arules package is required for association rule mining.
• arulesViz package is useful for visualising the results.

The same approach can be useful to emergency response teams in any country that keep comparable call records.

This article is contributed by Brinda. M.