Time series data comes up constantly in data analysis, and Pandas is well suited to working with it.
Pandas provides a set of tools for performing common tasks on date-time data. The examples below walk through them.
Working with Dates in Pandas
The date class in Python's datetime module represents dates in the Gregorian calendar. It accepts three integer arguments: year, month, and day.
Python3
from datetime import date
d = date(2000, 9, 17)
print(d)
print(type(d))
|
Output:
2000-09-17
<class 'datetime.date'>
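As a small sketch of how a standard-library date relates to pandas (the variable names ts and delta are illustrative and not part of the example above), a date can be passed to pd.Timestamp, and subtracting two dates yields a timedelta:

```python
from datetime import date

import pandas as pd

d = date(2000, 9, 17)

# A datetime.date converts to a pandas Timestamp at midnight.
ts = pd.Timestamp(d)
print(ts)

# Subtracting two dates gives a datetime.timedelta.
delta = date(2000, 12, 31) - d
print(delta.days)
```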
Year, month, and day extraction
Retrieve the year, month, and day components from a Timestamp object.
Python3
import pandas as pd
timestamp = pd.Timestamp('2023-10-04 15:30:00')
year = timestamp.year
print(year)
month = timestamp.month
print(month)
day = timestamp.day
print(day)
|
Output:
2023
10
4
Weekdays and quarters
Determine the hour, minute, weekday, and quarter associated with a Timestamp. Note that weekday() counts from Monday as 0, so the 2 in the output below means Wednesday.
Python3
hour = timestamp.hour
print(hour)
minute = timestamp.minute
print(minute)
weekday = timestamp.weekday()
print(weekday)
quarter = timestamp.quarter
print(quarter)
|
Output:
15
30
2
4
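To make the weekday numbering and the quarter value easier to read, here is a minimal sketch. The names weekday_label and quarter_from_month are illustrative, and the month-based formula is only one way to arrive at the same quarter that pandas reports:

```python
import pandas as pd

timestamp = pd.Timestamp('2023-10-04 15:30:00')

# weekday() counts from Monday = 0, so Wednesday is 2.
weekday_number = timestamp.weekday()

# day_name() returns the weekday as a readable string.
weekday_label = timestamp.day_name()

# The quarter can also be derived from the month: months 10-12 fall in Q4.
quarter_from_month = (timestamp.month - 1) // 3 + 1

print(weekday_number, weekday_label, quarter_from_month)
```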
Working with Time in Pandas
Another class in the datetime module is time, which represents a time of day independent of any date. It takes integer arguments for hour, minute, second, and microsecond:
Python3
from datetime import time
t = time(12, 50, 12, 40)
print(t)
print(type(t))
|
Output:
12:50:12.000040
<class 'datetime.time'>
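A date and a time are often needed together. The sketch below, which reuses the values from the two examples above and introduces the illustrative names dt and ts, shows one way to combine them and hand the result to pandas:

```python
from datetime import date, datetime, time

import pandas as pd

d = date(2000, 9, 17)
t = time(12, 50, 12, 40)

# datetime.combine joins a date and a time into one datetime.
dt = datetime.combine(d, t)

# pandas accepts the resulting datetime directly.
ts = pd.Timestamp(dt)
print(ts)
```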
Time periods and date offsets
Create custom time periods and date offsets for flexible date manipulation.
Python3
# Monthly period that contains 2023-10-04
time_period = pd.Period('2023-10-04', freq='M')
year = time_period.year
print(year)
month = time_period.month
print(month)
quarter = time_period.quarter
print(quarter)
# Shift the Timestamp defined earlier by 2 years, 3 months, and 10 days
date_offset = pd.DateOffset(years=2, months=3, days=10)
new_timestamp = timestamp + date_offset
print(new_timestamp)
|
Output:
2023
10
4
2026-01-14 15:30:00
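Two details of periods and offsets are worth seeing in isolation: a Period covers a span with a start and an end, and a DateOffset follows the calendar rather than adding a fixed number of days. The following is a minimal sketch with illustrative variable names:

```python
import pandas as pd

period = pd.Period('2023-10', freq='M')

# A Period covers a span of time, with a start and an end.
print(period.start_time)
print(period.end_time)

# Adding an integer moves by whole periods.
next_period = period + 1

# DateOffset follows the calendar: adding one month to 31 January
# lands on the last valid day of February.
shifted = pd.Timestamp('2023-01-31') + pd.DateOffset(months=1)
print(next_period, shifted)
```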
Handling Time Zones
Time zones play a crucial role in date and time data. Pandas provides mechanisms to handle time zones effectively:
- UTC and time zone conversion: Convert between UTC (Coordinated Universal Time) and local time zones.
- Time zone-aware data manipulation: Work with time zone-aware data, ensuring accurate date and time interpretations.
- Custom time zone settings: Specify custom time zone settings for data analysis and visualization.
Python3
import pandas as pd
# Create a time zone-aware Timestamp
timestamp = pd.Timestamp('2023-10-04 15:30:00',
                         tz='America/New_York')
print('Original Timestamp:', timestamp)
# tz_convert changes the time zone of an aware Timestamp
utc_timestamp = timestamp.tz_convert('UTC')
print('UTC Timestamp:', utc_timestamp)
original_timestamp = utc_timestamp.tz_convert('America/New_York')
print('Original Timestamp (Back to America/New_York):', original_timestamp)
# The same methods work on a DatetimeIndex
datetime_index = pd.DatetimeIndex(['2023-10-04',
                                   '2023-10-11',
                                   '2023-10-18'],
                                  tz='Asia/Shanghai')
print('Original DatetimeIndex:', datetime_index)
utc_datetime_index = datetime_index.tz_convert('UTC')
print('UTC DatetimeIndex:', utc_datetime_index)
original_datetime_index = utc_datetime_index.tz_convert('Asia/Shanghai')
print('Original DatetimeIndex (Back to Asia/Shanghai):', original_datetime_index)
|
Output:
Original Timestamp: 2023-10-04 15:30:00-04:00
UTC Timestamp: 2023-10-04 19:30:00+00:00
Original Timestamp (Back to America/New_York): 2023-10-04 15:30:00-04:00
Original DatetimeIndex: DatetimeIndex(['2023-10-04 00:00:00+08:00', '2023-10-11 00:00:00+08:00',
'2023-10-18 00:00:00+08:00'],
dtype='datetime64[ns, Asia/Shanghai]', freq=None)
UTC DatetimeIndex: DatetimeIndex(['2023-10-03 16:00:00+00:00', '2023-10-10 16:00:00+00:00',
'2023-10-17 16:00:00+00:00'],
dtype='datetime64[ns, UTC]', freq=None)
Original DatetimeIndex (Back to Asia/Shanghai): DatetimeIndex(['2023-10-04 00:00:00+08:00', '2023-10-11 00:00:00+08:00',
'2023-10-18 00:00:00+08:00'],
dtype='datetime64[ns, Asia/Shanghai]', freq=None)
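The difference between tz_localize and tz_convert is a common source of confusion, so here is a minimal sketch of it. The variable names are illustrative; the example starts from a naive Timestamp, that is, one with no time zone attached:

```python
import pandas as pd

naive = pd.Timestamp('2023-10-04 15:30:00')

# tz_localize attaches a zone; the wall-clock time is unchanged.
new_york = naive.tz_localize('America/New_York')

# tz_convert moves to another zone; the instant in time is unchanged.
utc = new_york.tz_convert('UTC')
shanghai = new_york.tz_convert('Asia/Shanghai')

print(new_york, utc, shanghai, sep='\n')

# Calling tz_localize on an already-aware Timestamp raises TypeError.
try:
    new_york.tz_localize('UTC')
except TypeError as exc:
    print('TypeError:', exc)
```

In short, tz_localize answers "which zone was this clock reading taken in?", while tz_convert answers "what does the same moment look like elsewhere?".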
Working with Date and Time in Pandas
Pandas provides convenient ways to generate ranges of dates and to extract specific date and time components from them. The steps below show how:
Step-1: Create a range of hourly timestamps (a DatetimeIndex)
Python3
import pandas as pd
# 'H' means hourly; pandas 2.2 and later prefer the lowercase alias 'h'
data = pd.date_range('1/1/2011', periods=10, freq='H')
data
|
Output:
DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
'2011-01-01 02:00:00', '2011-01-01 03:00:00',
'2011-01-01 04:00:00', '2011-01-01 05:00:00',
'2011-01-01 06:00:00', '2011-01-01 07:00:00',
'2011-01-01 08:00:00', '2011-01-01 09:00:00'],
dtype='datetime64[ns]', freq='H')
Step-2: Create a range of dates and show basic features
Python3
data = pd.date_range('1/1/2011', periods=10, freq='H')
# pd.datetime has been removed from pandas; Timestamp.now() gives the current time
x = pd.Timestamp.now()
x.month, x.year
|
Output:
(9, 2018)
Datetime features fall into two categories: moments within a period (such as the hour of the day or the month of the year) and the time elapsed since a particular point. Both can be very useful for understanding patterns in the data.
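The two categories can be sketched side by side as follows. The sample dates and the reference point are hypothetical values chosen only for illustration:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime([
    '2011-01-01 00:00', '2011-01-01 06:00', '2011-01-03 12:00',
]))

# Category 1: a moment within a period (here, the hour within the day).
hour_of_day = dates.dt.hour

# Category 2: time passed since a reference point (hypothetical choice).
reference = pd.Timestamp('2011-01-01')
hours_elapsed = (dates - reference).dt.total_seconds() / 3600

print(pd.DataFrame({
    'date': dates,
    'hour_of_day': hour_of_day,
    'hours_elapsed': hours_elapsed,
}))
```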
Step-3: Divide a given date into features
pandas.Series.dt.year returns the year of the date time.
pandas.Series.dt.month returns the month of the date time.
pandas.Series.dt.day returns the day of the date time.
pandas.Series.dt.hour returns the hour of the date time.
pandas.Series.dt.minute returns the minute of the date time.
For the full list of datetime properties, refer to the pandas documentation for the Series.dt accessor.
Break date and time into separate features
Python3
rng = pd.DataFrame()
rng['date'] = pd.date_range('1/1/2011', periods=72, freq='H')
rng[:5]
rng['year'] = rng['date'].dt.year
rng['month'] = rng['date'].dt.month
rng['day'] = rng['date'].dt.day
rng['hour'] = rng['date'].dt.hour
rng['minute'] = rng['date'].dt.minute
rng.head(3)
|
Output:
                  date  year  month  day  hour  minute
0  2011-01-01 00:00:00  2011      1    1     0       0
1  2011-01-01 01:00:00  2011      1    1     1       0
2  2011-01-01 02:00:00  2011      1    1     2       0
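The same dt accessor can produce derived features as well. As a minimal sketch (the is_weekend column is an illustrative addition, and the three dates are a shortened stand-in for the hourly range above), a weekend flag can be built from dayofweek:

```python
import pandas as pd

rng = pd.DataFrame({
    'date': pd.to_datetime(['2011-01-01', '2011-01-02', '2011-01-03']),
})

# dayofweek counts from Monday = 0, so 5 and 6 are Saturday and Sunday.
rng['dayofweek'] = rng['date'].dt.dayofweek
rng['day_name'] = rng['date'].dt.day_name()
rng['is_weekend'] = rng['dayofweek'] >= 5
print(rng)
```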
Step-4: To get the present time, use Timestamp.now(). The resulting Timestamp can be converted to a standard datetime with to_pydatetime(), and its year, month, or day can be accessed directly.
Python3
t = pd.Timestamp.now()
t
# Convert the pandas Timestamp to a standard-library datetime
t.to_pydatetime()
|
Output:
Timestamp('2018-09-18 17:18:49.101496')
datetime.datetime(2018, 9, 18, 17, 18, 49, 101496)
Step-5: Extract specific components of the timestamp, such as the year, month, day, hour, minute, and second, for further analysis. The values below correspond to the timestamp t shown in Step-4.
Python3
t.year
t.month
t.day
t.hour
t.minute
t.second
|
Output:
2018
9
18
17
18
49
Exploring UFO Sightings Over Time
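Let's apply these techniques to a real dataset, uforeports, which records reported UFO sightings.
Python3
import pandas as pd
# url must point to the uforeports CSV file (its location is not given here)
df = pd.read_csv(url)
df.head()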
|
Output:
                   City  Colors Reported  Shape Reported  State             Time
0                Ithaca              NaN        TRIANGLE     NY   6/1/1930 22:00
1           Willingboro              NaN           OTHER     NJ  6/30/1930 20:00
2               Holyoke              NaN            OVAL     CO  2/15/1931 14:00
3               Abilene              NaN            DISK     KS   6/1/1931 13:00
4  New York Worlds Fair              NaN           LIGHT     NY  4/18/1933 19:00
Convert the Time column from strings to the datetime format with pd.to_datetime():
Python3
df['Time'] = pd.to_datetime(df.Time)
df.head()
|
Output:
City Colors Reported Shape Reported State \
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
3 Abilene NaN DISK KS
4 New York Worlds Fair NaN LIGHT NY
Time
0 1930-06-01 22:00:00
1 1930-06-30 20:00:00
2 1931-02-15 14:00:00
3 1931-06-01 13:00:00
4 1933-04-18 19:00:00
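When the layout of the date strings is known, it can help to state it explicitly. The sketch below is not part of the original analysis: it uses two values in the same month/day/year layout as the Time column, plus one invalid entry added purely for illustration, and shows the format and errors parameters of pd.to_datetime:

```python
import pandas as pd

# Sample values in the same month/day/year layout as the Time column,
# plus one invalid entry added for illustration.
raw = pd.Series(['6/1/1930 22:00', '6/30/1930 20:00', 'not a date'])

# An explicit format avoids guessing; errors='coerce' turns
# unparseable values into NaT instead of raising an exception.
parsed = pd.to_datetime(raw, format='%m/%d/%Y %H:%M', errors='coerce')
print(parsed)
```

Coercing to NaT keeps the column in datetime format, so the dt accessor still works and the bad rows can be inspected or dropped later.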
df.dtypes displays the data type of each column in the DataFrame, confirming that Time is now datetime64[ns]:
Output:
City object
Colors Reported object
Shape Reported object
State object
Time datetime64[ns]
dtype: object
df.Time.dt.hour.head() extracts the hour from each value in the Time column:
Output:
0 22
1 20
2 14
3 13
4 19
Name: Time, dtype: int64
Retrieve the weekday name for each value in the Time column. Older pandas versions exposed this as the weekday_name attribute, which has since been removed in favor of the day_name() method.
Python3
df.Time.dt.day_name().head()
|
Output:
0 Sunday
1 Monday
2 Sunday
3 Monday
4 Tuesday
Name: Time, dtype: object
Retrieve the ordinal day of the year for each value in the Time column.
Python3
df.Time.dt.dayofyear.head()
|
Output:
0 152
1 181
2 46
3 152
4 108
Name: Time, dtype: int64
Create a histogram to explore the frequency of UFO sightings by hour of the day.
Python3
import matplotlib.pyplot as plt
# Ensure Time is datetime, then extract the hour of each sighting
df['Time'] = pd.to_datetime(df.Time)
df['Hour'] = df['Time'].dt.hour
plt.figure(figsize=(10, 6))
# One bin per hour of the day
plt.hist(df['Hour'], bins=24, range=(0, 24), edgecolor='black', alpha=0.7)
plt.xlabel('Hour of the Day')
plt.ylabel('Number of UFO Sightings')
plt.title('UFO Sightings by Hour of the Day')
plt.xticks(range(0, 25))
plt.grid(True)
plt.show()
|
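Output:
[Figure: histogram "UFO Sightings by Hour of the Day"; x-axis: Hour of the Day; y-axis: Number of UFO Sightings]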
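The counts behind such a histogram can also be inspected as numbers. The following sketch assumes only the five Time values shown in the sample rows earlier, not the full dataset, and the name hour_counts is illustrative:

```python
import pandas as pd

# The five Time values shown in the sample rows of the dataset.
times = pd.to_datetime(pd.Series([
    '1930-06-01 22:00', '1930-06-30 20:00', '1931-02-15 14:00',
    '1931-06-01 13:00', '1933-04-18 19:00',
]))

# Count sightings per hour and fill in hours with no sightings.
hour_counts = times.dt.hour.value_counts().reindex(range(24), fill_value=0)

# Show only the hours that have at least one sighting.
print(hour_counts[hour_counts > 0])
```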
Conclusion
Working with date and time data is an essential skill for data analysts and scientists. Pandas provides a comprehensive set of tools and techniques for effectively handling date and time information, enabling insightful analysis of time-dependent data. By mastering these techniques, you can gain valuable insights from time series data and make informed decisions in various domains.
Last Updated :
05 Dec, 2023