Pandas | Basics of Time Series Manipulation

Although scikit-learn also provides some utilities for time series, pandas offers a more complete set of features for working with them. With pandas we can attach a date and time to every record of a dataframe and then retrieve records by timestamp, for example all the data within a certain range of dates and times. Let's discuss the main objectives of time series analysis in pandas.

Objectives of time series analysis

  • Create a series of dates
  • Work with timestamp data
  • Convert string data to timestamps
  • Slice data using timestamps
  • Resample a time series to get aggregates/summary statistics for different time periods
  • Work with missing data

Now, let’s do some practical analysis on some data to demonstrate the use of pandas time series.

Code #1:

import pandas as pd
from datetime import datetime
import numpy as np
  
# One timestamp per minute from 1 January 2019 to 8 January 2019
range_date = pd.date_range(start ='1/1/2019', end ='1/08/2019',
                                                   freq ='min')
print(range_date)

Output:

DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 00:01:00',
               '2019-01-01 00:02:00', '2019-01-01 00:03:00',
               '2019-01-01 00:04:00', '2019-01-01 00:05:00',
               '2019-01-01 00:06:00', '2019-01-01 00:07:00',
               '2019-01-01 00:08:00', '2019-01-01 00:09:00',
               ...
               '2019-01-07 23:51:00', '2019-01-07 23:52:00',
               '2019-01-07 23:53:00', '2019-01-07 23:54:00',
               '2019-01-07 23:55:00', '2019-01-07 23:56:00',
               '2019-01-07 23:57:00', '2019-01-07 23:58:00',
               '2019-01-07 23:59:00', '2019-01-08 00:00:00'],
              dtype='datetime64[ns]', length=10081, freq='T')

Explanation:
In this code, we create one timestamp per minute for the date range from 1 January 2019 to 8 January 2019. The frequency can be changed to hours, minutes or seconds. This function helps you track records that are stored once per minute. As the output shows, the resulting DatetimeIndex has 10081 entries: 7 days of 1440 minutes each, plus one because both endpoints are included. Note that pandas stores these values with the data type datetime64[ns].
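
To illustrate how the frequency changes the number of generated timestamps, here is a minimal sketch over the same date range. It assumes the lowercase aliases 'h', 'min' and 's' are accepted by the installed pandas version (older releases also spell them 'H', 'T' and 'S'); the variable names are only illustrative.

```python
import pandas as pd

start, end = "2019-01-01", "2019-01-08"

# Same seven-day window, three different frequencies.
hourly = pd.date_range(start=start, end=end, freq="h")
per_minute = pd.date_range(start=start, end=end, freq="min")
per_second = pd.date_range(start=start, end=end, freq="s")

# Both endpoints are included, hence the "+ 1" in each count.
print(len(hourly))      # 7 * 24 + 1
print(len(per_minute))  # 7 * 24 * 60 + 1
print(len(per_second))  # 7 * 24 * 60 * 60 + 1
```

A coarser frequency gives a much shorter index, so it is worth picking the coarsest one that still matches how often the data is recorded.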

Code #2:

import pandas as pd
from datetime import datetime
import numpy as np
  
range_date = pd.date_range(start ='1/1/2019', end ='1/08/2019',
                                                   freq ='min')
# Type of a single element of the index
print(type(range_date[110]))

Output:

<class 'pandas._libs.tslibs.timestamps.Timestamp'>

Explanation:
Here we check the type of a single element of range_date, not of range_date itself. Each entry of the index is a pandas Timestamp.
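
As a small sketch of the difference between the container and its elements, the snippet below prints both types and reads a couple of fields from the element at position 110. The name ts is only illustrative.

```python
import pandas as pd

range_date = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")

# The container is a DatetimeIndex; each element is a Timestamp.
print(type(range_date).__name__)
ts = range_date[110]
print(type(ts).__name__)

# Position 110 is 110 minutes after midnight on 1 January 2019.
print(ts.hour, ts.minute)
```

A Timestamp exposes fields such as year, month, day, hour and minute directly, which is convenient when filtering or grouping records.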

Code #3:

import pandas as pd
from datetime import datetime
import numpy as np
  
range_date = pd.date_range(start ='1/1/2019', end ='1/08/2019',
                                                   freq ='Min')
  
df = pd.DataFrame(range_date, columns =['date'])
df['data'] = np.random.randint(0, 100, size =(len(range_date)))
  
print(df.head(10))

Output:

                  date  data
0 2019-01-01 00:00:00    49
1 2019-01-01 00:01:00    58
2 2019-01-01 00:02:00    48
3 2019-01-01 00:03:00    96
4 2019-01-01 00:04:00    42
5 2019-01-01 00:05:00     8
6 2019-01-01 00:06:00    20
7 2019-01-01 00:07:00    96
8 2019-01-01 00:08:00    48
9 2019-01-01 00:09:00    78

Explanation:

We first create the time series and convert it into a dataframe with a single column, date. We then add a second column, data, filled with random integers from 0 to 99 generated by np.random.randint, and print the first ten rows to check the result. Because the values are random, your output will differ.
For time series manipulation the dataframe should be indexed on the timestamp. At this point date is still an ordinary column and the dataframe uses the default integer index.
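
A minimal sketch of indexing the dataframe on its timestamp column follows. The seeded generator (np.random.default_rng(0)) and the name indexed are assumptions made here for reproducibility and illustration; they are not part of the original code.

```python
import numpy as np
import pandas as pd

range_date = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")

# Assumed seed, used only so that repeated runs give the same values.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": range_date,
    "data": rng.integers(0, 100, size=len(range_date)),
})

# Move the 'date' column into the index.
indexed = df.set_index("date")
print(type(indexed.index).__name__)
print(indexed.head(3))
```

Once the index is a DatetimeIndex, rows can be selected by date strings and the data can be resampled by time period.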

Code #4:

import pandas as pd
from datetime import datetime
import numpy as np
  
range_date = pd.date_range(start ='1/1/2019', end ='1/08/2019',
                                                  freq ='Min')
  
df = pd.DataFrame(range_date, columns =['date'])
df['data'] = np.random.randint(0, 100, size =(len(range_date)))
  
string_data = [str(x) for x in range_date]
print(string_data[1:11])

Output:

['2019-01-01 00:01:00', '2019-01-01 00:02:00', '2019-01-01 00:03:00', '2019-01-01 00:04:00', '2019-01-01 00:05:00', '2019-01-01 00:06:00', '2019-01-01 00:07:00', '2019-01-01 00:08:00', '2019-01-01 00:09:00', '2019-01-01 00:10:00']

Explanation:
This code converts each element of range_date to a string with a list comprehension and stores the result in the list string_data. Because the list is long, we slice it with [1:11] and print ten values, starting from the second element. Note that date_range does not always need both a start and an end date: it needs any three of the four parameters start, end, periods and freq, so a start date combined with periods and freq also works.
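
Going the other way, from strings back to timestamps, can be sketched with pd.to_datetime. The sample strings and the names parsed and day_first are illustrative assumptions.

```python
import pandas as pd

string_data = [
    "2019-01-01 00:01:00",
    "2019-01-01 00:02:00",
    "2019-01-01 00:03:00",
]

# Parse a list of strings into a DatetimeIndex.
parsed = pd.to_datetime(string_data)
print(parsed)

# An explicit format removes ambiguity for strings such as 08/01/2019,
# which could otherwise be read as 1 August or 8 January.
day_first = pd.to_datetime("08/01/2019", format="%d/%m/%Y")
print(day_first)
```

Passing an explicit format is usually the safer choice when the input does not follow the ISO year-month-day layout.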

Example:

import pandas as pd
from datetime import datetime
import numpy as np
  
range_data = pd.date_range(start ='1/1/2019', end ='1/08/2019',
                                                  freq ='min')
  
df = pd.DataFrame(range_data, columns =['date'])
df['data'] = np.random.randint(0, 100, size =(len(range_data)))
  
# Index the dataframe on the timestamp so rows can be selected by date
df['datetime'] = pd.to_datetime(df['date'])
df = df.set_index('datetime')
df.drop(['date'], axis = 1, inplace = True)
  
# .loc with a date string selects every row from that day;
# [1:11] then keeps rows 2 to 11 of that day
print(df.loc['2019-01-05'][1:11])

Output:

                     data
datetime                 
2019-01-05 00:01:00    99
2019-01-05 00:02:00    21
2019-01-05 00:03:00    29
2019-01-05 00:04:00    98
2019-01-05 00:05:00     0
2019-01-05 00:06:00    72
2019-01-05 00:07:00    69
2019-01-05 00:08:00    53
2019-01-05 00:09:00     3
2019-01-05 00:10:00    37
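
The objectives listed at the start also mention slicing by timestamp, resampling and missing data. As a rough sketch, assuming the same kind of minute-level dataframe as in the examples (the seed, the dropped hour and all variable names below are illustrative, not taken from the original code):

```python
import numpy as np
import pandas as pd

idx = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")
rng = np.random.default_rng(0)  # assumed seed for reproducibility
df = pd.DataFrame({"data": rng.integers(0, 100, size=len(idx))}, index=idx)

# Slice between two timestamps; with .loc both endpoints are included.
first_minutes = df.loc["2019-01-05 00:00":"2019-01-05 00:10"]
print(len(first_minutes))

# Resample the per-minute values into one mean value per calendar day.
daily_mean = df["data"].resample("D").mean()
print(daily_mean)

# Simulate missing data by dropping one hour of rows, then restore the
# full index (the gap becomes NaN) and fill it with the last known value.
with_gap = df.drop(df.loc["2019-01-02 00:00":"2019-01-02 00:59"].index)
restored = with_gap.reindex(idx)
print(restored["data"].isna().sum())
filled = restored.ffill()
print(filled["data"].isna().sum())
```

Forward filling is only one option for gaps; dropping the rows with dropna or interpolating between neighbours may suit other data better.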

