Tuesday 5 January 2021

Subset df using timestamps - Python

I'm hoping to subset a df using specific timestamps plus an additional period of time. Using below, df contains specific timestamps that I want to use to subset df2. Essentially, I use the timestamps in df and determine the previous minute. These periods of time are then used to create individual df's, which are concatenated together to create the final df.

However, this is inefficient by itself, but becomes even more so when dealing with multiple times.

import pandas as pd

df = pd.DataFrame({   
        'Time' : ['2020-08-02 10:01:12.5','2020-08-02 11:01:12.5','2020-08-02 12:31:00.0','2020-08-02 12:41:22.6'],             
        'ID' : ['X','Y','B','X'],                 
    })

# 1 min before timestamp
'2020-08-02 10:00:12.5' 
# first timestamp
'2020-08-02 10:01:12.5' 

# 1 min before timestamp
'2020-08-02 11:00:02.1' 
# second timestamp
'2020-08-02 11:01:02.1' 

 df2 = pd.DataFrame({   
        'Time' : ['2020-08-02 10:00:00.1','2020-08-02 10:00:00.2','2020-08-02 10:00:00.3','2020-08-02 10:00:00.4'],             
        'ID' : ['','','',''],                 
    })

d1 = df2[(df2['Time'] > '2020-08-02 10:00:12.5') & (df2['Time'] <= '2020-08-02 10:01:12.5')]
d2 = df2[(df2['Time'] > '2020-08-02 11:00:02.1') & (df2['Time'] <= '2020-08-02 11:01:02.1')]

df_out = pd.concat([d1,d2])#...include all separate periods of time

Intended Output:

                    Time ID
   2020-08-02 10:00:12.5  
   2020-08-02 10:00:12.6  
...
   2020-08-02 11:01:12.5  X
   2020-08-02 11:00:02.1
   2020-08-02 11:00:02.2
...
   2020-08-02 11:01:02.1  Y


from Subset df using timestamps - Python

No comments:

Post a Comment