Saturday 24 October 2020

Augmenting Time Series Data for Deep Learning

If I want to apply deep learning to the dataset from the sensors I currently possess, I will need quite a lot of data, or I may see overfitting. Unfortunately, the sensors have only been active for a month, so the data needs to be augmented. I currently have the data in a dataframe, which can be seen below:

index   timestamp              cas_pre        fl_rat         ...
0       2017-04-06 11:25:00    687.982849     1627.040283    ...
1       2017-04-06 11:30:00    693.427673     1506.217285    ...
2       2017-04-06 11:35:00    692.686310     1537.114807    ...
....
101003  2017-04-06 11:35:00    692.686310     1537.114807    ...

Now I want to augment some particular columns with the tsaug package. The augmentation can be in the form of:

my_aug = (    
    RandomMagnify(max_zoom=1.2, min_zoom=0.8) * 2
    + RandomTimeWarp() * 2
    + RandomJitter(strength=0.1) @ 0.5
    + RandomTrend(min_anchor=-0.5, max_anchor=0.5) @ 0.5
)

The docs for the augmentation library then apply the augmenter in the manner below:

X_aug, Y_aug = my_aug.run(X, Y)

Upon further investigation on this site, it seems that the augmentation operates on numpy arrays. While the library states that the augmentation is multivariate, I am not really sure how that actually works.

I would like to apply the same augmentation consistently across the float columns such as cas_pre and fl_rat, so as not to diverge too much from the original data and the relationships between the columns. I would not like to apply it to columns such as timestamp. I am not sure how to do this within Pandas.
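
Something like the sketch below is what I have in mind, assuming tsaug accepts multivariate input as a (series, time steps, channels) array and that run() returns just the augmented array when no labels are passed (neither of which I have verified):

import numpy as np
import pandas as pd

# Columns to augment; timestamp (and the index) are left out of the augmentation.
float_cols = ["cas_pre", "fl_rat"]

# Treat the recording as one multivariate series of shape (1, T, C):
# 1 series, T time steps, C channels. This shape convention is an assumption
# based on tsaug's description of multivariate input.
X = df[float_cols].to_numpy(dtype=float)[np.newaxis, :, :]

# Assumption: with no labels passed, run() returns only the augmented series
# (the docs snippet above unpacks (X_aug, Y_aug) when Y is supplied).
X_aug = my_aug.run(X)

# Each entry along the first axis is one augmented multivariate series whose
# channels were transformed together, so cas_pre and fl_rat are perturbed
# consistently relative to each other.
aug_frames = [pd.DataFrame(series, columns=float_cols) for series in X_aug]

The timestamp column is deliberately kept out of X; if an augmented series keeps the original length, the original timestamps could simply be re-attached to the corresponding frame afterwards.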



from Augmenting Time Series Data for Deep Learning
