I'm trying to calculate the entropy over a pandas series. Specifically, I group the strings in Direction
as a sequence. Specifically, using this function:
diff_dir = df.iloc[0:,1].ne(df.iloc[0:,1].shift()).cumsum()
will return the count of strings in Direction
that are the same until a change. So for each sequence of the same Direction
string, I want to calculate the entropy of X,Y
.
Using the code the sequencing of the same string is:
0 1
1 1
2 1
3 1
4 1
5 2
6 2
7 2
8 3
9 3
This code used to work but it's now returning an error. I'm not sure if this was after an upgrade.
import pandas as pd
import numpy as np
def ApEn(U, m = 2, r = 0.2):
'''
Approximate Entropy
Quantify the amount of regularity over time-series data.
Input parameters:
U = Time series
m = Length of compared run of data (subseries length)
r = Filtering level (tolerance). A positive number
'''
def _maxdist(x_i, x_j):
return max([abs(ua - va) for ua, va in zip(x_i, x_j)])
def _phi(m):
x = [U.tolist()[i:i + m] for i in range(N - m + 1)]
C = [len([1 for x_j in x if _maxdist(x_i, x_j) <= r]) / (N - m + 1.0) for x_i in x]
return (N - m + 1.0)**(-1) * sum(np.log(C))
N = len(U)
return abs(_phi(m + 1) - _phi(m))
def Entropy(df):
'''
Calculate entropy for individual direction
'''
df = df[['Time','Direction','X','Y']]
diff_dir = df.iloc[0:,1].ne(df.iloc[0:,1].shift()).cumsum()
# Calculate ApEn grouped by direction.
df['ApEn_X'] = df.groupby(diff_dir)['X'].transform(ApEn)
df['ApEn_Y'] = df.groupby(diff_dir)['Y'].transform(ApEn)
return df
df = pd.DataFrame(np.random.randint(0,50, size = (10, 2)), columns=list('XY'))
df['Time'] = range(1, len(df) + 1)
direction = ['Left','Left','Left','Left','Left','Right','Right','Right','Left','Left']
df['Direction'] = direction
# Calculate defensive regularity
entropy = Entropy(df)
Error:
return (N - m + 1.0)**(-1) * sum(np.log(C))
ZeroDivisionError: 0.0 cannot be raised to a negative power
from Error calculating entropy over pandas series
No comments:
Post a Comment