Saturday 26 September 2020

Error calculating entropy over pandas series

I'm trying to calculate the entropy over a pandas series. Specifically, I group consecutive identical strings in Direction into runs, using this expression:

diff_dir = df.iloc[0:,1].ne(df.iloc[0:,1].shift()).cumsum()

This assigns a run ID that increments each time the Direction string changes, so every row in the same unbroken run of a Direction string shares one ID. For each of these runs, I want to calculate the entropy of X and Y.

For the example data further down, the run IDs produced by this expression are:

0    1
1    1
2    1
3    1
4    1
5    2
6    2
7    2
8    3
9    3
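As a minimal sketch of how that run labelling behaves, the snippet below rebuilds the Direction column from the example and applies the same ne/shift/cumsum expression step by step. The names changed, run_ids and run_lengths are illustrative choices, not part of the original code.

```python
import pandas as pd

direction = ['Left', 'Left', 'Left', 'Left', 'Left',
             'Right', 'Right', 'Right', 'Left', 'Left']
s = pd.Series(direction, name='Direction')

# True wherever a value differs from the previous row. The first row is
# compared with a missing value, so it is always flagged as a change.
changed = s.ne(s.shift())

# The cumulative sum of those flags gives each consecutive run its own ID.
run_ids = changed.cumsum()

# Number of rows in each run, keyed by run ID.
run_lengths = run_ids.value_counts().sort_index()

print(run_ids.tolist())
print(run_lengths.to_dict())
```

Note that the last run contains only two rows, which matters for any calculation that needs a minimum number of samples per run.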

This code used to work, but it now raises an error. I'm not sure whether that started after an upgrade.

import pandas as pd
import numpy as np

def ApEn(U, m = 2, r = 0.2):

    '''
    Approximate Entropy 

    Quantify the amount of regularity over time-series data.

    Input parameters:
    
    U = Time series
    m = Length of compared run of data (subseries length)
    r = Filtering level (tolerance). A positive number

    '''

    def _maxdist(x_i, x_j):
        return max([abs(ua - va) for ua, va in zip(x_i, x_j)])

    def _phi(m):
        x = [U.tolist()[i:i + m] for i in range(N - m + 1)] 
        C = [len([1 for x_j in x if _maxdist(x_i, x_j) <= r]) / (N - m + 1.0) for x_i in x]
        return (N - m + 1.0)**(-1) * sum(np.log(C))

    N = len(U)

    return abs(_phi(m + 1) - _phi(m))

def Entropy(df):

    '''
    Calculate entropy for individual direction
    '''

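    # Take a copy so the column assignments below do not write to a slice of the caller's frame.
    df = df[['Time','Direction','X','Y']].copy()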
                                    
    diff_dir = df.iloc[0:,1].ne(df.iloc[0:,1].shift()).cumsum()

    # Calculate ApEn grouped by direction. 
    df['ApEn_X'] = df.groupby(diff_dir)['X'].transform(ApEn)
    df['ApEn_Y'] = df.groupby(diff_dir)['Y'].transform(ApEn)                 

    return df


df = pd.DataFrame(np.random.randint(0,50, size = (10, 2)), columns=list('XY'))
df['Time'] = range(1, len(df) + 1)

direction = ['Left','Left','Left','Left','Left','Right','Right','Right','Left','Left']
df['Direction'] = direction


# Calculate defensive regularity
entropy = Entropy(df)

Error:

return (N - m + 1.0)**(-1) * sum(np.log(C))
ZeroDivisionError: 0.0 cannot be raised to a negative power
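
One plausible reading of the traceback, offered as a sketch and not a confirmed diagnosis: ApEn calls _phi(m + 1), which builds N - (m + 1) + 1 windows. The last run in the example has only two rows, so with the default m = 2 that count is 0, and the plain Python float expression 0.0 ** (-1) raises ZeroDivisionError. The version below returns NaN for runs shorter than m + 1. The name apen_safe, the choice of NaN as the fallback, and the X values in the usage lines are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def apen_safe(U, m=2, r=0.2):
    """Approximate entropy; returns NaN when the run is shorter than m + 1."""
    values = list(U)
    N = len(values)
    if N < m + 1:
        # _phi(m + 1) would have zero windows to compare, so there is
        # nothing to compute. NaN is an assumed fallback, not the original behaviour.
        return np.nan

    def _maxdist(x_i, x_j):
        return max(abs(ua - va) for ua, va in zip(x_i, x_j))

    def _phi(k):
        n_windows = N - k + 1
        x = [values[i:i + k] for i in range(n_windows)]
        # Fraction of windows within tolerance r of each window. Every
        # window matches itself, so each fraction is strictly positive.
        C = [sum(1 for x_j in x if _maxdist(x_i, x_j) <= r) / n_windows
             for x_i in x]
        return sum(np.log(C)) / n_windows

    return abs(_phi(m + 1) - _phi(m))

# Usage with made-up X values (not from the original post).
df = pd.DataFrame({
    'Direction': ['Left'] * 5 + ['Right'] * 3 + ['Left'] * 2,
    'X': [3, 7, 1, 9, 4, 2, 8, 5, 6, 0],
})
run_ids = df['Direction'].ne(df['Direction'].shift()).cumsum()
df['ApEn_X'] = df.groupby(run_ids)['X'].transform(apen_safe)
print(df)
```

Whether NaN is the right result for a short run depends on the analysis. Dropping those runs, or choosing a smaller m, are alternatives.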

