Tuesday 3 August 2021

scipy.stats cdf greater than 1

I'm using scipy.stats and I need the CDF up to a given value x for some distributions, I know PDFs can be greater than 1 because they are not probabilities but densities so they should integrate to 1 even if specific values are greater, but CDFs should never be greater than 1 and when running the cdf function on scipy.stats sometimes I get values like 2.89, i'm completely sure i'm using cdf and not pdf(that was my first guess), this is messing my results and algorithm because I need accumulated probabilities, why is scipy.stats cdf returning values greater than 1 and/or how should I proceed to fix it?

Code for reproducing the issue with a sample distribution and parameters(but it happens with others too):

from scipy import stats
distribution = stats.gausshyper
params = [9.482986347673158, 16.65813644507513, -38.11083665959626, 16.08698932118982, -13.387170754433273, 18.352117022674125]
test_val = [-0.512720,1,1]

arg = params[:-2]
loc = params[-2]
scale = params[-1]

print("cdf:",distribution.cdf(test_val,*arg, loc=loc,scale=scale))
print("pdf:",distribution.pdf(test_val,*arg, loc=loc,scale=scale))

cdf: [2.68047481 7.2027761 7.2027761 ] pdf: [2.76857133 2.23996739 2.23996739]



from scipy.stats cdf greater than 1

No comments:

Post a Comment