Thursday 22 October 2020

Scipy - How to fit this beta distribution using Python Scipy Curve Fit

I am fairly new to curve_fit in SciPy. I have many distributions: some look like y and some do not. Most of the ones that look like y are beta distributions. My approach is to fit the beta function to each of my unique IDs, which have varying distributions, and collect the fitted coefficients. If I then look for coefficients that are close in magnitude, I can effectively filter out all distributions that look like y.
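A minimal sketch of the filtering step I have in mind, assuming each ID has already been fitted and reduced to a pair of beta shape coefficients (a, b). The function name `similar_ids`, the tolerance `rel_tol`, and the coefficient values are all hypothetical placeholders, not results from my data.

```python
import numpy as np


def similar_ids(params_by_id, reference, rel_tol=0.25):
    """Return the IDs whose coefficients are all within rel_tol of the reference."""
    ref = np.asarray(reference, dtype=float)
    keep = []
    for uid, params in params_by_id.items():
        # Relative difference of each coefficient from the reference fit
        rel_diff = np.abs(np.asarray(params, dtype=float) - ref) / np.abs(ref)
        if np.all(rel_diff <= rel_tol):
            keep.append(uid)
    return keep


# Made-up (a, b) pairs standing in for per-ID curve_fit results
fitted = {"id_1": (8.1, 3.0), "id_2": (7.6, 3.3), "id_3": (1.2, 9.5)}
matches = similar_ids(fitted, reference=(8.0, 3.0))
print(matches)
```

A relative tolerance is only one possible notion of "close in magnitude"; a distance in (a, b) space or a clustering step could be swapped in.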

y looks like this (same data in example code below):

[image: plot of y]

However, I am having some trouble getting started.

y = array([[ 0.50423378,  0.50423378,  0.50423378,  0.50254455,  0.50423378, 0.50254455,  0.50423378,  0.50507627,  0.50507627,  0.50423378,0.50507627,  0.50507627,  0.50423378,  0.50423378,  0.50423378, 0.50423378,  0.50423378,  0.50423378,  0.50254455,  0.50254455, 0.50254455,  0.50423378,  0.50423378,  0.50507627,  0.50507627,0.50507627,  0.50507627,  0.50507627,  0.50423378,  0.50423378, 0.50423378,  0.50507627,  0.50507627,  0.50423378,  0.50507627, 0.50507627,  0.50507627,  0.50423378,  0.50423378,  0.50423378,0.50423378,  0.50423378,  0.50254455,  0.50254455,  0.5, 0.50254455,  0.50254455,  0.50254455,  0.50423378,  0.50423378,0.50423378,  0.50423378,  0.50423378,  0.50254455,  0.50423378, 0.50254455,  0.50254455,  0.50423378,  0.50423378,  0.50254455,0.5       ,  0.5       ,  0.50254455,  0.50254455,  0.5       ,0.49658699,  0.49228746,  0.49228746,  0.48707792,  0.48092881,0.48707792,  0.48092881,  0.48092881,  0.48092881,  0.48092881,0.48092881,  0.48092881,  0.47380354,  0.47380354,  0.48092881,0.48707792,  0.48707792,  0.48092881,  0.48092881,  0.48092881,0.48092881,  0.48092881,  0.48092881,  0.47380354,  0.48092881,0.48092881,  0.48092881,  0.48707792,  0.48707792,  0.48707792,0.49228746,  0.49228746,  0.49228746,  0.49228746,  0.48707792,0.48707792,  0.48707792,  0.49228746,  0.48707792,  0.48707792,0.48707792,  0.48707792,  0.48707792,  0.49228746,  0.49228746,0.48707792,  0.48707792,  0.49228746,  0.49658699,  0.49658699,0.49658699,  0.49228746,  0.49228746,  0.49658699,  0.49228746,0.49658699,  0.5       ,  0.50254455,  0.50423378,  0.50423378,0.50254455,  0.50423378,  0.50423378,  0.50254455,  0.5       ,0.5       ,  0.5       ,  0.5       ,  0.5       ,  0.50254455,0.50254455,  0.5       ,  0.50254455,  0.5       ,  0.5       ,0.5       ,  0.5       ,  0.5       ,  0.5       ,  0.49658699,0.49228746,  0.48707792,  0.48707792,  0.48707792,  0.49228746,0.49228746,  0.48707792,  0.48707792,  0.49228746,  0.48707792,0.48707792,  0.48707792,  
0.48092881,  0.48092881,  0.48707792,0.48707792,  0.48092881,  0.47380354,  0.48092881,  0.48092881,0.48707792,  0.49228746,  0.48707792,  0.49228746,  0.48707792,0.48092881,  0.47380354,  0.46565731,  0.46565731,  0.46565731,0.45643546,  0.45643546,  0.45643546,  0.45643546,  0.45643546,0.45643546,  0.45643546,  0.46565731,  0.45643546,  0.45643546,0.45643546,  0.44607129,  0.45643546,  0.45643546,  0.45643546,0.44607129,  0.44607129,  0.43448304,  0.43448304,  0.43448304,0.44607129,  0.45643546,  0.45643546,  0.45643546,  0.46565731,0.47380354,  0.48092881,  0.48092881, 29.38186886, 29.38186886,29.38186886, 29.37898909, 29.45299206, 29.52449116, 29.74083063,29.73771398, 29.73771398, 29.74083063, 29.74083063, 29.74083063,29.74083063, 29.73771398, 29.74083063, 29.73771398, 29.73771398,29.73771398, 29.73771398, 29.74083063, 29.74083063, 29.74083063,30.12527698, 30.48367189, 30.8169243 , 30.8169243 , 30.8169243 ,30.8169243 , 30.82153203, 30.8169243 , 30.81230208, 30.81230208,30.80766536, 30.81230208, 30.81230208, 30.80766536, 30.80301414,30.80301414, 30.80301414, 30.80301414, 30.80301414, 30.80766536,30.81230208, 30.81230208, 30.81230208, 30.81230208, 30.8169243 ,30.82153203, 30.82612528, 10.51949923, 10.51949923, 10.51436497,10.51436497, 10.22456193,  9.91464422,  9.36922158,  9.37416663,9.36922158,  9.36922158,  9.36922158,  9.37416663,  9.37906375,9.383913  ,  9.383913  ,  9.38871446,  9.383913  ,  9.37906375,9.37416663,  9.36922158,  9.36422851,  9.35918734,  7.72711675,5.53121937,  0.5       ,  0.50254455,  0.50254455,  0.50254455,0.50254455,  0.50254455,  0.5       ,  0.5       ,  0.49658699,0.5       ,  0.5       ,  0.5       ,  0.49658699,  0.49658699,0.5       ,  0.50254455,  0.50423378,  0.50423378,  0.50423378,0.50507627,  0.50507627,  0.50423378,  0.50423378,  0.50423378,0.50423378,  0.50423378,  0.50254455,  0.50254455,  0.5       ,0.5       ,  0.5       ,  0.49658699,  0.5       ,  0.49658699,0.49658699,  0.49658699,  0.49658699,  0.49658699,  
0.49658699,0.49658699,  0.49658699,  0.49228746,  0.48707792,  0.48707792,0.48092881,  0.47380354,  0.47380354,  0.46565731,  0.46565731,0.47380354,  0.46565731,  0.47380354,  0.47380354,  0.47380354, 0.47380354,  0.48092881]])

Starting from this SciPy example, how do I build the x array for my data, plug it in to get my coefficients, and then plot the curve_fit result on top of my distribution?

import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma

def betafunc(x, a, b, cst):
    # Beta pdf with shape parameters a and b, scaled by the constant cst
    return cst * gamma(a + b) * x**(a - 1) * (1 - x)**(b - 1) / (gamma(a) * gamma(b))

x = np.array([0.1, 0.3, 0.5, 0.7, 0.9, 1.1])
y = np.array([0.45112234, 0.56934313, 0.3996803, 0.28982859, 0.19682153, 0.])

# Drop the last point: x = 1.1 lies outside the beta support [0, 1]
popt2, pcov2 = curve_fit(betafunc, x[:-1], y[:-1], p0=(0.5, 1.5, 0.5))

print(popt2)  # fitted a, b, cst
print(pcov2)  # covariance matrix of the estimates
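One possible way to get an x array for data like mine is sketched below. It assumes the samples are evenly spaced, so each index can be mapped to the midpoint of an equal-width bin inside (0, 1), and it assumes the curve sits on a constant baseline, so the model is a scaled beta pdf plus an offset. The names `scaled_beta` and `fit_beta_shape`, the moment-based starting guess, and the synthetic `y_demo` are illustrative choices, not part of the SciPy example.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import beta


def scaled_beta(x, a, b, scale, offset):
    # Beta pdf on (0, 1), stretched vertically and lifted by a baseline
    return scale * beta.pdf(x, a, b) + offset


def fit_beta_shape(y):
    y = np.ravel(y)  # a (1, n) array like the one above becomes 1-D
    n = y.size
    # Bin midpoints, so every x lies strictly inside (0, 1)
    x = (np.arange(n) + 0.5) / n

    # Starting guess from weighted moments (assumes y is not constant)
    w = y - y.min()
    m = np.sum(x * w) / np.sum(w)
    v = np.sum((x - m) ** 2 * w) / np.sum(w)
    common = m * (1 - m) / v - 1
    p0 = (max(m * common, 1.0), max((1 - m) * common, 1.0), w.mean(), y.min())

    # Keep a, b >= 1 so the pdf stays finite at the ends of the interval
    bounds = ([1.0, 1.0, 0.0, -np.inf], [np.inf, np.inf, np.inf, np.inf])
    popt, pcov = curve_fit(scaled_beta, x, y, p0=p0, bounds=bounds)
    return x, popt, pcov


# Synthetic stand-in for y: a known beta-shaped hump on a 0.5 baseline
x_true = (np.arange(200) + 0.5) / 200
y_demo = scaled_beta(x_true, 8.0, 3.0, 5.0, 0.5)

x, popt, pcov = fit_beta_shape(y_demo)
print(popt)  # fitted a, b, scale, offset
```

To overlay the fit, plotting `y` against `x` and then `scaled_beta(x, *popt)` against the same `x` with matplotlib should be enough. Whether a beta shape actually describes a given series is a separate question; a flat-topped hump may fit poorly even when the optimizer converges.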

