Saturday, 24 April 2021

Scipy or bayesian optimize function with constraints, bounds and dataframe in python

With the dataframe underneath I want to optimize the total return, while certain bounds are satisfied.

d = {'Win':[0,0,1, 0, 0, 1, 0],'Men':[0,1,0, 1, 1, 0, 0], 'Women':[1,0,1, 0, 0, 1,1],'Matches' :[0,5,4, 7, 4, 10,13],
     'Odds':[1.58,3.8,1.95, 1.95, 1.62, 1.8, 2.1], 'investment':[0,0,6, 10, 5, 25,0],}

data = pd.DataFrame(d)

I want to maximize the following equation:

totalreturn = np.sum(data['Odds'] * data['investment'] * (data['Win'] == 1))

The function should be maximized satisfying the following bounds:

for i in range(len(data)):
    
    investment = data['investment'][i]
    
    C = alpha0 + alpha1*data['Men'] + alpha2 * data['Women'] + alpha3 * data['Matches']
    
    if (lb < investment ) & (investment < ub) & (investment > C) == False:
        data['investment'][i] = 0

Hereby lb and ub are constant for every row in the dataframe. Threshold C however, is different for every row. Thus there are 6 parameters to be optimized: lb, ub, alph0, alpha1, alpha2, alpha3.

Can anyone tell me how to do this in python? My proceedings so far have been with scipy (Approach1) and Bayesian (Approach2) optimization and only lb and ub are tried to be optimized. Approach1:

import pandas as pd
from scipy.optimize import minimize

def objective(val, data):
    
    # Approach 1
    # Lowerbound and upperbound
    lb, ub = val
    
    # investments
    # These matches/bets are selected to put wager on
    tf1 = (data['investment'] > lb) & (data['investment'] < ub) 
    data.loc[~tf1, 'investment'] = 0
    
        
    # Total investment
    totalinvestment = sum(data['investment'])
    
    # Good placed bets 
    data['reward'] = data['Odds'] * data['investment'] * (data['Win'] == 1)
    totalreward = sum(data['reward'])

    # Return and cumalative return
    data['return'] = data['reward'] - data['investment']
    totalreturn = sum(data['return'])
    data['Cum return'] = data['return'].cumsum()
    
    # Return on investment
    print('\n',)
    print('lb, ub:', lb, ub)
    print('TotalReturn: ',totalreturn)
    print('TotalInvestment: ', totalinvestment)
    print('TotalReward: ', totalreward)
    print('# of bets', (data['investment'] != 0).sum())
          
    return totalreturn
          

# Bounds and contraints
b = (0,100)
bnds = (b,b,)
x0 = [0,100]

sol = minimize(objective, x0, args = (data,), method = 'Nelder-Mead', bounds = bnds)

and approach2:

import pandas as pd
import time
import pickle
from hyperopt import fmin, tpe, Trials
from hyperopt import STATUS_OK
from hyperopt import  hp

def objective(args):
    # Approach2

    # Lowerbound and upperbound
    lb, ub = args
    
    # investments
    # These matches/bets are selected to put wager on
    tf1 = (data['investment'] > lb) & (data['investment'] < ub) 
    data.loc[~tf1, 'investment'] = 0
    
        
    # Total investment
    totalinvestment = sum(data['investment'])
    
    # Good placed bets 
    data['reward'] = data['Odds'] * data['investment'] * (data['Win'] == 1)
    totalreward = sum(data['reward'])

    # Return and cumalative return
    data['return'] = data['reward'] - data['investment']
    totalreturn = sum(data['return'])
    data['Cum return'] = data['return'].cumsum()
    
    # store results
    d = {'loss': - totalreturn, 'status': STATUS_OK, 'eval time': time.time(),
    'other stuff': {'type': None, 'value': [0, 1, 2]},
    'attachments': {'time_module': pickle.dumps(time.time)}}
    
    return d

          

trials = Trials()

parameter_space  = [hp.uniform('lb', 0, 100), hp.uniform('ub', 0, 100)]

best = fmin(objective,
    space= parameter_space,
    algo=tpe.suggest,
    max_evals=500,
    trials = trials)


print('\n', trials.best_trial)

Anyone knows how I should proceed? Scipy doesn't generate the desired result. Hyperopt optimization does result in the desired result. In either approach I don't know how to incorporate a boundary that is row depended (C(i)).

Anything would help! (Any relative articles, exercises or helpful explanations about the sort of optimization are also more than welcome)



from Scipy or bayesian optimize function with constraints, bounds and dataframe in python

No comments:

Post a Comment