Other questions attempting to provide the Python equivalent of R's sweep function (like here) do not really address the case of multiple arguments, where it is most useful.
Say I wish to apply a two-argument function to each row of a DataFrame, paired with the matching element from a column of another DataFrame:
df = data.frame("A" = 1:3,"B" = 11:13)
df2= data.frame("X" = 10:12,"Y" = 10000:10002)
sweep(df,1, FUN="*",df2$X)
In Python I got the equivalent using apply on what is basically a loop through the row counts.
import numpy as np
import pandas as pd
df = pd.DataFrame({"A": range(1, 4), "B": range(11, 14)})
df2 = pd.DataFrame({"X": range(10, 13), "Y": range(10000, 10003)})
pd.Series(range(df.shape[0])).apply(lambda row_count: np.multiply(df.iloc[row_count, :], df2.iloc[row_count, df2.columns.get_loc('X')]))
I highly doubt this is efficient in pandas. What is a better way of doing this?
Both bits of code should result in a DataFrame/matrix of 6 numbers when applying *:
   A   B
1 10 110
2 22 132
3 36 156
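For this plain multiplication case, one possible vectorized sketch uses `DataFrame.mul` with `axis=0`, which matches the Series against the row index instead of looping over row counts. It assumes both frames share the same default integer index, so the rows are labelled 0 to 2 rather than 1 to 3 as in R.

```python
import pandas as pd

df = pd.DataFrame({"A": range(1, 4), "B": range(11, 14)})
df2 = pd.DataFrame({"X": range(10, 13), "Y": range(10000, 10003)})

# Multiply every column of df by df2["X"], aligning on the row index
# (axis=0) rather than on the column labels.
result = df.mul(df2["X"], axis=0)
print(result)
```

The same pattern is available for the other arithmetic operators (`add`, `sub`, `div`), but it does not by itself cover an arbitrary user-supplied function.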
I should state clearly that the aim is to insert one's own function into this sweep-like behavior, say:
df = data.frame("A" = 1:3,"B" = 11:13)
df2= data.frame("X" = 10:12,"Y" = 10000:10002)
myFunc = function(a,b) { floor((a + b)^min(a/2,b/3)) }
sweep(df,1, FUN=myFunc,df2$X)
resulting in:
     A B
[1,] 3 4
[2,] 3 4
[3,] 3 5
What is a good way of doing that in Python with pandas?
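One possible direction, sketched here with NumPy broadcasting, is to hand the function two equally shaped arrays, which is roughly what R's sweep does. The helper name `sweep_rows` is hypothetical and not part of pandas. Note that R's `min` reduces over all of its arguments, so the exponent in `myFunc` is a single global minimum (0.5 for this data); the Python version below mirrors that to reproduce the output shown above. An element-wise variant would use `np.minimum(a / 2, b / 3)` instead and would give different numbers.

```python
import numpy as np
import pandas as pd


def sweep_rows(frame, stats, func):
    """Hypothetical helper: call func(values, stats) with stats swept down the rows."""
    values = frame.to_numpy(dtype=float)
    # Reshape stats into a column vector so that row i of `values`
    # is paired with stats[i], then expand it to the full shape.
    column = np.asarray(stats, dtype=float).reshape(-1, 1)
    swept = np.broadcast_to(column, values.shape)
    return pd.DataFrame(func(values, swept), index=frame.index, columns=frame.columns)


def my_func(a, b):
    # Global minimum over both arrays, matching how R's min() behaves
    # when it receives whole arrays.
    exponent = min(np.min(a / 2), np.min(b / 3))
    return np.floor((a + b) ** exponent)


df = pd.DataFrame({"A": range(1, 4), "B": range(11, 14)})
df2 = pd.DataFrame({"X": range(10, 13), "Y": range(10000, 10003)})

result = sweep_rows(df, df2["X"], my_func)
print(result)
```

Because the function receives whole arrays rather than one row at a time, it needs to be written with array operations; a function that only works on scalars would have to be wrapped, for example with `np.vectorize`, at some cost in speed.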
from Efficient python pandas equivalent/implementation of R sweep with multiple arguments