Tuesday, 5 June 2018

Pandas: Group by a column that meets a condition

I have a data set with three columns: rating, breed, and dog.

import pandas as pd
dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}

df = pd.DataFrame(data=dogs)

I would like to calculate the mean rating per breed, using only the rows where dog is True. This is the expected output:

       breed  rating
0  Chihuahua     8.5
1  Dalmatian    10.0

This has been my attempt:

df.groupby('breed')['rating'].mean().where(dog == True)

And this is the error that I get:

NameError: name 'dog' is not defined

Whenever I try to add the where condition, I only get errors. Can anyone suggest a solution? TIA
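For reference, here is a minimal sketch of one way this might be done. It assumes the same data as above. The NameError happens because `dog` on its own is a plain Python name, not a reference to the column, which has to be written as `df['dog']`. Also, once the ratings have been grouped and averaged, the dog column is no longer part of the result, so the filter probably needs to come before the groupby. The name `result` is just an illustrative choice.

```python
import pandas as pd

dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}

df = pd.DataFrame(data=dogs)

# Keep only the rows where dog is True (the column is already boolean,
# so it can be used directly as a mask), then average rating per breed.
# as_index=False keeps breed as a regular column instead of the index.
result = (
    df[df['dog']]
    .groupby('breed', as_index=False)['rating']
    .mean()
)

print(result)
```

Filtering first also means breeds with no matching rows (Sphynx here) do not appear in the result at all, which matches the expected output above.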


