This is starting to bug me: in Plotly Express, when using animation_frame, I know it's important to set axis ranges so the data is displayed consistently; otherwise data may vanish across frames. But for a column with categorical values (say 'US', 'Russia', 'Germany') that I want to map to color (in the code below, that column would be 'AnotherColumn'), I cannot find any way to avoid disappearing data when not every frame contains all categories. The Plotly documentation points out:
Animations are designed to work well when each row of input is present across all animation frames, and when categorical values mapped to symbol, color and facet are constant across frames. Animations may be misleading or inconsistent if these constraints are not met.
But while I can easily set range_color when I have a continuous color range, nothing of the sort seems to work for categorical data. I can somewhat work around this by making my data numerical (e.g. 'US' -> 1, 'Russia' -> 2), but that is fiddly and the result is visually unappealing.
import plotly.express as px
...
fig = px.bar(data, x="NameColumn",
             y="SomeColumn",
             color="AnotherColumn",
             animation_frame="AnimationColumn",
             range_y=[0, max_y]
             )
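For reference, the numeric workaround I mentioned might look roughly like the sketch below. The mapping values, the code 3 for 'Germany', the tiny DataFrame and the name AnotherColumnCode are only placeholders for illustration, not my real data.

```python
import pandas as pd

# Placeholder data standing in for the real DataFrame.
data = pd.DataFrame({
    "NameColumn": ["a", "b", "c"],
    "AnotherColumn": ["US", "Russia", "Germany"],
})

# Hand-written category -> number mapping (this is the fiddly part).
codes = {"US": 1, "Russia": 2, "Germany": 3}
data["AnotherColumnCode"] = data["AnotherColumn"].map(codes)

# Fixed color range covering every code, to be passed as range_color.
range_color = [min(codes.values()), max(codes.values())]
```

Passing color="AnotherColumnCode" together with range_color=range_color to px.bar then gives a continuous color bar instead of a categorical legend, which is why the result looks unappealing.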
Here is a simple reproducible example:
import pandas as pd
import plotly.express as px
data_dict = {'ColorColumn': ['p', 'p', 'p', 'q'],
             'xColumn': ['someName', 'someOtherName', 'someName', 'someOtherName'],
             'yColumn': [10, 20, 30, 40],
             'animationColumn': [1, 1, 2, 2]}
data = pd.DataFrame(data=data_dict)
fig = px.bar(data, x="xColumn",
             y="yColumn",
             color="ColorColumn",
             animation_frame="animationColumn",
             range_y=[0, 40]
             )
fig.update_layout(xaxis={'title': '',
                         'visible': True,
                         'showticklabels': True})
fig.show()
If you try it out, you'll notice the second frame is missing a bar. If the ColorColumn had numeric data, you could fix this by specifying range_color (similar to the specification of range_y in the code above); my question would be, how to handle this with categorical data?
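One direction that might help, going by the documentation note about every row being present in all frames, is to pad the data so that each frame contains every (x, color) combination, using a zero value where a combination is missing. The sketch below is only an assumption about what could work, not a confirmed fix; pad_frames is a hypothetical helper name, and how="cross" needs pandas 1.2 or newer.

```python
import pandas as pd


def pad_frames(df, frame_col, key_cols, value_col, fill_value=0):
    """Return df with one row for every (frame, key) combination.

    Combinations missing from df get fill_value in value_col.
    """
    frames = df[[frame_col]].drop_duplicates()
    keys = df[key_cols].drop_duplicates()
    # Cartesian product of all frames and all observed key combinations.
    grid = frames.merge(keys, how="cross")  # requires pandas >= 1.2
    # Attach the real values; combinations absent from df become NaN.
    padded = grid.merge(df, on=[frame_col, *key_cols], how="left")
    padded[value_col] = padded[value_col].fillna(fill_value)
    return padded


data = pd.DataFrame({
    "ColorColumn": ["p", "p", "p", "q"],
    "xColumn": ["someName", "someOtherName", "someName", "someOtherName"],
    "yColumn": [10, 20, 30, 40],
    "animationColumn": [1, 1, 2, 2],
})

padded = pad_frames(data, "animationColumn", ["xColumn", "ColorColumn"], "yColumn")
```

The padded frame can be passed to px.bar in place of data with the same arguments. Every frame then carries both 'p' and 'q', at the cost of zero-height bars for the combinations that were filled in.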
Second edit: Some commenters requested additional data or a more reasonable example. This might be more appropriate:
import pandas as pd
import plotly.express as px
data_dict = {'Region': ['North America', 'Asia', 'Asia',
                        'North America', 'Asia', 'Europe',
                        'North America', 'Europe', 'Asia'],
             'Country': ['US', 'China', 'Korea',
                         'US', 'Philippines', 'France',
                         'Canada', 'Germany', 'Thailand'],
             'GDP': [10, 20, 30,
                     40, 50, 60,
                     70, 80, 90],
             'Year': [2017, 2017, 2017,
                      2018, 2018, 2018,
                      2019, 2019, 2019]}
data = pd.DataFrame(data=data_dict)
fig = px.bar(data, x="Country",
             y="GDP",
             color="Region",
             animation_frame="Year",
             range_y=[0, 90]  # cover the largest GDP value (90)
             )
fig.update_layout(xaxis={'title': '',
                         'visible': True,
                         'showticklabels': True})
fig.show()
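Something else I could try for this example is pinning the category order and the colors up front, using the category_orders and color_discrete_map arguments of px.bar. The sketch below only builds those two dictionaries from the data; the palette is an arbitrary choice for illustration, and I am not certain that this alone stops bars from vanishing when a frame lacks a Region.

```python
from itertools import cycle

import pandas as pd

data = pd.DataFrame({
    "Region": ["North America", "Asia", "Asia",
               "North America", "Asia", "Europe",
               "North America", "Europe", "Asia"],
    "Country": ["US", "China", "Korea",
                "US", "Philippines", "France",
                "Canada", "Germany", "Thailand"],
    "Year": [2017, 2017, 2017,
             2018, 2018, 2018,
             2019, 2019, 2019],
})

# Fixed orderings: regions alphabetically, countries by first appearance.
regions = sorted(data["Region"].unique())
countries = data["Country"].drop_duplicates().tolist()
category_orders = {"Region": regions, "Country": countries}

# Arbitrary illustrative palette; cycle() reuses colors if regions outnumber them.
palette = ["#636EFA", "#EF553B", "#00CC96"]
color_discrete_map = dict(zip(regions, cycle(palette)))
```

Both dictionaries would then be passed to px.bar as category_orders=category_orders and color_discrete_map=color_discrete_map, so that each Region keeps the same color and legend position in every Year.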
from Making data consistent across animation frames (i.e. avoiding vanishing data) in plotly express