Friday, 8 September 2023

How to Plot GeoJSON Geometry Data in Streamlit Using Plotly Express Choropleth Mapbox?

I'm trying to plot GeoJSON data of Indonesian cities and provinces using Streamlit and Plotly Express. My data comes from a DuckDB database, and I merge it with some additional data before plotting. I convert the merged data to GeoJSON format for plotting, but the map shows up empty.

What I've Tried:

  1. Checked that the GeoJSON and DataFrame indices match.
  2. Used featureidkey in px.choropleth_mapbox.
  3. Verified that the data types for the index and GeoJSON id match.
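The three checks above can be made concrete with a small helper. This is a minimal sketch, assuming featureidkey is a dotted path that is resolved against each GeoJSON feature (as in "properties.id"). The names resolve_feature_key and compare_ids are hypothetical, and the two sample features are abridged from the Sample GeoJSON output shown further down. The comparison is deliberately strict, so the string '0' and the integer 0 count as a mismatch; whether Plotly itself coerces types is not something this sketch assumes.

```python
def resolve_feature_key(feature, featureidkey):
    """Follow a dotted path such as 'properties.id' inside one feature."""
    value = feature
    for part in featureidkey.split("."):
        if not isinstance(value, dict) or part not in value:
            return None  # the path does not exist in this feature
        value = value[part]
    return value


def compare_ids(geojson, locations, featureidkey="id"):
    """Report which locations have no feature with an identical id."""
    feature_ids = {
        resolve_feature_key(f, featureidkey) for f in geojson["features"]
    }
    feature_ids.discard(None)
    # Strict comparison on (type name, value) so '0' and 0 do not match.
    typed_ids = {(type(v).__name__, v) for v in feature_ids}
    unmatched = [
        loc for loc in locations
        if (type(loc).__name__, loc) not in typed_ids
    ]
    return {"feature_ids": feature_ids, "unmatched": unmatched}


# Two features abridged from the Sample GeoJSON output in this question.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"id": "0", "type": "Feature",
         "properties": {"Name": "SIMEULUE"},
         "geometry": {"type": "Point", "coordinates": [96.0856, 2.6133]}},
        {"id": "1", "type": "Feature",
         "properties": {"Name": "ACEH SINGKIL"},
         "geometry": {"type": "Point", "coordinates": [97.8471, 2.3499]}},
    ],
}

by_properties_id = compare_ids(sample, [0, 1], "properties.id")
by_top_level_id = compare_ids(sample, ["0", "1"], "id")
by_int_index = compare_ids(sample, [0, 1], "id")
```

With these features, "properties.id" resolves to nothing, because the sample shows id at the top level of each feature rather than inside properties. The top-level ids are strings, so only string locations match them under a strict comparison.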

Here's the snippet related to plotting:

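import json

import duckdb
import geopandas as gpd
import pandas as pd
import plotly.express as px
import streamlit as st
from shapely import wkb

class InteractiveMap: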
    def __call__(self):
        return self.interactive_map()

    def interactive_map(self):
        st.title('Interactive Map of Indonesia')

        choice = st.selectbox("Choose between Cities and Provinces", ["Cities", "Provinces"])

        conn = duckdb.connect('oeroenremboog.db')
        cursor = conn.cursor()

        if choice == "Cities":
            cursor.execute("SELECT * FROM indonesia_cities;")
        else:
            cursor.execute("SELECT * FROM indonesia_provinces;")

        columns = [desc[0] for desc in cursor.description]
        indonesia_map = pd.DataFrame(cursor.fetchall(), columns=columns)

        if 'geometry' in indonesia_map.columns:
            indonesia_map['geometry'] = indonesia_map['geometry'].apply(wkb.loads, hex=True)

        indonesia_map = gpd.GeoDataFrame(indonesia_map, geometry='geometry')

        # Load the JSON data
        with open('src/data/diff_percentage_dm1_dm2.json') as f:
            diff_data = json.load(f)
        diff_df = pd.DataFrame(diff_data)

        name_column = 'Name' if choice == "Cities" else 'Propinsi'

        if name_column in indonesia_map.columns and 'kabupaten_tinggal' in diff_df.columns:
            merged_data = pd.merge(indonesia_map, diff_df, left_on=name_column, right_on='kabupaten_tinggal', how='left')
        else:
            st.error(f"Missing column {name_column} in either of the DataFrames.")
            return

        # Convert the 'diff_percentage' column to numeric
        merged_data['diff_percentage'] = pd.to_numeric(merged_data['diff_percentage'], errors='coerce')
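        # Assign the result back instead of calling fillna(inplace=True) on a column selection
        merged_data['diff_percentage'] = merged_data['diff_percentage'].fillna(0)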

        indonesia_map.crs = "EPSG:4326"


        # Convert GeoDataFrame to GeoJSON
        geojson_data = json.loads(merged_data.to_json())

        # Debugging
        st.write(f"Sample GeoJSON: {str(geojson_data)[:500]}")
        st.write(f"Sample merged_data: {str(merged_data)[:500]}")
        
        # Plotting
        fig = px.choropleth_mapbox(merged_data, 
                                   geojson=geojson_data, 
                                   locations=merged_data.index,  # DataFrame index
                                   color='diff_percentage',
                                   color_continuous_scale="Viridis",
                                   range_color=(-100, 100),
                                   mapbox_style="carto-positron",
                                   opacity=0.5, 
                                   labels={'diff_percentage':'Difference Percentage'},
                                   center={"lat": -2, "lon": 118},
                                   zoom=3.4,
                                   featureidkey="properties.id")

        st.plotly_chart(fig)

        cursor.close()

        return merged_data

Debugging Info:

  • Sample GeoJSON: (a snippet of the GeoJSON data)
  • Unique Locations: (unique locations from the DataFrame)
  • Unique diff_percentage: (unique values in the diff_percentage column)

Here's a sample of the first 10 rows from my indonesia_map DataFrame converted to dictionary format:

Sample indonesia_map (first 10 rows as dict): {'Name': {0: 'SIMEULUE', 1: 'ACEH SINGKIL', 2: 'ACEH SELATAN', 3: 'ACEH TENGGARA', 4: 'ACEH TIMUR', 5: 'ACEH TENGAH', 6: 'ACEH BARAT', 7: 'ACEH BESAR', 8: 'PIDIE', 9: 'BIREUEN'}, 'latitude': {0: 2.613334894180298, 1: 2.349949598312378, 2: 3.1632587909698486, 3: 3.369655132293701, 4: 4.628895282745361, 5: 4.530141830444336, 6: 4.456692218780518, 7: 5.3799920082092285, 8: 5.068343639373779, 9: 5.093278884887695}, 'longitude': {0: 96.08564758300781, 1: 97.84710693359375, 2: 97.43519592285156, 3: 97.69552612304688, 4: 97.62864685058594, 5: 96.85894012451172, 6: 96.18546295166016, 7: 95.51558685302734, 8: 96.00715637207031, 9: 96.60938262939453}, 'geometry': {0: <POINT (96.086 2.613)>, 1: <POINT (97.847 2.35)>, 2: <POINT (97.435 3.163)>, 3: <POINT (97.696 3.37)>, 4: <POINT (97.629 4.629)>, 5: <POINT (96.859 4.53)>, 6: <POINT (96.185 4.457)>, 7: <POINT (95.516 5.38)>, 8: <POINT (96.007 5.068)>, 9: <POINT (96.609 5.093)>}}

Sample merged_data (first 10 rows as dict): {'Name': {0: 'SIMEULUE', 1: 'ACEH SINGKIL', 2: 'ACEH SELATAN', 3: 'ACEH TENGGARA', 4: 'ACEH TIMUR', 5: 'ACEH TENGAH', 6: 'ACEH BARAT', 7: 'ACEH BESAR', 8: 'PIDIE', 9: 'BIREUEN'}, 'latitude': {0: 2.613334894180298, 1: 2.349949598312378, 2: 3.1632587909698486, 3: 3.369655132293701, 4: 4.628895282745361, 5: 4.530141830444336, 6: 4.456692218780518, 7: 5.3799920082092285, 8: 5.068343639373779, 9: 5.093278884887695}, 'longitude': {0: 96.08564758300781, 1: 97.84710693359375, 2: 97.43519592285156, 3: 97.69552612304688, 4: 97.62864685058594, 5: 96.85894012451172, 6: 96.18546295166016, 7: 95.51558685302734, 8: 96.00715637207031, 9: 96.60938262939453}, 'geometry': {0: <POINT (96.086 2.613)>, 1: <POINT (97.847 2.35)>, 2: <POINT (97.435 3.163)>, 3: <POINT (97.696 3.37)>, 4: <POINT (97.629 4.629)>, 5: <POINT (96.859 4.53)>, 6: <POINT (96.185 4.457)>, 7: <POINT (95.516 5.38)>, 8: <POINT (96.007 5.068)>, 9: <POINT (96.609 5.093)>}, 'kabupaten_tinggal': {0: 'SIMEULUE', 1: 'ACEH SINGKIL', 2: 'ACEH SELATAN', 3: 'ACEH TENGGARA', 4: 'ACEH TIMUR', 5: 'ACEH TENGAH', 6: 'ACEH BARAT', 7: 'ACEH BESAR', 8: 'PIDIE', 9: 'BIREUEN'}, 'total_dm_tipe_i': {0: '236', 1: '165', 2: '464', 3: '194', 4: '117', 5: '231', 6: '167', 7: '1216', 8: '221', 9: '200'}, 'total_dm_tipe_ii': {0: '180', 1: '321', 2: '1076', 3: '256', 4: '447', 5: '343', 6: '916', 7: '1788', 8: '731', 9: '706'}, 'diff_percentage': {0: 0.1346153846153846, 1: -0.3209876543209876, 2: -0.3974025974025974, 3: -0.1377777777777777, 4: -0.5851063829787234, 5: -0.1951219512195122, 6: -0.6915974145891042, 7: -0.1904127829560586, 8: -0.5357142857142857, 9: -0.5584988962472405}}
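The diff_percentage values in this sample are fractions between roughly -0.69 and 0.13, while the plotting code passes range_color=(-100, 100). Below is a minimal sketch, using only the ten sample values above, of deriving a symmetric color range from the data instead. The helper symmetric_range is hypothetical, and whether the column is meant to hold fractions or percentages is an assumption the data shown here does not settle.

```python
def symmetric_range(values):
    """Return (-m, m), where m is the largest absolute value."""
    m = max(abs(v) for v in values)
    return (-m, m)


# The ten diff_percentage values from the merged_data sample above.
diff_percentage = [
    0.1346153846153846, -0.3209876543209876, -0.3974025974025974,
    -0.1377777777777777, -0.5851063829787234, -0.1951219512195122,
    -0.6915974145891042, -0.1904127829560586, -0.5357142857142857,
    -0.5584988962472405,
]

low, high = symmetric_range(diff_percentage)

# Share of the (-100, 100) color scale that the sample values span.
span_used = (max(diff_percentage) - min(diff_percentage)) / 200
```

Passing range_color=(low, high) would be one option. On a (-100, 100) scale these sample values cover well under 1% of the color range, so any features that do render would appear in nearly the same color.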

Sample GeoJSON (first 500 characters): {'type': 'FeatureCollection', 'features': [{'id': '0', 'type': 'Feature', 'properties': {'Name': 'SIMEULUE', 'latitude': 2.613334894180298, 'longitude': 96.08564758300781, 'kabupaten_tinggal': 'SIMEULUE', 'total_dm_tipe_i': '236', 'total_dm_tipe_ii': '180', 'diff_percentage': 0.1346153846153846}, 'geometry': {'type': 'Point', 'coordinates': [96.08564793225175, 2.6133349583186596]}}, {'id': '1', 'type': 'Feature', 'properties': {'Name': 'ACEH SINGKIL', 'latitude': 2.349949598312378, 'longitude':
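A quick way to see what kind of geometry the converted GeoJSON holds is to count the geometry "type" values. Below is a minimal sketch with two features abridged from the sample above; geometry_type_counts is a hypothetical helper. To my understanding, choropleth traces fill Polygon and MultiPolygon features, so a collection made only of Point features would leave nothing to shade. Treat that as an assumption to check against the Plotly documentation.

```python
from collections import Counter


def geometry_type_counts(geojson):
    """Count features per geometry type in a FeatureCollection."""
    return Counter(
        (feature.get("geometry") or {}).get("type", "missing")
        for feature in geojson["features"]
    )


# Two features abridged from the Sample GeoJSON output above.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"id": "0", "type": "Feature",
         "properties": {"Name": "SIMEULUE"},
         "geometry": {"type": "Point", "coordinates": [96.0856, 2.6133]}},
        {"id": "1", "type": "Feature",
         "properties": {"Name": "ACEH SINGKIL"},
         "geometry": {"type": "Point", "coordinates": [97.8471, 2.3499]}},
    ],
}

counts = geometry_type_counts(sample)

# Number of features a filled-area trace could shade, under the assumption above.
fillable = counts["Polygon"] + counts["MultiPolygon"]
```

Running the same count on the full geojson_data would show whether any polygon geometry is present at all.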

Error: The map shows up, but it's empty; no data is plotted.

How can I correctly plot GeoJSON geometry data using Streamlit and Plotly Express?


