Saturday, 8 October 2022

NotImplementedError when calling pandas_profiling.ProfileReport.to_widgets() inside Apache Zeppelin

I'm trying to use the pandas_profiling package to automagically describe some data frames from inside Apache Zeppelin.

The code I'm running is:

%pyspark

import sys
print(sys.version_info)

import numpy as np
print("numpy: ", np.__version__)
import pandas as pd
print("pandas: ", pd.__version__)
import pandas_profiling as pp
print("pandas_profiling: ", pp.__version__)

from pandas_profiling import ProfileReport

df = spark.sql("SELECT * FROM database.table")

profile = ProfileReport(df, title="Report: table")

profile.to_widgets()

My result is:

sys.version_info(major=3, minor=6, micro=8, releaselevel='final', serial=0)
numpy:  1.19.5
pandas:  1.1.5
pandas_profiling:  3.1.0


Fail to execute line 19: profile.to_widgets()
Traceback (most recent call last):
  File "/tmp/1662648724242-0/zeppelin_python.py", line 158, in <module>
    exec(code, _zcUserQueryNameSpace)
  File "<stdin>", line 19, in <module>
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 414, in to_widgets
    display(self.widgets)
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 197, in widgets
    self._widgets = self._render_widgets()
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 315, in _render_widgets
    report = self.report
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 179, in report
    self._report = get_report_structure(self.config, self.description_set)
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 166, in description_set
    self._sample,
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/model/describe.py", line 56, in describe
    check_dataframe(df)
  File "/usr/local/lib/python3.6/site-packages/multimethod/__init__.py", line 209, in __call__
    return self[tuple(map(self.get_type, args))](*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pandas_profiling/model/dataframe.py", line 10, in check_dataframe
    raise NotImplementedError()
NotImplementedError

Is there any way to work around this, ideally from inside Zeppelin itself?
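
One thing that may be worth trying, going only by the traceback: `df` comes from `spark.sql(...)`, so it is a Spark DataFrame, and `check_dataframe` appears to have no implementation registered for that type in pandas_profiling 3.1.0. The sketch below converts to pandas first and renders HTML instead of widgets. The helper names `to_pandas_if_spark` and `profile_as_html`, the `max_rows` cap, and the `%html` print trick for Zeppelin output are assumptions for illustration, not something confirmed on this setup.

```python
def to_pandas_if_spark(df, max_rows=10000):
    """Return a pandas DataFrame if df looks like a Spark DataFrame.

    Assumption: the NotImplementedError comes from passing a Spark
    DataFrame where pandas_profiling expects a pandas one.
    """
    # Spark DataFrames expose limit() and toPandas(); pandas ones do not.
    if hasattr(df, "toPandas"):
        # Cap the row count so the driver does not pull the whole table.
        return df.limit(max_rows).toPandas()
    return df


def profile_as_html(df, title):
    """Build the report on a pandas copy and return it as an HTML string."""
    # Imported lazily so the helper above works without pandas_profiling.
    from pandas_profiling import ProfileReport

    pdf = to_pandas_if_spark(df)
    return ProfileReport(pdf, title=title).to_html()


# Assumed usage inside a Zeppelin %pyspark paragraph:
# html = profile_as_html(spark.sql("SELECT * FROM database.table"), "Report: table")
# print("%html " + html)
```

Note that `limit()` takes the first rows rather than a random sample, so the profile may not be representative of the whole table. `to_widgets()` relies on Jupyter widgets, which Zeppelin probably will not render, hence the HTML route.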


