Thursday 19 July 2018

pandas read_html clean up before or after read

I'm trying to read the last table on this HTML page into a pandas DataFrame.

Here is the code:

import pandas as pd

# read_html returns a list with one DataFrame per <table> found on the page
url = 'https://www.sec.gov/Archives/edgar/data/1303652/000130365218000016/a991-01q12018.htm'
tables = pd.read_html(url)
print(tables[23])

As you can see, the table is read in, but the result needs to be cleaned up. My question is for anyone with experience using read_html: is it better to clean up the HTML before reading it, or to read it in first and clean up the resulting DataFrame afterwards? If anybody knows how to do it, please post some code. Thanks.
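
To make the "afterwards" option concrete, here is a minimal sketch of what that clean-up could look like. It does not fetch the SEC page. The small `raw` frame is an invented stand-in for the kind of output read_html tends to produce for financial tables (empty spacer rows and columns, a "$" or ")" sitting in its own cell, negatives written with parentheses), and the real table may be laid out differently. The helper names `blank_to_nan`, `to_number` and `clean_table` are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd


def blank_to_nan(cell):
    """Turn cells that hold only layout fragments into NaN."""
    if isinstance(cell, str) and cell.strip() in {"", "$", ")", "%"}:
        return np.nan
    return cell


def to_number(cell):
    """Convert strings such as '1,234' or '(56' to floats; leave labels alone."""
    if not isinstance(cell, str):
        return cell
    text = cell.replace(",", "").replace("$", "").strip()
    negative = text.startswith("(")  # accounting style: (56) means -56
    text = text.strip("()")
    try:
        value = float(text)
    except ValueError:
        return cell  # not numeric, e.g. a row label
    return -value if negative else value


def clean_table(raw):
    """Drop empty rows/columns and convert numeric strings (assumed layout)."""
    df = raw.apply(lambda col: col.map(blank_to_nan))
    df = df.dropna(axis=0, how="all").dropna(axis=1, how="all")
    df = df.apply(lambda col: col.map(to_number))
    df = df.reset_index(drop=True)
    df.columns = range(df.shape[1])  # renumber the surviving columns
    return df


# Invented sample data standing in for a messy read_html result.
raw = pd.DataFrame([
    [np.nan, np.nan, np.nan, np.nan, np.nan],
    ["Revenue", "$", "1,234", np.nan, np.nan],
    ["Net loss", "$", "(56", ")", np.nan],
])

cleaned = clean_table(raw)
print(cleaned)
```

Cleaning after the read keeps everything in pandas, so each step can be inspected on the DataFrame. Cleaning before the read would mean editing the HTML itself, which is probably only worth it if the parser cannot find or split the table correctly.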


