Thursday, 12 July 2018

Extract Data Dump From Freebase in Python

Given the Freebase Triples data dump (freebase-rdf-latest.gz) downloaded from the website, what would be the optimal way to open and read this file in Python in order to extract information, for example information about companies and businesses?

As far as I can tell, there are packages for each step (opening a gz file in Python and reading an RDF file), but I'm not sure how to put them together.

My failed attempt in Python 3.6:

import gzip

with gzip.open('freebase-rdf-latest.gz','r') as uncompressed_file:
    for line in uncompressed_file.read():
        print(line)

After that, I expected to parse the XML structure to get the information, but I cannot even read the file.
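
One likely reason the attempt above fails: .read() tries to load the entire decompressed dump into memory, and in Python 3 iterating over the resulting bytes object yields single integers rather than lines. A minimal sketch of a streaming alternative follows. It assumes the dump is a line-oriented, tab-separated triples file (subject, predicate, object, then a trailing "."), which is how the Freebase RDF dump is usually described, rather than XML. The function names iter_triples and subjects_with_type and the "business." marker are hypothetical, chosen only for illustration.

```python
import gzip


def iter_triples(path):
    """Yield (subject, predicate, object) tuples from a gzipped triples file.

    Assumes one triple per line with tab-separated fields.
    """
    # "rt" decodes bytes to text; iterating the file object streams
    # one line at a time instead of loading the whole dump into memory.
    with gzip.open(path, "rt", encoding="utf-8") as triples_file:
        for line in triples_file:
            parts = line.rstrip("\n").split("\t")
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            yield parts[0], parts[1], parts[2]


def subjects_with_type(path, type_marker):
    """Yield subjects whose type.object.type value contains type_marker.

    type_marker (for example "business.") is an assumed filter, not a
    confirmed Freebase convention; adjust it after inspecting the data.
    """
    for subject, predicate, obj in iter_triples(path):
        if predicate.endswith("type.object.type>") and type_marker in obj:
            yield subject
```

Something like subjects_with_type('freebase-rdf-latest.gz', 'business.') could then be looped over to collect candidate company identifiers. Since the full dump is very large, it may help to test on the first few thousand lines before running a full pass.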


