Sunday, 13 December 2020

Parse emails body in Python

I'm working with the Enron dataset, and I'm interested in extracting the clean body of each email into a list, keeping each reply as a separate string in the list.

For the following email:

Message-ID: <12626409.1075857596370.JavaMail.evans@thyme>
Date: Tue, 17 Oct 2000 10:36:00 -0700 (PDT)
From: john.arnold@enron.com
To: jenwhite7@zdnetonebox.com
Subject: Re: Hi
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: John Arnold
X-To: "Jennifer White" <jenwhite7@zdnetonebox.com> @ ENRON
X-cc: 
X-bcc: 
X-Folder: \John_Arnold_Dec2000\Notes Folders\'sent mail
X-Origin: Arnold-J
X-FileName: Jarnold.nsf

So, what is it?   And by the way, don't start with the excuses.   You're 
expected to be a full, gourmet cook.

Kisses, not music, makes cooking a more enjoyable experience.  




"Jennifer White" <jenwhite7@zdnetonebox.com> on 10/17/2000 04:19:20 PM
To: jarnold@enron.com
cc:  
Subject: Hi


I told you I have a long email address.

I've decided what to prepare for dinner tomorrow.  I hope you aren't
expecting anything extravagant because my culinary skills haven't been
put to use in a while.  My only request is that your stereo works.  Music
makes cooking a more enjoyable experience.

Watch the debate if you are home tonight.  I want a report tomorrow...
Jen

___________________________________________________________________
To get your own FREE ZDNet Onebox - FREE voicemail, email, and fax,
all in one place - sign up today at http://www.zdnetonebox.com

I want to get the following result:

["So what is it?   And by the way  don't start with the excuses.   You're 
expected to be a full  gourmet cook. Kisses  not music  makes cooking a more enjoyable experience.", 
"I told you I have a long email address. I've decided what to prepare for dinner tomorrow.  I hope you aren't 
expecting anything extravagant because my culinary skills haven't been
put to use in a while.  My only request is that your stereo works.  Music
makes cooking a more enjoyable experience. Watch the debate if you are home tonight.  I want a report tomorrow...
Jen"]

Where the first element in the list is:

"So what is it?   And by the way  don't start with the excuses.   You're 
expected to be a full  gourmet cook. Kisses  not music  makes cooking a more enjoyable experience."

Is there a library capable of doing this?

I have tried the Python email library, but it does not seem to have that functionality, since I get the full body back:

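import email

# data_ holds the raw text of one message, headers included
message = data_
e = email.message_from_string(message)
print(e.get_payload())  # prints the whole body, quoted reply and footer included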

'So, what is it? And by the way, don't start with the excuses. You're \nexpected to be a full, gourmet cook.\n\nKisses, not music, makes cooking a more enjoyable experience. \n\n\n\n\n"Jennifer White" <jenwhite7@zdnetonebox.com> on 10/17/2000 04:19:20 PM\nTo: jarnold@enron.com\ncc: \nSubject: Hi\n\n\nI told you I have a long email address.\n\nI've decided what to prepare for dinner tomorrow. I hope you aren't\nexpecting anything extravagant because my culinary skills haven't been\nput to use in a while. My only request is that your stereo works. Music\nmakes cooking a more enjoyable experience.\n\nWatch the debate if you are home tonight. I want a report tomorrow...\nJen\n\n___________________________________________________________________\nTo get your own FREE ZDNet Onebox - FREE voicemail, email, and fax,\nall in one place - sign up today at http://www.zdnetonebox.com\n\n\n'
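As far as I know, the standard email module only separates headers from the payload; it does not split a quoted thread. One possible workaround, sketched below, is to post-process the payload with regular expressions. This is only a sketch under assumptions: the helper name split_replies and both patterns are my own, the reply-header pattern is modelled on the single sample above (a line ending in "on MM/DD/YYYY HH:MM:SS AM/PM" followed by To:/cc:/Subject: lines), and the footer is assumed to start at a line of underscores. Other messages in the dataset may quote replies differently.

```python
import email
import re

# Assumption: a quoted reply starts with a line such as
#   "Name" <address> on 10/17/2000 04:19:20 PM
# followed by one or more To:/cc:/bcc:/Subject: lines (as in the sample).
REPLY_HEADER = re.compile(
    r'^.*\bon \d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M[ \t]*\n'
    r'(?:(?:To|cc|bcc|Subject):.*\n)+',
    re.MULTILINE | re.IGNORECASE,
)

# Assumption: a provider footer starts at a line made only of underscores.
FOOTER = re.compile(r'^_{10,}[ \t]*$', re.MULTILINE)


def split_replies(raw_message):
    """Hypothetical helper: return one cleaned string per reply in the thread."""
    # For a non-multipart message, get_payload() returns the body as a string.
    body = email.message_from_string(raw_message).get_payload()
    replies = []
    for chunk in REPLY_HEADER.split(body):
        # Keep only the text before the footer, if a footer is present.
        chunk = FOOTER.split(chunk, maxsplit=1)[0]
        # Collapse all runs of whitespace (including newlines) to single spaces.
        text = ' '.join(chunk.split())
        if text:
            replies.append(text)
    return replies
```

Calling split_replies(data_) on the message above is intended to give a two-element list, one string for John's reply and one for Jennifer's original message, with whitespace collapsed and the ZDNet footer dropped. Punctuation is left untouched.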



from Parse emails body in Python
