I created a function that returns a list of URLs for a given company name. I now want to search through this list of URLs and find out whether the company is owned by another company.
Example: The company "Marketo" was acquired by Adobe.
I want to return whether some company was acquired and by whom.
Here is what I have so far:
import requests
from googlesearch import search
from bs4 import BeautifulSoup as BS


def get_url(company_name):
    # Collect the first 10 Google search results for the company name
    url_list = []
    for url in search(company_name, stop=10):
        url_list.append(url)
    return url_list


test1 = get_url('Marketo')
print(test1[7])  # the Wikipedia page in this run

# Fetch one result and list every link on the page
r = requests.get(test1[7])
html = r.text
soup = BS(html, 'lxml')
stuff = soup.find_all('a')
print(stuff)
I am new to web scraping and do not know how to search through each URL (assuming that is possible) and extract the information I am looking for.
The value of test1 is the following list:
['https://www.marketo.com/', 'https://www.marketo.com/software/marketing-automation/', 'https://blog.marketo.com/', 'https://www.marketo.com/software/', 'https://www.marketo.com/company/', 'https://www.marketo.com/solutions/pricing/', 'https://www.marketo.com/solutions/', 'https://en.wikipedia.org/wiki/Marketo', 'https://www.linkedin.com/company/marketo', 'https://www.cmswire.com/digital-marketing/what-is-marketo-a-marketers-guide/']
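One possible direction, as a rough sketch rather than a definitive answer: download each page, reduce it to plain text, and look for phrases such as "acquired by X" or "subsidiary of X" in sentences that mention the company. The names find_acquirer, scan_urls, NAME and PATTERNS are hypothetical helpers introduced here for illustration, and the phrase list and sentence splitting are simplifying assumptions that will miss many real-world wordings (for example "It was acquired by Adobe", where the company is not named in the same sentence).

```python
import re

# Assumption: an owner's name is a run of capitalized words, e.g. "Adobe".
NAME = r"([A-Z][A-Za-z0-9&-]*(?:\s+[A-Z][A-Za-z0-9&-]*)*)"

# Assumption: a small, hand-picked set of ownership phrases.
PATTERNS = [
    re.compile(r"\b(?:acquired|bought|purchased|owned)\s+by\s+" + NAME),
    re.compile(r"\bsubsidiary\s+of\s+" + NAME),
]


def find_acquirer(text, company):
    """Return the first apparent owner named in a sentence about `company`, or None."""
    # Crude sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if company.lower() not in sentence.lower():
            continue  # skip sentences that do not mention the company
        for pattern in PATTERNS:
            match = pattern.search(sentence)
            if match and match.group(1).lower() != company.lower():
                return match.group(1)
    return None


def scan_urls(urls, company):
    """Map each URL whose text suggests an owner to that owner's name."""
    # Imported here so find_acquirer can be used without these packages.
    import requests
    from bs4 import BeautifulSoup

    results = {}
    for url in urls:
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
        except requests.RequestException:
            continue  # skip pages that fail to load or block the request
        text = BeautifulSoup(response.text, "html.parser").get_text(" ", strip=True)
        owner = find_acquirer(text, company)
        if owner:
            results[url] = owner
    return results
```

With the list above, scan_urls(test1, 'Marketo') would return a dictionary of the URLs where a match was found. The text matching can be tried without any network access: find_acquirer('Marketo was acquired by Adobe in 2018.', 'Marketo') returns 'Adobe'.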
from Given list of websites, search and return information in Python