I've written a script in Python with Selenium to download a few document files (ending in .doc) from a webpage. The reason I don't want to use the requests or urllib module to download the files is that the website I'm currently playing with does not have a true URL connected to each file; the links are JavaScript-encrypted. For this script, however, I've chosen a link that lets me mimic the same situation.
What my script does at the moment:
- Creates a master folder on the desktop
- Creates subfolders within the master folder, named after the files to be downloaded
- Downloads the files by clicking their links and puts them in the master folder (this is what I need rectified)
How can I modify my script so that it downloads the files by clicking their links and puts each downloaded file in its corresponding subfolder?
This is my attempt so far:
import os
import time
from selenium import webdriver

link = 'https://www.online-convert.com/file-format/doc'

dirf = os.path.expanduser('~')
desk_location = dirf + r'\Desktop\file_folder'
if not os.path.exists(desk_location):
    os.mkdir(desk_location)

def download_files():
    driver.get(link)
    for item in driver.find_elements_by_css_selector("a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        # create a new folder named after the file, meant to hold the downloaded file
        folder_name = item.get_attribute("href").split("/")[-1].split(".")[0]
        # set the location of the new folder to be created
        new_location = os.path.join(desk_location, folder_name)
        if not os.path.exists(new_location):
            os.mkdir(new_location)
        # set the location where the downloaded file is expected to end up
        file_location = os.path.join(new_location, filename)
        item.click()

        # wait up to roughly 10 seconds for the file to show up at file_location
        time_to_wait = 10
        time_counter = 0
        try:
            while not os.path.exists(file_location):
                time.sleep(1)
                time_counter += 1
                if time_counter > time_to_wait:
                    break
        except Exception:
            pass

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    prefs = {
        'download.default_directory': desk_location,
        'profile.default_content_setting_values.automatic_downloads': 1
    }
    chromeOptions.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(chrome_options=chromeOptions)
    download_files()
The following image shows how the downloaded files are currently stored (the files end up outside their corresponding folders):
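One possible direction, offered only as a rough sketch: since Chrome saves everything into the single `download.default_directory`, the file could be moved into its subfolder once the download has finished. The helper below, `move_when_downloaded`, is a hypothetical name that is not part of the original script, and the `timeout` and `poll` parameters are likewise illustrative. It assumes Chrome marks in-progress downloads with a `.crdownload` suffix and that only one download runs at a time.

```python
import os
import shutil
import time


def move_when_downloaded(download_dir, filename, target_dir, timeout=10.0, poll=0.5):
    """Wait for `filename` to appear in `download_dir`, then move it into `target_dir`.

    Returns the new path, or None if the file did not show up before the timeout.
    """
    source = os.path.join(download_dir, filename)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Assumption: Chrome keeps a "*.crdownload" file around while a download is in progress
        in_progress = any(name.endswith(".crdownload") for name in os.listdir(download_dir))
        if os.path.isfile(source) and not in_progress:
            os.makedirs(target_dir, exist_ok=True)
            # shutil.move returns the destination path
            return shutil.move(source, os.path.join(target_dir, filename))
        time.sleep(poll)
    return None
```

In the script above, this could be called right after `item.click()` as `move_when_downloaded(desk_location, filename, new_location)`, replacing the loop that waits for `file_location`, which never appears because the file is saved to `desk_location` instead.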
from Can't store downloaded files in their concerning folders