Sunday, 24 January 2021

How to check if a URL is downloadable in requests

I am making a downloader app using tkinter and requests, and I recently found a bug in it. Before starting a download, I want the program to check whether the given URL is downloadable. I used to do this by requesting the headers of the URL and checking whether 'Content-Length' exists. That works for some URLs (like: https://www.google.com), but for others (like the link to a YouTube video) the header is missing and my program crashes. I saw someone on Stack Overflow suggest checking for 'attachment' in the 'Content-Disposition' header, but that didn't work for me: it returned the same result for a downloadable and a non-downloadable URL. What is the best way to do this?

The code from the other Stack Overflow question, which I tried and which did not work:

import requests
url = 'https://www.google.com'
headers = requests.head(url).headers
downloadable = 'attachment' in headers.get('Content-Disposition', '')

My former code:

# Just a class that I defined to raise an exception if the URL was not downloadable
class NotDownloadable(Exception):
    pass

headers = requests.head(url, headers={'accept-encoding': ''}).headers
try:
    print(type(headers['Content-Length']))
    file_size = int(headers['Content-Length'])
except KeyError:
    raise NotDownloadable()

UPDATE: URL: https://aspb1.cdn.asset.aparat.com/aparat-video/a5e07b7f62ffaad0c104763c23d7393215613675-360p.mp4?wmsAuthSign=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ0b2tlbiI6IjUzMGU0Mzc3ZjRlZjVlYWU0OTFkMzdiOTZkODgwNGQ2IiwiZXhwIjoxNjExMzMzMDQxLCJpc3MiOiJTYWJhIElkZWEgR1NJRyJ9.FjMi_dkdLCUkt25dfGqPLcehpaC32dBBUNDC9cLNiu0 This is the URL I used for testing. Opening it leads directly to a video that you can download, but checking 'Content-Disposition' returned 'None' for it, just as it did for the majority of the downloadable and non-downloadable URLs I have tried.
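One possible direction, offered only as a rough sketch and not as a confirmed fix: since 'Content-Disposition' is often absent even for direct file links, the check could also look at 'Content-Type' and treat anything that is not an HTML page as a file. The helper names looks_downloadable, get_file_size and check_url are hypothetical, and the rule "HTML or missing type means not downloadable" is an assumption that will misclassify some URLs. Note that requests.head does not follow redirects by default, so allow_redirects=True is passed explicitly. Some servers also answer HEAD requests with incomplete headers.

```python
def looks_downloadable(headers):
    """Guess from response headers whether a URL points at a file, not a web page."""
    # An explicit attachment disposition is the strongest hint, when present.
    disposition = headers.get('Content-Disposition') or ''
    if 'attachment' in disposition.lower():
        return True
    # Otherwise fall back to the media type, ignoring parameters such as charset.
    content_type = (headers.get('Content-Type') or '').lower()
    media_type = content_type.split(';')[0].strip()
    # Assumption: HTML pages and responses without a type are not files.
    if not media_type or media_type in ('text/html', 'application/xhtml+xml'):
        return False
    return True


def get_file_size(headers):
    """Return Content-Length as an int, or None if it is missing or malformed."""
    try:
        return int(headers.get('Content-Length'))
    except (TypeError, ValueError):
        return None


def check_url(url, timeout=10):
    """Send a HEAD request and return (downloadable_guess, size_or_None)."""
    # Imported here so the two header helpers can be used without requests.
    import requests
    response = requests.head(url, allow_redirects=True, timeout=timeout)
    return looks_downloadable(response.headers), get_file_size(response.headers)
```

With this shape, a missing 'Content-Length' no longer raises: get_file_size returns None, and the app can decide whether to download without a known size (for example, without a progress percentage) instead of crashing.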


