Hemant Vishwakarma: Why is Python Script to Download .xlsx from Sharepoint Failing only for Some URLs?

Tuesday, 23 April 2019

Using the Office365-REST-Python-Client library, I have written the following Python function to download Excel spreadsheets from SharePoint (based on the answer to "How to read SharePoint Online (Office365) Excel files in Python with Work or School Account?"):

import sys
from urllib.parse import urlparse  # Python 3 home of urlparse
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File

# Prefix of the XML error document SharePoint returns in place of the file.
# response.content is bytes, so the prefix is a bytes literal too.
xmlErrText = b"<?xml version=\"1.0\" encoding=\"utf-8\"?><m:error"

def download(sourceURL, destPath, username, password):
    print("Download URL:  {}".format(sourceURL))
    # Split the URL into the host part and the path (plus query string, if any).
    urlParts = urlparse(sourceURL)
    baseURL = urlParts.scheme + "://" + urlParts.netloc
    relativeURL = urlParts.path
    if urlParts.query:
        relativeURL = relativeURL + "?" + urlParts.query

    ctx_auth = AuthenticationContext(baseURL)
    if ctx_auth.acquire_token_for_user(username, password):
        try:
            # Load the web object to confirm that the login actually works.
            ctx = ClientContext(baseURL, ctx_auth)
            web = ctx.web
            ctx.load(web)
            ctx.execute_query()
        except Exception:
            print("Failed to execute Sharepoint query (possibly bad username/password?)")
            return False
        print("Logged into Sharepoint: {0}".format(web.properties['Title']))
        response = File.open_binary(ctx, relativeURL)
        # An error comes back as an XML document rather than as an exception.
        if response.content.startswith(xmlErrText):
            print("ERROR response document received.  Possibly permissions or wrong URL?  Document content follows:\n\n{}\n".format(response.content.decode("utf-8", "replace")))
            return False
        else:
            with open(destPath, 'wb') as f:
                f.write(response.content)
                print("Downloaded to:  {}".format(destPath))
    else:
        print(ctx_auth.get_last_error())
        return False
    return True
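
To see exactly what the function hands to ClientContext and File.open_binary, the URL-splitting step can be run on its own. The following is a minimal Python 3 sketch using only the standard library; the helper name split_sharepoint_url and the example URL (host, path and query string) are made up for illustration and do not come from my real repository.

```python
from urllib.parse import urlparse, unquote

def split_sharepoint_url(source_url):
    """Split a URL the same way download() does: (base URL, relative URL)."""
    parts = urlparse(source_url)
    base_url = parts.scheme + "://" + parts.netloc
    relative_url = parts.path
    if parts.query:
        relative_url = relative_url + "?" + parts.query
    return base_url, relative_url

# Hypothetical URL, for illustration only.
example = "https://contoso.sharepoint.com/sites/team/Shared%20Documents/report.xlsx?web=1"
base_url, relative_url = split_sharepoint_url(example)
decoded_path = unquote(urlparse(example).path)

print(base_url)       # host part passed to AuthenticationContext / ClientContext
print(relative_url)   # value passed to File.open_binary
print(decoded_path)   # the same path with percent-encoding removed, for comparison
```

Note that the relative URL keeps whatever percent-encoding and query string the original link had. Whether SharePoint accepts the path in that form is not something I have established; the decoded path is printed only so the two can be compared.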

This function works fine for some URLs but fails for others, printing the following "file does not exist" document content on failure (newlines and whitespace added for readability):

<?xml version="1.0" encoding="utf-8"?>
<m:error xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
    <m:code>
        -2130575338, Microsoft.SharePoint.SPException
    </m:code>
    <m:message xml:lang="en-US">
        The file /sites/path/to/document.xlsx does not exist.
    </m:message>
</m:error>
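
Since the error arrives as an XML document, its code and message can be pulled out with a parser rather than matched by prefix. Below is a minimal Python 3 sketch using the standard library's xml.etree.ElementTree; the function name parse_sharepoint_error is my own invention, and the sample is the error document shown above.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "m" prefix in the error document.
M_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"

def parse_sharepoint_error(content):
    """Return (code, message) if content is an m:error document, else None."""
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return None  # not XML at all, e.g. the bytes of a real .xlsx file
    if root.tag != "{%s}error" % M_NS:
        return None  # XML, but not an error document
    code = root.findtext("{%s}code" % M_NS, default="").strip()
    message = root.findtext("{%s}message" % M_NS, default="").strip()
    return code, message

sample = b"""<?xml version="1.0" encoding="utf-8"?>
<m:error xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
    <m:code>
        -2130575338, Microsoft.SharePoint.SPException
    </m:code>
    <m:message xml:lang="en-US">
        The file /sites/path/to/document.xlsx does not exist.
    </m:message>
</m:error>"""

error = parse_sharepoint_error(sample)
not_an_error = parse_sharepoint_error(b"PK\x03\x04 binary spreadsheet bytes")
print(error)
print(not_an_error)
```

In download(), response.content could be passed to this function in place of the startswith check; a None result would mean the body is not an error document.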

I know that the username and password are correct. Indeed, supplying a different password results in a completely different error.

I have found that this error can occur either when the document doesn't exist or when there are insufficient permissions to access it.

However, using the same username/password, I can download the document with the same URL in a web browser.

Note that this same function consistently works fine for some .xlsx URLs in the same SharePoint repository, but consistently fails for other .xlsx URLs in that same repository.
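
Because the split between working and failing URLs is consistent, one way to narrow things down is to tally what the two groups have in common. The sketch below (Python 3, standard library only) is purely a diagnostic aid: the traits it checks (a query string, percent-encoding in the path, and the leading /sites/... or /teams/... segment) are guesses at what might differ, not known causes, and the function names and URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse, unquote

def url_traits(url):
    """Return a set of labels describing features of a URL that might matter."""
    parts = urlparse(url)
    traits = set()
    if parts.query:
        traits.add("has query string")
    if unquote(parts.path) != parts.path:
        traits.add("percent-encoded path")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) >= 2 and segments[0] in ("sites", "teams"):
        traits.add("site: /" + "/".join(segments[:2]))
    else:
        traits.add("site: / (root)")
    return traits

def compare(working, failing):
    """Count how often each trait appears among working and failing URLs."""
    ok = Counter(t for u in working for t in url_traits(u))
    bad = Counter(t for u in failing for t in url_traits(u))
    for trait in sorted(set(ok) | set(bad)):
        print("{:<30} working={} failing={}".format(trait, ok[trait], bad[trait]))
    return ok, bad

# Hypothetical URLs, for illustration only.
working = ["https://contoso.sharepoint.com/sites/a/Docs/one.xlsx"]
failing = ["https://contoso.sharepoint.com/sites/b/Shared%20Documents/two.xlsx?web=1"]
ok, bad = compare(working, failing)
```

A trait that shows up only in the failing column is a candidate worth testing by hand; a trait that appears in both columns can probably be ruled out.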

My only guess is that there are some more fine-grained permissions that need to be managed. But I know nothing about these, if they exist.

Can anybody help me work out why the failure is occurring and how to get the function working for all the required files, which I can already download in a web browser?

from Why is Python Script to Download .xlsx from Sharepoint Failing only for Some URLs?
