Friday, 25 September 2020

Unable to get rid of some error raised by process_exception

I'm trying to suppress an error raised by Scrapy in process_exception of my RetryMiddleware. The script hits this error once the maximum number of retries has been exceeded. I use proxies within the middleware. The odd thing is that the exception the script throws is already in the EXCEPTIONS_TO_RETRY list. It is perfectly fine if the script sometimes exhausts its retries without success. I just do not want to see the error when that happens, so I would like to suppress or bypass it.

The error looks like this:

Traceback (most recent call last):
  File "middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
twisted.internet.error.TCPTimedOutError: TCP connection timed out: 10060: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond..

This is what process_exception within RetryMiddleware looks like:

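# Imports are not shown in the original post; these are assumed, mirroring
# the ones used by Scrapy's built-in retry middleware.
from twisted.internet import defer
from twisted.internet.error import (
    ConnectError, ConnectionDone, ConnectionLost, ConnectionRefusedError,
    DNSLookupError, TCPTimedOutError, TimeoutError,
)
from twisted.web.client import ResponseFailed
from scrapy.core.downloader.handlers.http11 import TunnelError


class RetryMiddleware(object):
    cus_retry = 3  # maximum number of custom retries per request
    EXCEPTIONS_TO_RETRY = (defer.TimeoutError, TimeoutError, DNSLookupError,
                           ConnectionRefusedError, ConnectionDone, ConnectError,
                           ConnectionLost, TCPTimedOutError, TunnelError,
                           ResponseFailed)

    def process_exception(self, request, exception, spider):
        # Retry only the listed exceptions, unless the request opts out.
        if isinstance(exception, self.EXCEPTIONS_TO_RETRY) \
                and not request.meta.get('dont_retry', False):
            return self._retry(request, exception, spider)

    def _retry(self, request, reason, spider):
        retries = request.meta.get('cus_retry', 0) + 1
        if retries <= self.cus_retry:
            r = request.copy()
            r.meta['cus_retry'] = retries
            # {ip:port} is a placeholder for the real proxy address.
            r.meta['proxy'] = f'https://{ip:port}'
            r.dont_filter = True
            return r
        else:
            # Falls through and returns None once the retries are used up.
            print("done retrying")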

How can I get rid of the errors in EXCEPTIONS_TO_RETRY?
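One possible explanation, offered as a sketch and not a confirmed diagnosis: once the retries are used up, _retry returns None, and Scrapy treats a None return from process_exception as "not handled", so the exception keeps propagating and ends up logged. Returning a non-None value at that point (in real Scrapy, a Response object) would mark it as handled. The snippet below reproduces only the bookkeeping with stand-ins so it runs without Scrapy. FakeRequest, FakeTimeout, GIVE_UP, QuietRetry and drive are hypothetical names, not Scrapy APIs.

```python
class FakeRequest:
    """Stand-in for scrapy.Request with just the attributes used here."""

    def __init__(self, url, meta=None):
        self.url = url
        self.meta = dict(meta or {})
        self.dont_filter = False

    def copy(self):
        return FakeRequest(self.url, self.meta)


class FakeTimeout(Exception):
    """Stand-in for twisted.internet.error.TCPTimedOutError."""


# Hypothetical marker; in Scrapy this would be a Response object.
GIVE_UP = object()


class QuietRetry:
    cus_retry = 3
    EXCEPTIONS_TO_RETRY = (FakeTimeout,)

    def process_exception(self, request, exception, spider=None):
        if not isinstance(exception, self.EXCEPTIONS_TO_RETRY):
            return None  # not ours: let the exception propagate
        if request.meta.get('dont_retry', False):
            return None
        retries = request.meta.get('cus_retry', 0) + 1
        if retries <= self.cus_retry:
            r = request.copy()
            r.meta['cus_retry'] = retries
            r.dont_filter = True
            return r
        # Retries exhausted: return something non-None so the
        # exception counts as handled instead of propagating.
        return GIVE_UP


def drive(middleware, request):
    """Fail the request repeatedly until the middleware stops retrying."""
    attempts = 0
    outcome = request
    while isinstance(outcome, FakeRequest):
        attempts += 1
        outcome = middleware.process_exception(outcome, FakeTimeout())
    return attempts, outcome


attempts, outcome = drive(QuietRetry(), FakeRequest('http://example.com'))
```

Here the request fails once and is retried three times before the middleware gives up. In a real project the same idea could instead be expressed by attaching an errback to the request in the spider and swallowing the failure there. Which of the two fits better depends on whether the callback should still run for the failed URL.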


