I've created a script using scrapy to parse the content from a website. The script is doing fine. However, I want the spider to retry in some cases, which is why I created a retry middleware.
I'm trying to understand why the `or response` portion is there within `process_response()`, in the line `return self._retry(request, reason, spider) or response`. I want this method to retry the request within that block, not to return the response.
This is my current approach:
def _retry(self, request, reason, spider):
    check_url = request.url
    r = request.copy()
    r.dont_filter = True
    return r

def process_response(self, request, response, spider):
    if request.meta.get('dont_retry', False):
        return response
    if ("some_redirected_url" in response.url) and (response.status in RETRY_HTTP_CODES):
        reason = response_status_message(response.status)
        return self._retry(request, reason, spider) or response
    return response
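For context, the `or response` idiom comes from Scrapy's built-in RetryMiddleware, whose `_retry()` returns a new request while retries remain and `None` once they are used up; in the second case `None or response` evaluates to `response`. The sketch below imitates that behavior without Scrapy. The classes `FakeRequest` and `FakeResponse`, the functions `retry_with_limit`, `retry_always` and `process`, and the `MAX_RETRY_TIMES` value are all hypothetical stand-ins, not Scrapy APIs.

```python
# Hypothetical stand-ins for illustration only; these are not Scrapy classes.
class FakeRequest:
    def __init__(self, url, retry_times=0):
        self.url = url
        self.retry_times = retry_times


class FakeResponse:
    def __init__(self, url, status):
        self.url = url
        self.status = status


MAX_RETRY_TIMES = 2  # assumed retry budget for this sketch


def retry_with_limit(request):
    """Return a new request, or None once the retry budget is used up."""
    if request.retry_times < MAX_RETRY_TIMES:
        return FakeRequest(request.url, request.retry_times + 1)
    return None


def retry_always(request):
    """Like the _retry() in the question: always returns a copy."""
    return FakeRequest(request.url, request.retry_times)


def process(request, response, retry):
    # `a or b` evaluates to `a` when `a` is truthy, otherwise to `b`.
    return retry(request) or response


response = FakeResponse("https://example.com/some_redirected_url", 503)
fresh = FakeRequest(response.url, retry_times=0)
exhausted = FakeRequest(response.url, retry_times=MAX_RETRY_TIMES)

first = process(fresh, response, retry_with_limit)       # retries remain
second = process(exhausted, response, retry_with_limit)  # budget used up
third = process(exhausted, response, retry_always)       # never gives up

print(type(first).__name__)   # FakeRequest
print(type(second).__name__)  # FakeResponse
print(type(third).__name__)   # FakeRequest
```

Under these assumptions, a `_retry()` that always returns a request object, as in the approach above, never reaches the `or response` fallback, because a plain object without `__bool__` or `__len__` is truthy.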