Saturday, 2 April 2022

Selenium raises TimeoutException (Heroku, Python, FastAPI, Celery)

I built a scraper that collects data from a page, formats it, and adds it to a database. It then uses the scraped data to build models, except for one value that it scrapes at model-building time. Everything is wrapped in Celery so that the tasks run in the background.

@router.post("/run/{id}")
async def create(id: str):
    wallet_reputation.delay(id)

    return {"Status": "Task successfully add to execute"}

The endpoint above works fine. Each ID passed to it is unique, and there are about 100 such IDs. To automate building a model for each ID, I added the endpoint below, which I call from time to time (the scraped data changes, so I need to update my models).

@router.post("/run")
async def create_all():
    for address in all_addresses_generator():
        wallet_reputation.delay(address)

    return {"Status": "Tasks successfully add to execute"}

When I call it, I receive this error:

2022-03-26T15:25:52.051854+00:00 heroku[worker.1]: Process running mem=543M(104.1%)
2022-03-26T15:25:52.073256+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2022-03-26T15:26:02.875701+00:00 app[worker.1]: [2022-03-26 15:26:02,871: ERROR/ForkPoolWorker-8] Task walletReputation[2cca3c3e-8c58-4983-bbae-e55e52f33c1a] raised unexpected: TimeoutException('', None, ['#0 0x556bcd4bc7d3 <unknown>', '#1 0x556bcd218688 <unknown>', '#2 0x556bcd24ec21 <unknown>', '#3 0x556bcd24ede1 <unknown>', '#4 0x556bcd281d74 <unknown>', '#5 0x556bcd26c6dd <unknown>', '#6 0x556bcd27fa0c <unknown>', '#7 0x556bcd26c5a3 <unknown>', '#8 0x556bcd241ddc <unknown>', '#9 0x556bcd242de5 <unknown>', '#10 0x556bcd4ed49d <unknown>', '#11 0x556bcd50660c <unknown>', '#12 0x556bcd4ef205 <unknown>', '#13 0x556bcd506ee5 <unknown>', '#14 0x556bcd4e3070 <unknown>', '#15 0x556bcd522488 <unknown>', '#16 0x556bcd52260c <unknown>', '#17 0x556bcd53bc6d <unknown>', '#18 0x7f8e32957609 <unknown>', ''])
2022-03-26T15:26:02.875723+00:00 app[worker.1]: Traceback (most recent call last):
2022-03-26T15:26:02.875724+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
2022-03-26T15:26:02.875724+00:00 app[worker.1]:     R = retval = fun(*args, **kwargs)
2022-03-26T15:26:02.875724+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
2022-03-26T15:26:02.875725+00:00 app[worker.1]:     return self.run(*args, **kwargs)
2022-03-26T15:26:02.875725+00:00 app[worker.1]:   File "/app/tasks.py", line 40, in wallet_reputation
2022-03-26T15:26:02.875725+00:00 app[worker.1]:     WalletReputation(id).add_reputation_to_db()
2022-03-26T15:26:02.875727+00:00 app[worker.1]:   File "/app/agents/walletReputation.py", line 261, in add_reputation_to_db
2022-03-26T15:26:02.875727+00:00 app[worker.1]:     nc_balance=self.nc_balance(),
2022-03-26T15:26:02.875727+00:00 app[worker.1]:   File "/app/agents/walletReputation.py", line 162, in nc_balance
2022-03-26T15:26:02.875727+00:00 app[worker.1]:     WebDriverWait(self.driver, 20)
2022-03-26T15:26:02.875727+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 89, in until
2022-03-26T15:26:02.875728+00:00 app[worker.1]:     raise TimeoutException(message, screen, stacktrace)
2022-03-26T15:26:02.875728+00:00 app[worker.1]: selenium.common.exceptions.TimeoutException: Message: 
2022-03-26T15:26:02.875729+00:00 app[worker.1]: Stacktrace:
2022-03-26T15:26:02.875729+00:00 app[worker.1]: #0 0x556bcd4bc7d3 <unknown>
2022-03-26T15:26:02.875729+00:00 app[worker.1]: #1 0x556bcd218688 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #2 0x556bcd24ec21 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #3 0x556bcd24ede1 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #4 0x556bcd281d74 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #5 0x556bcd26c6dd <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #6 0x556bcd27fa0c <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #7 0x556bcd26c5a3 <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #8 0x556bcd241ddc <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #9 0x556bcd242de5 <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #10 0x556bcd4ed49d <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #11 0x556bcd50660c <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #12 0x556bcd4ef205 <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #13 0x556bcd506ee5 <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #14 0x556bcd4e3070 <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #15 0x556bcd522488 <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #16 0x556bcd52260c <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #17 0x556bcd53bc6d <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #18 0x7f8e32957609 <unknown>

I don't understand why I suddenly get an error when the previous endpoint, which runs the same Celery task, works normally. Below is the code of the generator and of the class method in which the error is raised.

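def all_addresses_generator():
    # Yield the "to" address of every stored transaction.
    for row in session.query(DbNcTransaction).all():
        yield row.to


# Imports used by nc_balance (a method of WalletReputation).
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def nc_balance(self):
    # Open the token page filtered by this wallet's address.
    base_url = "https://polygonscan.com/token/0x64a795562b02830ea4e43992e761c96d208fc58d?a="
    self.driver.get(base_url + self.address)

    # Wait up to 20 seconds for the balance element, then read its text.
    nc_balance = (
        WebDriverWait(self.driver, 20)
        .until(
            EC.presence_of_element_located(
                (By.CSS_SELECTOR, "#ContentPlaceHolder1_divFilteredHolderBalance")
            )
        )
        .text
    )

    # Take the second whitespace-separated token and strip thousands separators.
    nc_balance = nc_balance.split()[1]
    nc_balance = round(float(nc_balance.replace(",", "")), 2)

    return nc_balance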
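
If the timeout turns out to be intermittent (the page sometimes loads too slowly), one option is to retry the wait a few times before giving up. The helper below is only a sketch under that assumption: `call_with_retries` and its parameters are hypothetical names, not part of Selenium or Celery, and it does nothing about the R14 memory warning in the log.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    pause_seconds: float = 5.0,
) -> T:
    """Call func(); if it raises one of retry_on, pause and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller (or Celery) see the error
            time.sleep(pause_seconds)
    raise AssertionError("unreachable")


# Hypothetical use in add_reputation_to_db, with TimeoutException imported
# from selenium.common.exceptions (the class named in the traceback):
#     nc_balance = call_with_retries(self.nc_balance, (TimeoutException,))
```

Because `nc_balance` calls `self.driver.get(...)` itself, each retry reloads the page before waiting again.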

How can I deal with this?
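
One possible direction: the log shows `mem=543M(104.1%)` and a worker named `ForkPoolWorker-8`, which suggests that several pool processes, each presumably with its own browser, are running at once when about 100 tasks are queued together. If so, spreading the tasks out in time may ease the pressure. The sketch below computes a start delay per address; `staggered_countdowns`, `batch_size` and `delay_seconds` are hypothetical names, and the values are guesses to tune, not measured numbers.

```python
from typing import Iterable, Iterator, Tuple


def staggered_countdowns(
    addresses: Iterable[str], batch_size: int = 2, delay_seconds: int = 60
) -> Iterator[Tuple[str, int]]:
    """Yield (address, countdown) pairs so that only batch_size tasks
    become due in each delay_seconds window."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for index, address in enumerate(addresses):
        yield address, (index // batch_size) * delay_seconds


# Hypothetical use inside create_all(), replacing the plain .delay() call:
#     for address, countdown in staggered_countdowns(all_addresses_generator()):
#         wallet_reputation.apply_async(args=[address], countdown=countdown)
```

Celery's `apply_async` accepts a `countdown` in seconds, so each task becomes eligible to run only after its delay. Lowering the worker's `--concurrency` setting is another way to cap how many browsers run at the same time.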


