Hemant Vishwakarma: Queueing up workers in Dask

Tuesday 3 August 2021

Queueing up workers in Dask

I have the following scenario that I need to solve with Dask scheduler and workers:

Dask program has N functions called in a loop (N defined by the user)
Each function is started with delayed(func)(args) to run in parallel.
When each function from the previous point starts, it triggers W workers. This is how I invoke the workers:
```
futures = client.map(worker_func, worker_args)     
worker_responses = client.gather(futures)
```

That means that I need N * W workers to run everything in parallel. The problem is that this is not optimal as it's too much resource allocation, I run it on the cloud and it's expensive. Also, N is defined by the user, so I don't know beforehand how much processing capability I need to have.

Is there a way to queue up the workers in such a way that if I define that Dask has X workers, when a worker ends then the next one starts?

from Queueing up workers in Dask

Hemant Vishwakarma

Tuesday 3 August 2021

Queueing up workers in Dask

No comments:

Post a Comment