We have a list of test stations that need to be polled for data at some interval (think every 10 minutes, though we can vary this within reasonable limits).
This is currently done through a Celery setup that posts tasks (one per station) to RabbitMQ every 10 minutes (for example), each with a task timeout of roughly the same length.
This seems broken by design: once RabbitMQ cannot keep up, some tasks time out, and effectively some test stations never get processed.
The tasks are implemented in Python, and the test stations live in one big config file (little more than an IP address and a name for each).
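To make the setup concrete, here is a minimal sketch of what such a setup might look like; the names (fetch_station, dispatch_all, STATIONS), the broker URL, and the 600-second values are illustrative assumptions, not our actual code:

    from celery import Celery

    app = Celery("stations", broker="amqp://localhost")

    # One entry per test station: essentially just an IP and a name.
    STATIONS = [
        {"name": "station-a", "ip": "10.0.0.1"},
        {"name": "station-b", "ip": "10.0.0.2"},
    ]

    @app.task(name="stations.fetch_station")
    def fetch_station(name, ip):
        """Fetch data from one test station (placeholder)."""

    @app.task(name="stations.dispatch_all")
    def dispatch_all():
        # Post one fetch task per station; each message expires after
        # ~10 minutes, so anything RabbitMQ cannot deliver in time is
        # dropped rather than processed: the starvation described above.
        for s in STATIONS:
            fetch_station.apply_async(args=[s["name"], s["ip"]], expires=600)

    # Run the dispatcher every 10 minutes via celery beat.
    app.conf.beat_schedule = {
        "dispatch-every-10-minutes": {
            "task": "stations.dispatch_all",
            "schedule": 600.0,  # seconds
        },
    }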
We could solve this by, for example, having one program loop over all the stations instead; that would not need Celery or RabbitMQ, but it would require a significant rewrite, and we do gain some benefit from RabbitMQ (i.e. each task runs in isolation).
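For comparison, a rough sketch of that plain-loop alternative (again with illustrative names); note that here failures are isolated only by the try/except, not by separate worker processes as with Celery:

    import time

    STATIONS = [
        {"name": "station-a", "ip": "10.0.0.1"},
        {"name": "station-b", "ip": "10.0.0.2"},
    ]

    def fetch_station(name, ip):
        """Fetch data from one test station (placeholder)."""

    def main(interval=600):
        while True:
            start = time.monotonic()
            for s in STATIONS:
                try:
                    fetch_station(s["name"], s["ip"])
                except Exception as exc:
                    # One bad station must not stop the sweep; log and go on.
                    print(f"{s['name']} failed: {exc}")
            # Sleep out whatever remains of the interval (zero if the
            # sweep itself took longer, so a slow sweep simply stretches it).
            time.sleep(max(0.0, interval - (time.monotonic() - start)))

    if __name__ == "__main__":
        main()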
So now we are wondering (having very little knowledge of RabbitMQ/Celery) whether there is some way to improve the design within the RabbitMQ/Celery world so as to avoid this flaw?
Removing the timeout seems like it would only end up clogging up RabbitMQ instead, since tasks keep getting added at a set pace?
(P.S. We can provide more information/code to illustrate, if needed.)
(from: RabbitMQ starvation w/ celery design flaw - how to solve?)