I'm trying to run a PyTorch model in a Django app. Since running a model (or any long-running task) inside a view is not recommended, I decided to run it in a Celery task. My model is quite big: it takes about 12 seconds to load and about 3 seconds to run inference, so I can't afford to load it on every request. Instead, I tried loading it once in settings.py and keeping it there for the app to use. My final scheme is:
- When the Django app starts, settings.py loads the PyTorch model and makes it accessible to the app.
- When views.py receives a request, it queues a Celery task with delay().
- The Celery task uses settings.model to run inference.
The problem is that the Celery task throws the following error when it tries to use the model:
[2020-08-29 09:03:04,015: ERROR/ForkPoolWorker-1] Task app.tasks.task[458934d4-ea03-4bc9-8dcd-77e4c3a9caec] raised unexpected: RuntimeError("Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method")
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/tensor/lib/python3.7/site-packages/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensor/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  /*...*/
  File "/home/ubuntu/anaconda3/envs/tensor/lib/python3.7/site-packages/torch/cuda/__init__.py", line 191, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Here's the code in my settings.py loading the model:
if sys.argv and sys.argv[0].endswith('celery') and 'worker' in sys.argv:  # Load the model only for the Celery worker
    import torch
    torch.cuda.init()
    torch.backends.cudnn.benchmark = True
    load_model_file()
And the task code:
@task
def getResult(name):
    print("Executing on GPU:", torch.cuda.is_available())
    if os.path.isfile(name):
        try:
            outpath = model_inference(name)
            os.remove(name)
            return outpath
        except OSError as e:
            print("Error", name, "doesn't exist")
            return ""
The print in the task shows "Executing on GPU: True".
I've tried calling torch.multiprocessing.set_start_method('spawn') in settings.py, both before and after torch.cuda.init(), but it gives the same error.
from Using PyTorch with Celery