My code looks something like this:
from glob import glob
from joblib import Parallel, delayed

# prediction model - 10s of megabytes on disk
LARGE_MODEL = load_model('path/to/model')
file_paths = glob('path/to/files/*')

def do_thing(file_path):
    pred = LARGE_MODEL.predict(load_image(file_path))
    return pred

Parallel(n_jobs=2)(delayed(do_thing)(fp) for fp in file_paths)
My question is whether LARGE_MODEL will be pickled/unpickled with each iteration of the loop. And if so, how can I make sure each worker caches it instead (if that's possible)?
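To make the "each worker caches it" idea concrete, here is a minimal sketch of one possible pattern: load the model lazily inside the worker and keep it in a module-level variable, so it is loaded at most once per process instead of travelling with each task. This is an illustration under assumptions, not a confirmed answer. It assumes the functions live in an importable module (so they are pickled by reference rather than by value) and that the backend reuses its worker processes, as joblib's default loky backend does. The names _MODEL, get_model and run are hypothetical, and load_model here is a stand-in for the real loader in my code.

```python
import os

_MODEL = None  # per-process cache; every worker process has its own copy


def load_model(path):
    """Hypothetical stand-in for the real (expensive) model loader."""
    return {"path": path, "pid": os.getpid()}


def get_model():
    """Load the model on first use in this process, then reuse it."""
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model('path/to/model')
    return _MODEL


def do_thing(file_path):
    model = get_model()
    # The real code would return model.predict(load_image(file_path)).
    # Returning the pid makes it visible which process served the task.
    return (file_path, model["pid"])


def run(file_paths, n_jobs=2):
    """Dispatch do_thing over file_paths with joblib (imported lazily)."""
    from joblib import Parallel, delayed
    return Parallel(n_jobs=n_jobs)(delayed(do_thing)(fp) for fp in file_paths)
```

With this layout only the file path is sent with each task. Whether the cache survives between tasks depends on the worker processes being reused, so it is worth checking by comparing the returned pids against the number of times the loader actually runs.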
from How does joblib.Parallel deal with global variables?