Hemant Vishwakarma: How does joblib.Parallel deal with global variables?

Monday, 23 November 2020

My code looks something like this:

from glob import glob

from joblib import Parallel, delayed

# load_model and load_image are defined elsewhere (not shown)

# prediction model - 10s of megabytes on disk
LARGE_MODEL = load_model('path/to/model')

file_paths = glob('path/to/files/*')

def do_thing(file_path):
    # runs in a worker and reads the module-level global
    pred = LARGE_MODEL.predict(load_image(file_path))
    return pred

Parallel(n_jobs=2)(delayed(do_thing)(fp) for fp in file_paths)

My question is whether LARGE_MODEL will be pickled and unpickled on every iteration of the loop. If so, how can I make sure each worker caches it instead (if that is possible)?
