Thursday, 21 January 2021

Is there a recommended way to mock AWS Batch for integration tests in Metaflow?

I would like to implement integration tests featuring Metaflow flows, i.e. running a flow from start to finish within a Docker container. Ideally this wouldn't require substantial rewriting of flows which contain @batch decorators on specific steps.

On the S3 side I can achieve this by setting up a local S3 mocking server (e.g. s3ninja), but for AWS Batch there isn't an equivalent paradigm. I wonder how other people approach this problem, other than by declaring @resources instead of @batch?
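
For reference, here is roughly how I wire up the S3 side in the test harness. This is a sketch, not verified settings: the bucket name, port, and in particular METAFLOW_S3_ENDPOINT_URL are assumptions, so check which config variables your Metaflow version actually honours.

# conftest.py: point Metaflow's S3 datastore at a local mock server.
# Endpoint URL, bucket name, and METAFLOW_S3_ENDPOINT_URL are
# assumptions for illustration; verify against your Metaflow version.
import os

os.environ.setdefault("METAFLOW_DEFAULT_DATASTORE", "s3")
os.environ.setdefault("METAFLOW_DATASTORE_SYSROOT_S3", "s3://test-bucket/metaflow")
os.environ.setdefault("METAFLOW_S3_ENDPOINT_URL", "http://localhost:9444")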

One solution I have thought up is to patch @batch like so:

# my_batch.py: for illustrative purposes, I haven't actually tried this yet

from metaflow import batch as metaflow_batch
from metaflow.metaflow_config import METAFLOW_CONFIG

def batch(*args, **kwargs):
    """
    Turns @batch "off" if NO_BATCH == True
    again... for illustrative purposes, I haven't actually tried this yet
    """
    if not METAFLOW_CONFIG.get("NO_BATCH", False):
        return metaflow_batch(*args, **kwargs)
    # Batch is disabled: act as a no-op. Handle both bare usage (@batch,
    # where the step function itself is passed in) and parametrised usage
    # (@batch(cpu=2, ...), which must return a decorator).
    if len(args) == 1 and callable(args[0]) and not kwargs:
        return args[0]
    return lambda func: func
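
The toggle could equally be a plain environment variable rather than a key in the Metaflow config file, so the test harness can flip it without editing config.json. METAFLOW_NO_BATCH here is a name I've made up for this sketch, not a real Metaflow setting:

# my_batch.py (variant): toggle via an environment variable instead.
# METAFLOW_NO_BATCH is a hypothetical name, not a real Metaflow setting.
import os

NO_BATCH = os.environ.get("METAFLOW_NO_BATCH", "").lower() in ("1", "true")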

and then use my batch decorator instead of metaflow.batch, having set NO_BATCH = true (or the equivalent environment variable) in my integration Metaflow config:

# my_flow.py
from metaflow import FlowSpec, step  # etc
# use my_batch.batch rather than metaflow.batch
from my_package.my_batch import batch

class MyFlow(FlowSpec):
    @batch
    @step
    def my_step(self):
        ...  # etc
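
With that in place, an integration test could run the flow end-to-end as a subprocess inside the container, the same way `python my_flow.py run` would from the shell. A minimal sketch, reusing the hypothetical METAFLOW_NO_BATCH toggle from above:

# test_my_flow.py: run the flow start-to-finish inside the test container.
import os
import subprocess
import sys

def test_my_flow_runs_end_to_end():
    # METAFLOW_NO_BATCH is the hypothetical toggle sketched earlier.
    env = dict(os.environ, METAFLOW_NO_BATCH="true")
    result = subprocess.run(
        [sys.executable, "my_flow.py", "run"],
        env=env,
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stderr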

But perhaps there is something obvious I'm missing, or even a LocalStack-like approach I can take?


