Sunday, 26 September 2021

TensorFlow dataset splitting by participant

I want to split a tf.data.Dataset on attributes, in my case participant or gesture. The dataset is a work in progress, and the number of participants and gestures may grow. I originally set up a tfds config for this gestures dataset, but I haven't figured out how to configure participant/gesture splitting there either.

How should I split the tf.data.Dataset object? Currently, my dataset exists as a single TFRecord file. I would prefer to keep it that way rather than generate a separate file for each participant and gesture, which would mean regenerating all the gesture TFRecords whenever a new participant is added.

This works (method 1, gross):

subset = ds.filter(lambda x: (x['participant'] == 1 or x['participant'] == 2))

This doesn't (method 2, dream):

subset = ds.filter(lambda x: any(x['participant'] == p for p in [1,2]))

OperatorNotAllowedInGraphError: using a tf.Tensor as a Python bool is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

I also tried the same operation as a function decorated with @tf.function.

Example code with the publicly available MNIST dataset: Jupyter notebook on Colab

Is there a way to perform this operation in a similar style to method 2?
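
For context, method 2 most likely fails because Python's built-in any() calls bool() on each symbolic tensor while the filter function is being traced into a graph. One possible way to keep the "list of allowed values" style is to do the membership test with tensor ops instead: compare the scalar feature against a constant tensor of allowed values and reduce with tf.reduce_any. The sketch below is only an illustration under assumptions: the toy dataset stands in for the real TFRecord data, and filter_by_values with its exclude flag is a hypothetical helper, not part of TensorFlow or tfds.

```python
import tensorflow as tf

# Toy stand-in for the real dataset. The 'participant' and 'gesture' keys
# mirror the question; the values are made up for illustration.
ds = tf.data.Dataset.from_tensor_slices({
    "participant": [1, 2, 3, 1, 4, 2],
    "gesture": [0, 1, 0, 2, 1, 2],
})


def filter_by_values(dataset, key, allowed, exclude=False):
    """Hypothetical helper: keep elements whose scalar feature `key`
    equals any value in `allowed` (or none of them, if exclude=True)."""
    allowed = list(allowed)

    def predicate(x):
        # Build the allowed values with the feature's own dtype so that
        # tf.equal does not fail on an int32/int64 mismatch.
        allowed_t = tf.constant(allowed, dtype=x[key].dtype)
        # Broadcast the scalar against the vector, then reduce to one bool.
        match = tf.reduce_any(tf.equal(x[key], allowed_t))
        # `exclude` is a plain Python bool, so this branch is resolved
        # at trace time rather than inside the graph.
        return tf.logical_not(match) if exclude else match

    return dataset.filter(predicate)


# Same spirit as method 2: pass the participants as a list.
subset = filter_by_values(ds, "participant", [1, 2])
kept = [int(e["participant"]) for e in subset.as_numpy_iterator()]

# The complement, e.g. for holding out every other participant.
rest = filter_by_values(ds, "participant", [1, 2], exclude=True)
held_out = [int(e["participant"]) for e in rest.as_numpy_iterator()]
```

Because the list of participants is an ordinary Python argument, the same single TFRecord file can be split differently on each run without regenerating anything. The same helper should work for 'gesture' as long as that feature is a scalar per example.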


