Friday, 4 December 2020

Serialize a Variable Number of Binary Instance Masks with TensorFlow's TFRecord Format

For the MS COCO 2014 dataset, each image has a variable number of bounding boxes and corresponding binary instance masks, which can be obtained from the instance polygons given in the annotation file. I do this using pycocotools (in particular the coco.py file). Now I wish to serialize the image info using TensorFlow's TFRecord format. After reading the annotations into a Python dict keyed by image id, I was able to serialize a variable number of bounding boxes like this:

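# Assumes each bb is already [x_min, y_min, x_max, y_max];
# raw COCO annotations store bbox as [x, y, width, height].
x_min_values = []
x_max_values = []
y_min_values = []
y_max_values = []
for bb in bounding_boxes:
    x_min_values.append(int(bb[0]))
    y_min_values.append(int(bb[1]))
    x_max_values.append(int(bb[2]))
    y_max_values.append(int(bb[3]))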

Then, for the feature dict used in tf.train.Example, I converted each list to an int64 feature list with:

import tensorflow as tf

def _int64_feature_list(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

The issue now is that instance masks are two-dimensional, so I am not sure what strategy to use to serialize them. If there were only one mask per image, as with a segmentation mask, I could simply flatten the array, write it as an int64 feature list, and use the image height and width to reshape it when deserializing. I can't see how to do this for a variable number of masks. Any insights appreciated.
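
One possible approach, sketched here under assumptions rather than as a definitive answer: the flatten-and-reshape idea can extend to several masks if they are stacked into a single (N, height, width) array and the mask count N is stored next to the height and width. The helper names (pack_masks, unpack_masks, masks_to_features) and the feature keys below are hypothetical. The sketch assumes every mask for an image has that image's height and width, and it packs 8 pixels per byte with NumPy to keep records small.

```python
import numpy as np


def pack_masks(masks, height, width):
    """Pack N binary (height, width) masks into one byte string, 8 pixels per byte."""
    # Treat any nonzero pixel as foreground; the explicit reshape also covers N == 0.
    stacked = (np.asarray(masks) != 0).reshape(len(masks), height, width)
    return np.packbits(stacked).tobytes()


def unpack_masks(raw, num_masks, height, width):
    """Invert pack_masks, returning a uint8 array of shape (num_masks, height, width)."""
    bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))
    # packbits pads the final byte with zeros, so drop the padding bits.
    return bits[: num_masks * height * width].reshape(num_masks, height, width)


def masks_to_features(masks, height, width):
    """Build hypothetical feature entries for one image's masks."""
    # Imported lazily so the packing helpers above work without TensorFlow.
    import tensorflow as tf

    def int64_feature(v):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[v]))

    raw = pack_masks(masks, height, width)
    return {
        "image/masks/packed": tf.train.Feature(bytes_list=tf.train.BytesList(value=[raw])),
        "image/masks/count": int64_feature(len(masks)),
        "image/height": int64_feature(height),
        "image/width": int64_feature(width),
    }
```

The returned dict could be merged into the feature dict that already holds the bounding box lists. When reading, the count, height, and width are parsed first and then passed to unpack_masks together with the byte string. If the masks need to be decoded inside a pure TensorFlow input pipeline, it may be simpler to skip the bit packing, store the stacked uint8 array's bytes directly, and recover it with tf.io.decode_raw followed by a reshape.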


