For the MS COCO 2014 dataset, each image has a variable number of bounding boxes and corresponding binary instance masks, which can be obtained from the instance polygons given in the annotation file. I do this using pycocotools (in particular the coco.py file). Now I wish to serialize the image info using TensorFlow's TFRecord format. After reading the annotations into a Python dict indexed by image id, I was able to serialize a variable number of bounding boxes like this:
x_min_values = []
x_max_values = []
y_min_values = []
y_max_values = []
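# Assumes each bb is already [x_min, y_min, x_max, y_max];
# raw COCO annotations store boxes as [x, y, width, height].
for bb in bounding_boxes:
    x_min_values.append(int(bb[0]))
    y_min_values.append(int(bb[1]))
    x_max_values.append(int(bb[2]))
    y_max_values.append(int(bb[3]))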
Then, for the feature dict used in tf.train.Example, I converted each list to an int64 feature list with:
def _int64_feature_list(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
The issue now is that instance masks are two-dimensional, and I am not sure what strategy to use to serialize them. If there were only one mask per image, as with a segmentation mask, I could simply flatten the array, write it as an int64 feature list, and use the image height and width to reshape it when deserializing. I can't see how to do this for a variable number of masks. Any insights appreciated.
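One possible direction, offered only as a sketch: since all masks of an image share the same height and width, they could be stacked into a single (N, H, W) array, written as one bytes feature, and stored alongside the shape so the number of masks can be recovered on read. The helper names and the feature keys "masks/raw" and "masks/shape" below are hypothetical, and the masks are assumed to be NumPy arrays of identical shape.

```python
import numpy as np


def pack_masks(masks):
    """Stack N binary masks of shape (H, W) into one bytes string plus its shape."""
    if len(masks) == 0:
        # Images without instances: empty payload, all-zero shape.
        return b"", (0, 0, 0)
    stacked = np.stack(masks).astype(np.uint8)  # shape (N, H, W)
    return stacked.tobytes(), stacked.shape


def unpack_masks(raw, shape):
    """Invert pack_masks: rebuild the (N, H, W) uint8 array from bytes and shape."""
    return np.frombuffer(raw, dtype=np.uint8).reshape(shape)


def masks_to_features(masks):
    """Build two tf.train.Feature entries; the key names are hypothetical."""
    import tensorflow as tf  # imported here so the helpers above only need NumPy

    raw, shape = pack_masks(masks)
    return {
        "masks/raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[raw])),
        "masks/shape": tf.train.Feature(
            int64_list=tf.train.Int64List(value=list(shape))),
    }
```

On the reading side, the same idea would presumably apply in reverse: parse the bytes and shape features, decode the bytes with tf.io.decode_raw using tf.uint8, and reshape with the stored shape. Another option might be to encode each mask as a PNG and store the encoded strings in a single bytes list, which keeps one entry per mask at the cost of a decode step per instance.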
from Serialize Variable Number of Binary Instance Masks with Tensorflow's tfrecord Format