-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Description
Currently if is not possible to have an empty sequence with unknown shapes:
Like: tfds.features.Sequence(tfds.feature.Image(shape=(None, None, 3)))
E.g. adding the following test in https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/core/features/sequence_feature_test.py will fail:
self.assertFeature(
feature=feature_lib.Sequence(tfds.feature.Image(shape=(None, None, 3)),
shape=(None, None, None, 3), # (length, h, w, c)
tests=[
testing.FeatureExpectationItem(
value=[], # Empty input
expected=[], #
),
)tf.map_fn will fail with InvalidArgumentError: Tried to stack elements of an empty list with non-fully-defined element_shape: [?,?,1]:
This is because this code get executed during decoding:
imgs = tf.map_fn(
decode_image, # () -> (None, None, 3)
encoded_imgs, # Tensor(shape=(0,), dtype=tf.string)
dtype=tf.uint8,
)This is triggered when:
- Image shapes are dynamic: `decode_example has in/out shapes: () -> (None, None, 3)
- When the sequence is empty (encoded_imgs[0] == 0) (here when the image has no mask).
tf.map_fn try to decode: Tensor(shape=(0,), dtype=tf.string) -> Tensor(shape=(0, None, None, 3), dtype=tf.uint8)
Because there is no images to decode, TF cannot infer the implicit missing dimensions (0, None, None, 3).
To fix the issue, we should try to update the decode_batch_example in
| return tf.map_fn( |
def decode_batch_example(self, tfexample_data):
if None in self.shape[1:] and tfexample_data.shape[0] is None:
# Length and shape unknown, should deal with length == 0 case
# as (0, None, None, 3) isn't supported by tf.map_fn
zero_shape = (0 if s is None else s for s in self.shape)
return tf.cond(
tf.len(tfexample_data) == 0,
lambda: tf.constant(shape=(0, 0, 0, self.shape[-1]), dtype=self.dtype),
lambda: _decode_batch_example(tfexample_data),
)
else: # Shape statically defined
return _decode_batch_example(tfexample_data)Note: the tf.len() and likely do not work as-is, but you get the idea. Or something like that. Implementation should also support the nested case (with the help of tf.nest API), like:
self.assertFeature(
feature=feature_lib.Sequence({'a': tfds.feature.Image(shape=(None, None, 3), 'b': tf.int32}),
shape={
'a': (None, None, None, 3),
'b': (None,),
},
tests=[
testing.FeatureExpectationItem( # This works
value=[np.ones(8, 8, 3) for _ in range(4)],
expected=np.ones(4, 8, 8, 3),
),
testing.FeatureExpectationItem(
value=[], # Empty input
expected=[], #
),
)