Monday, 6 January 2020

Custom TensorFlow Keras optimizer

Suppose I want to write a custom optimizer class that conforms to the tf.keras API (please note that I am currently using TensorFlow 2.0.0). I am confused about the documented way to do this versus what's done in implementations.

The documentation for tf.keras.optimizers.Optimizer states,

  ### Write a customized optimizer.
  If you intend to create your own optimization algorithm, simply inherit from
  this class and override the following methods:
    - resource_apply_dense (update variable given gradient tensor is dense)
    - resource_apply_sparse (update variable given gradient tensor is sparse)
    - create_slots (if your optimizer algorithm requires additional variables)

However, the current tf.keras.optimizers.Optimizer implementation does not define a resource_apply_dense method, but it does define a private-looking _resource_apply_dense method stub. Similarly, there are no resource_apply_sparse or create_slots methods, but there are a _resource_apply_sparse method stub and a _create_slots method call.

In official tf.keras.optimizers.Optimizer subclasses (using tf.keras.optimizers.Adam as an example), there are _resource_apply_dense, _resource_apply_sparse, and _create_slots methods, and there are no such methods without the leading underscore.

There are similar leading-underscore methods in slightly-less-official tf.keras.optimizers.Optimizer subclasses (e.g., tfa.optimizers.MovingAverage from TensorFlow Addons).

Another confounding point for me is that the TensorFlow Addons optimizers also override the apply_gradients method, whereas the tf.keras.optimizers optimizers do not.

Moreover, I noticed that the apply_gradients method of tf.keras.optimizers.Optimizer calls _create_slots, but the base tf.keras.optimizers.Optimizer class does not have a _create_slots method. So, it seems that a _create_slots method must be defined in an optimizer subclass if that subclass does not override apply_gradients.
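To make the dispatch I am describing concrete, here is a minimal plain-Python sketch of the pattern as I understand it. It is not TensorFlow code: ToyOptimizer, ToyMomentum, ToyNoSlots, and Var are hypothetical stand-ins, and the base class deliberately mirrors my reading above (apply_gradients calls _create_slots without the base class defining it), which may not match the real implementation.

```python
class Var:
    """Hypothetical mutable scalar standing in for a tf.Variable."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class ToyOptimizer:
    """Toy base class: a public entry point that calls private hooks."""

    def __init__(self):
        self.slots = {}  # extra per-variable state, keyed by variable name

    def apply_gradients(self, grads_and_vars):
        grads_and_vars = list(grads_and_vars)
        # Assumption taken from the question: the base class calls
        # _create_slots but does not define it.
        self._create_slots([var for _, var in grads_and_vars])
        for grad, var in grads_and_vars:
            self._resource_apply_dense(grad, var)

    def _resource_apply_dense(self, grad, var):
        raise NotImplementedError("subclasses must implement the dense update")


class ToyMomentum(ToyOptimizer):
    """Subclass that overrides only the leading-underscore hooks."""

    def __init__(self, learning_rate=0.1, momentum=0.9):
        super().__init__()
        self.learning_rate = learning_rate
        self.momentum = momentum

    def _create_slots(self, var_list):
        # One velocity slot per variable, created on first use.
        for var in var_list:
            self.slots.setdefault(var.name, 0.0)

    def _resource_apply_dense(self, grad, var):
        velocity = self.momentum * self.slots[var.name] - self.learning_rate * grad
        self.slots[var.name] = velocity
        var.value += velocity


class ToyNoSlots(ToyOptimizer):
    """Subclass that omits _create_slots on purpose."""

    def _resource_apply_dense(self, grad, var):
        var.value -= 0.1 * grad
```

In this toy, ToyMomentum().apply_gradients([(2.0, Var("w", 1.0))]) updates the variable without apply_gradients being overridden, while calling apply_gradients on ToyNoSlots raises AttributeError because no _create_slots exists anywhere in its class hierarchy. That is the behaviour I would expect from the real class if my reading of its source is right.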


Questions

What is the correct way to subclass a tf.keras.optimizers.Optimizer? Specifically,

  1. Does the tf.keras.optimizers.Optimizer documentation quoted at the top simply mean to override the leading-underscore versions of the methods it mentions (e.g., _resource_apply_dense instead of resource_apply_dense)? If so, are there any API guarantees that these private-looking methods will not change their behavior in future versions of TensorFlow?
  2. When would one override apply_gradients in addition to the _resource_apply_[dense|sparse] methods?

