Doc Issue
The docstring of keras.layers.IntegerLookup() describes the vocabulary_dtype argument here:
https://github.com/keras-team/keras/blob/45c98ec8297f5a5b61c99a721c1801a11b7fec89/keras/src/layers/preprocessing/integer_lookup.py#L79-L80
However, the repro below fails. I tested it with TensorFlow 2.19.0:
Repro
import tensorflow as tf
import numpy as np
input_data = np.random.randint(0, 1000, size=(5, 10))
lookup_layer = tf.keras.layers.IntegerLookup(vocabulary=np.unique(input_data), vocabulary_dtype='int32')
output_data = lookup_layer(input_data)
Output
ValueError: Only `vocabulary_dtype='int64'` is supported at this time. Received: vocabulary_dtype=int32
The ValueError is raised here:
https://github.com/keras-team/keras/blob/45c98ec8297f5a5b61c99a721c1801a11b7fec89/keras/src/layers/preprocessing/integer_lookup.py#L333-L338
This check accepts only vocabulary_dtype='int64', which conflicts with the description linked above.
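As a stopgap while the docs and the check disagree, a caller can cast the vocabulary to int64 and leave vocabulary_dtype at its default. The following is a minimal sketch under that assumption, not a fix for the mismatch itself; the ImportError guard is only there so the snippet also runs where TensorFlow is not installed.

```python
import numpy as np

input_data = np.random.randint(0, 1000, size=(5, 10))

# Cast explicitly so the vocabulary is int64 regardless of platform defaults.
vocabulary = np.unique(input_data).astype("int64")

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed; skip the layer call.

if tf is not None:
    # vocabulary_dtype is not passed, so the default ("int64") is used,
    # which is the only value the current check accepts.
    lookup_layer = tf.keras.layers.IntegerLookup(vocabulary=vocabulary)
    output_data = lookup_layer(input_data)
```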
Suggestions
- Fix the docs so that they match the implementation, or
- Change the condition that raises the error to something like:
if vocabulary_dtype != "int64" and vocabulary_dtype != "int32":
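To make the second suggestion concrete, here is a standalone sketch of what a relaxed check could look like. The function name `check_vocabulary_dtype` and the constant `SUPPORTED_VOCABULARY_DTYPES` are hypothetical and do not exist in Keras; the sketch only mirrors the shape of the existing check. Whether the rest of the layer handles int32 correctly is not verified here.

```python
# Hypothetical relaxed set of accepted dtypes (not part of Keras).
SUPPORTED_VOCABULARY_DTYPES = ("int64", "int32")


def check_vocabulary_dtype(vocabulary_dtype):
    """Hypothetical stand-in for the dtype check in IntegerLookup."""
    if vocabulary_dtype not in SUPPORTED_VOCABULARY_DTYPES:
        raise ValueError(
            "Only `vocabulary_dtype` in "
            f"{SUPPORTED_VOCABULARY_DTYPES} is supported at this time. "
            f"Received: vocabulary_dtype={vocabulary_dtype}"
        )
    return vocabulary_dtype
```

With this sketch, `check_vocabulary_dtype("int32")` passes, while any other value such as `"float32"` still raises a ValueError.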
Comment From: ILCSFNO
I fixed the mismatch using the first suggestion (updating the docs), for the following reasons.
The parent class of keras.layers.IntegerLookup() is the IndexLookup class:
https://github.com/keras-team/keras/blob/45c98ec8297f5a5b61c99a721c1801a11b7fec89/keras/src/layers/preprocessing/integer_lookup.py#L11
The vocabulary_dtype values that IndexLookup documents as accepted are "int64" and "string", although I could not verify that other dtypes are unsupported:
https://github.com/keras-team/keras/blob/45c98ec8297f5a5b61c99a721c1801a11b7fec89/keras/src/layers/preprocessing/index_lookup.py#L43-L44
The only dtype shared by the two classes is "int64".
If I am wrong, please let me know and I will update the PR to change the condition instead, as described in the second suggestion.
PR opened here: https://github.com/keras-team/keras/pull/21587