Bug Issue

I found a mismatch between the documentation of IntegerLookup and its actual behavior. The doc of IntegerLookup contains the description linked below:

https://github.com/keras-team/keras/blob/4e1b250491627f871f6c82c0dcb577cc21093def/keras/src/layers/preprocessing/integer_lookup.py#L113-L115

See the repro below, with TensorFlow 2.19.0 and Keras nightly:

Repro

import tensorflow as tf
import numpy as np

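# Random 2x2 integer batch; its unique values become the vocabulary.
input_data = np.random.randint(0, 1000, size=(2, 2))
vocabulary = np.unique(input_data)
# Use a non-"int" output mode on a rank-2 input.
lookup_layer = tf.keras.layers.IntegerLookup(vocabulary=vocabulary, output_mode='one_hot')
output_data = lookup_layer(input_data)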

print('Input Data:', input_data.shape, '\n', input_data)
print('vocabulary:', vocabulary.shape, '\n', vocabulary)
print('Output Data:', output_data.shape, '\n', output_data)

Output

Input Data: (2, 2) 
 [[486 408]
 [ 96 885]]
vocabulary: (4,) 
 [ 96 408 486 885]
Output Data: (2, 2, 5) 
 tf.Tensor(
[[[0 0 0 1 0]
  [0 0 1 0 0]]

 [[0 1 0 0 0]
  [0 0 0 0 1]]], shape=(2, 2, 5), dtype=int64)

In this repro, output_mode is "one_hot" and the input is rank 2, yet the layer returns a rank-3 output of shape (2, 2, 5). This contradicts the description above, which says that for output modes other than "int", only output up to rank 2 is currently supported.
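To make the shape arithmetic explicit, here is a NumPy-only sketch that reproduces the output above without calling Keras. It is not the layer's implementation: `one_hot_lookup` is a hypothetical helper, and it assumes a sorted vocabulary, one out-of-vocabulary slot at index 0, and no mask token, which is what the output in this repro suggests.

```python
import numpy as np


def one_hot_lookup(inputs, vocabulary, num_oov_indices=1):
    """NumPy stand-in for a one-hot integer lookup (hypothetical, not Keras code)."""
    inputs = np.asarray(inputs)
    vocabulary = np.asarray(vocabulary)  # assumed sorted, as np.unique returns it
    # Find the candidate position of each token in the sorted vocabulary.
    positions = np.searchsorted(vocabulary, inputs)
    positions = np.clip(positions, 0, len(vocabulary) - 1)
    # Tokens that are not in the vocabulary fall back to OOV index 0.
    in_vocab = vocabulary[positions] == inputs
    indices = np.where(in_vocab, positions + num_oov_indices, 0)
    # One-hot encoding appends a new last axis of size len(vocabulary) + OOV slots.
    depth = len(vocabulary) + num_oov_indices
    return np.eye(depth, dtype=np.int64)[indices]


input_data = np.array([[486, 408], [96, 885]])
vocabulary = np.unique(input_data)
output_data = one_hot_lookup(input_data, vocabulary)
print(output_data.shape)  # (2, 2, 5)
```

One-hot encoding adds an axis of size 4 + 1 = 5 to the rank-2 input, so a rank-3 result is what this encoding produces for any rank-2 input.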

Suggestions

  • Check whether this behavior is expected.
  • (An additional finding) Add a blank line between:

https://github.com/keras-team/keras/blob/4e1b250491627f871f6c82c0dcb577cc21093def/keras/src/layers/preprocessing/integer_lookup.py#L111-L112

and:

https://github.com/keras-team/keras/blob/4e1b250491627f871f6c82c0dcb577cc21093def/keras/src/layers/preprocessing/integer_lookup.py#L113-L115

Thanks a lot!

Comment From: sonali-kumari1

Hi @ILCSFNO - Thanks for reporting this issue. We will look into this and update you.