Bug Issue
I found a memory-related performance bug with keras.layers.LSTMCell. The documentation of LSTMCell is here:
https://github.com/keras-team/keras/blob/ce0d2788b76119bc778f3d094816b0a9fc2b9748/keras/src/layers/rnn/lstm.py#L27-L29
In the repro below, run with TensorFlow 2.19.0, the reported GPU memory usage increases from 164608 to 165120 after changing recurrent_activation from 'sigmoid' to 'hard_sigmoid':
Repro 1
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)

# Main Code -->
batch_size, timesteps, input_dim = 32, 10, 5
input_data = tf.random.normal([batch_size, timesteps, input_dim])
lstm_cell = tf.keras.layers.LSTMCell(units=64, activation='relu', dropout=0.2, recurrent_activation='sigmoid')  ## Choice 1: sigmoid
# lstm_cell = tf.keras.layers.LSTMCell(units=64, activation='relu', dropout=0.2, recurrent_activation='hard_sigmoid')  ## Choice 2: hard_sigmoid
rnn_layer = tf.keras.layers.RNN(lstm_cell, return_sequences=True)
output = rnn_layer(input_data)
# Main Code <--

memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print("Memory Used:", memory)
Output 1
## For Choice 1: sigmoid
Memory Used: 164608
## For Choice 2: hard_sigmoid
Memory Used: 165120
To be honest, I'm not sure whether the difference in Output 1 is expected.
But in my opinion, the memory used in Repro 1 should not change, because both activation functions have identical memory characteristics: each is an element-wise operation producing an output tensor of the same shape and dtype as its input.
For the sigmoid function, the code is here:
https://github.com/keras-team/keras/blob/ce0d2788b76119bc778f3d094816b0a9fc2b9748/keras/src/activations/activations.py#L482-L506
For the hard_sigmoid function, the code is here:
https://github.com/keras-team/keras/blob/ce0d2788b76119bc778f3d094816b0a9fc2b9748/keras/src/activations/activations.py#L519-L539
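For reference, here is a plain-NumPy sketch of the two formulas as I understand them (assuming Keras 3's piecewise-linear definition hard_sigmoid(x) = clip((x + 3) / 6, 0, 1); this is an illustration, not the actual Keras kernels). Both map the input element-wise to a single output tensor of the same shape and dtype:

```python
import numpy as np

def sigmoid(x):
    # smooth logistic function, applied elementwise
    return 1.0 / (1.0 + np.exp(-x))

def hard_sigmoid(x):
    # piecewise-linear approximation, applied elementwise:
    # clip((x + 3) / 6, 0, 1)
    return np.clip((x + 3.0) / 6.0, 0.0, 1.0)

x = np.random.randn(32, 10, 5).astype(np.float32)
y1, y2 = sigmoid(x), hard_sigmoid(x)

# Same shape and dtype in and out for both activations,
# so neither should require a larger output allocation.
print(y1.shape, y1.dtype)  # (32, 10, 5) float32
print(y2.shape, y2.dtype)  # (32, 10, 5) float32
```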
To verify the opinion above, I tried one more repro below, which shows that the output shapes and dtypes of the two functions match:
Repro 2
import keras
import tensorflow as tf
batch_size, timesteps, input_dim = 32, 10, 5
input_data = tf.random.normal([batch_size, timesteps, input_dim])
y1 = keras.ops.hard_sigmoid(input_data)  # public API, equivalent to keras.src.ops.hard_sigmoid
y2 = keras.ops.sigmoid(input_data)
print("y1:", y1.shape, '\n', y1)
print("y2:", y2.shape, '\n', y2)
Output 2
y1: (32, 10, 5)
tf.Tensor(
[[[0.5575269 0.54280454 0.661077 0.7442135 0.54959637]
[0.590985 0.15000606 0.4363846 0.6136034 0.47918186]
[0.38167724 0.3758458 0.19742984 0.44487846 0.22357215]
...
[0.53210396 0.3301783 0.49853078 0.5353122 0.43356502]
[0.58335984 0.48384008 0.470204 0.2632173 0.2183072 ]
[0.63430065 0.3395311 0.62688303 0.47066614 0.22561677]]], shape=(32, 10, 5), dtype=float32)
y2: (32, 10, 5)
tf.Tensor(
[[[0.5854437 0.5638562 0.72441375 0.81233907 0.5738503 ]
[0.63318616 0.10910036 0.4057187 0.6641003 0.4688133 ]
[0.32961282 0.32192805 0.1399842 0.41806313 0.15995441]
...
[0.5480076 0.26523578 0.4977962 0.552771 0.40164632]
[0.6224967 0.47577906 0.45542464 0.19455245 0.15575518]
[0.6912147 0.27631527 0.6816356 0.45611244 0.1616097 ]]], shape=(32, 10, 5), dtype=float32)
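As a rough NumPy analogue of the same point (hypothetical, not the actual Keras/TF kernels), the output buffers of the two activations occupy exactly the same number of bytes:

```python
import numpy as np

x = np.random.randn(32, 10, 5).astype(np.float32)
y1 = np.clip((x + 3.0) / 6.0, 0.0, 1.0)  # hard_sigmoid-style, elementwise
y2 = 1.0 / (1.0 + np.exp(-x))            # sigmoid, elementwise

# One float32 output buffer each: 32 * 10 * 5 * 4 = 6400 bytes.
print(y1.nbytes, y2.nbytes)  # 6400 6400
```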
Thanks for taking a look!