Problem
When using the jax backend with MeanIoU or its child class, training throws an error.
Code to reproduce
https://gist.github.com/savindi-wijenayaka/43da7ac5930afc3ffbf20686ecca1193
Please add ignore_class=0 to the MeanIoU and OneHotMeanIoU initialization step.
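For readers without the gist open, the change described above might look roughly like the sketch below. It is not taken from the gist: NUM_CLASSES is a hypothetical value, and the metrics would still need to be passed to model.compile(metrics=[...]) as in the original script.

```python
import os

# The backend must be selected before keras is imported.
os.environ["KERAS_BACKEND"] = "jax"

import keras

NUM_CLASSES = 3  # hypothetical; use the number of classes from your own model

# Passing ignore_class=0 is the change that triggers the reported error
# during training on the jax backend.
mean_iou = keras.metrics.MeanIoU(num_classes=NUM_CLASSES, ignore_class=0)
one_hot_mean_iou = keras.metrics.OneHotMeanIoU(num_classes=NUM_CLASSES, ignore_class=0)
```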
Observations
- Without ignore_class: the model trains.
- With ignore_class: throws an error -> Array boolean indices must be concrete; got ShapedArray(bool[2097152])
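The error message is the one JAX raises when an array is indexed with a boolean mask inside a traced (jitted) function, because the result's shape would depend on the data. The sketch below reproduces that message in isolation and shows one static-shape alternative. It is only an illustration under assumptions: the function names are hypothetical, and this is not the Keras metric code or the fix that was applied.

```python
import jax
import jax.numpy as jnp

IGNORE_CLASS = 0  # same value the report passes as ignore_class


def count_valid_boolean_index(y_true):
    # Boolean-mask indexing produces an array whose length depends on the
    # data, which JAX cannot trace under jit.
    mask = y_true != IGNORE_CLASS
    return y_true[mask].shape[0]


def count_valid_static(y_true):
    # Static-shape alternative: keep every element and give ignored ones
    # a weight of 0 instead of dropping them.
    mask = y_true != IGNORE_CLASS
    return jnp.sum(jnp.where(mask, 1, 0))


y_true = jnp.array([0, 1, 2, 0, 1])

try:
    jax.jit(count_valid_boolean_index)(y_true)
    jit_error = None
except IndexError as err:  # jax.errors.NonConcreteBooleanIndexError
    jit_error = err

static_count = int(jax.jit(count_valid_static)(y_true))
print(jit_error)
print(static_count)
```

Outside jit the boolean-index version works, which may be why the problem only shows up during training.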
Warnings noticed in logs
W external/xla/xla/service/gpu/nvptx_compiler.cc:718] The NVIDIA driver's CUDA version is 12.3 which is older than the ptxas CUDA version (12.4.131). Because the driver is older than the ptxas version, XLA is disabling parallel compilation, which may slow down compilation. You should update your NVIDIA driver or use the NVIDIA-provided CUDA forward compatibility packages.
Version details:
- OS: Red Hat Enterprise Linux 9.3 (Plow)
- GPU: NVIDIA H100 PCIe
- CUDA Version: 12.3
- NVIDIA-SMI 545.23.08
- Driver Version: 545.23.08
- jax: 0.4.26
Comment From: fchollet
Thanks for the report. I just fixed it at HEAD.
Comment From: SuryanarayanaY
Hi @savindi-wijenayaka,
Could you please check and confirm whether we can mark this issue as resolved? Thanks!
Comment From: github-actions[bot]
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
Comment From: savindi-wijenayaka
Hi @fchollet and @SuryanarayanaY,
I installed keras from the merge commit after the fix (pip install git+https://github.com/keras-team/keras.git@fed28a7357e13aeb955f891747a1f9b26d5bc581) and ran the above code. No errors were thrown. However, there is a recurring warning:
'+ptx84' is not a recognized feature for this target (ignoring feature)
Comment From: sonali-kumari1
Hi @savindi-wijenayaka -
Are you still able to reproduce this issue? Could you please share a simple standalone code snippet that reproduces it? Thanks!
Comment From: github-actions[bot]
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
Comment From: github-actions[bot]
This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.