EDIT: This was a two-part issue. The original description remains, but the title has been changed. See issue #33245 for the second part of the issue, related to y-axis sorting.
Bug description
Observed on Superset 5.0.0; the deployment relies on a fork of Superset OSS version 5.0.0rc1. It's deployed on an EC2 instance using docker compose.
The Heatmap chart seems to sort the Y-axis through some unknown logic.
A similar issue was raised before for the Heatmap chart at https://github.com/apache/superset/issues/31318 and was stated to be fixed in PR https://github.com/apache/superset/pull/31752. Another similar issue, https://github.com/apache/superset/issues/32591, is still open, but its reporter is on a different Superset version. Regardless of whether the Y-axis holds string or numeric values, the sorting does not make sense: it is in neither alphabetical nor numeric order.
In addition, it renders all the 0 values as <NULL>.
Screenshots/recordings
In this screenshot:
- 0 values rendered as <NULL> on axes:
  - 0 in ret_col and ret_row is displayed as <NULL> on the X and Y axes.
  - The raw data clearly includes 0 (see Results in the bottom right corner), so it should not be shown as null.
- Y-axis sorting is inconsistent:
  - Even when “Sort Y Axis” is set to “Axis ascending”, the Y-axis values are not ordered numerically.
  - Y-axis order appears to depend on the order of values in the X-axis or underlying data groupings. This can be observed in the Results in the bottom right corner: ret_row is what is on the y-axis, and its order in the Results is the same as the order on the heatmap.
In this screenshot:
- When test_value is 0, it is not displayed on the plot. There is no cell with 0 in it, unlike the cells for test_value=1. Nothing shows up when hovering over it either.
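The cause of the 0-as-<NULL> rendering is not established in this report. As a hedged illustration only, one common way a legitimate 0 ends up labelled as null is a truthiness check in the label formatter. The sketch below is not Superset code: the names label_truthy, label_explicit, and NULL_LABEL are hypothetical, and it is written in Python purely to show the pattern.

```python
from typing import Union

# Hypothetical placeholder label, mirroring what the screenshots show.
NULL_LABEL = "<NULL>"

Value = Union[int, float, str, None]


def label_truthy(value: Value) -> str:
    """Buggy pattern: any falsy value (0, 0.0, "") is treated as missing."""
    return str(value) if value else NULL_LABEL


def label_explicit(value: Value) -> str:
    """Safer pattern: only an actual None is treated as missing."""
    return NULL_LABEL if value is None else str(value)


if __name__ == "__main__":
    for v in (0, 1, None):
        print(repr(v), "->", label_truthy(v), "|", label_explicit(v))
```

With the explicit check, 0 keeps its own label and only a real missing value is shown as <NULL>, which matches the behaviour expected in this report.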
Superset version
master / latest-dev
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
Hey @fzhengTT! I'm Dosu, a bot here to help you with bugs, answer questions, and guide you on your journey to becoming a contributor. I'm ready to assist while you wait for a human maintainer. Let's tackle this together!
A similar issue was previously reported in GitHub issue #31318, which discussed problems with the Y-axis sorting logic and rendering of zero values in the heatmap chart. This issue was closed with a fix provided in PR #31752 [1].
For the Y-axis sorting, the logic is defined by the sort_y_axis control, which allows sorting either alphabetically or based on metric values in ascending or descending order [2]. If the sorting is not behaving as expected, it might be worth checking if the correct sorting option is selected.
Regarding the rendering of zero values, the heatmap chart handles null or empty values based on the considerMissingDataAsZero option. If this option is set to true, missing data points are treated as zero values, which might affect how zero values are rendered [3].
If the issue persists despite these configurations, it might be related to the specific fork or version you are using.
Comment From: rusackas
I haven't had a chance to try to replicate this, but if you are able to do so with existing datasets/examples, that might help us out, so we have easier repro steps.
Just noting that you have some Y axis sort controls. I see you're using Axis Ascending (the logic of which I haven't looked into), but I'm curious if Metric works better for you.
Also noting that as per the docs, it is inadvisable to deploy using docker compose.
Comment From: sfirke
This would be better as two issues, the multi-part nature makes discussion messy. That said I am seeing the 0 rendered as NULL in 4.1.2. I did not see that with the same chart in 4.0.2 so it's a regression during that time. Here's a 4.1.2 chart:
Comment From: sfirke
I submitted a fix for the
Comment From: michael-s-molina
@sfirke Thanks for submitting a fix for the zeros.
Y-axis sorting is inconsistent:
Even when “Sort Y Axis” is set to “Axis ascending”, the Y-axis values are not ordered numerically. Y-axis order appears to depend on the order of values in the X-axis or underlying data groupings, this can be observed in the Results on the bottom right corner: ret_row is what on the y-axis, and it's order in the Results is the same as the order on the heatmap.
@fzhengTT If you have both x-axis and y-axis sorting set, x-axis sorting will be applied first and that's why the y-axis might not follow an order. If you want to sort by the y-axis only, you need to clear the x-axis sorting.
Comment From: sfirke
@michael-s-molina prior to v 4.1 you could have both axes sort in order. I have a heatmap where one axis is hours of the day from 0 to 23 and the other is days of the week from 1 to 7. It only works as a visual if they are both sorted in order.
Comment From: michael-s-molina
@sfirke The new Heatmap chart uses server-side ordering which is more correct given the limit restrictions that might be imposed by the query. The order controls are translated to ORDER BY clauses in the generated SQL. So, if you set X and Y, you will have something like ORDER BY X ASC, Y ASC. The previous fix gave the flexibility to order by just one axis.
If this logic does not work for a specific use case, we might need additional controls and a contribution is welcome.
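To make the difference concrete, here is a minimal sketch of why ORDER BY X ASC, Y ASC does not give an ascending Y axis on its own. It assumes, as the original report suggests, that axis categories are taken in the order they first appear in the result rows; that is an assumption, not a confirmed description of the chart's code. The function names and the sample rows are hypothetical.

```python
from typing import Dict, List, Tuple

# One result row: (x, y, metric). Sample shape is illustrative only.
Row = Tuple[int, int, float]


def order_by_x_then_y(rows: List[Row]) -> List[Row]:
    """Emulate ORDER BY X ASC, Y ASC: Y is only ordered within each X."""
    return sorted(rows, key=lambda r: (r[0], r[1]))


def first_seen_categories(rows: List[Row], idx: int) -> List[int]:
    """Axis categories in order of first appearance in the rows."""
    seen: Dict[int, None] = {}
    for row in rows:
        seen.setdefault(row[idx], None)
    return list(seen)


def sorted_categories(rows: List[Row], idx: int) -> List[int]:
    """Axis categories sorted independently of the other axis."""
    return sorted({row[idx] for row in rows})


if __name__ == "__main__":
    rows: List[Row] = [(2, 2, 0.5), (1, 3, 1.0), (2, 1, 0.0)]
    ordered = order_by_x_then_y(rows)
    print("X axis:", first_seen_categories(ordered, 0))
    print("Y axis (first seen):", first_seen_categories(ordered, 1))
    print("Y axis (independent):", sorted_categories(ordered, 1))
```

In this sample the smallest X value only occurs with the largest Y value, so the first-seen Y order starts with 3 even though every row is correctly ordered by (X, Y). Sorting each axis's distinct values separately is what "independent" sorting would mean here.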
Comment From: fzhengTT
@sfirke Thanks for submitting a fix for the zeros.
Y-axis sorting is inconsistent:
Even when “Sort Y Axis” is set to “Axis ascending”, the Y-axis values are not ordered numerically. Y-axis order appears to depend on the order of values in the X-axis or underlying data groupings, this can be observed in the Results on the bottom right corner: ret_row is what on the y-axis, and it's order in the Results is the same as the order on the heatmap.
@fzhengTT If you have both x-axis and y-axis sorting set, x-axis sorting will be applied first and that's why the y-axis might not follow an order. If you want to sort by the y-axis only, you need to clear the x-axis sorting.
@michael-s-molina Hi, ORDER BY X ASC, Y ASC does not work for my specific use case. I need to sort by X ASC and Y ASC independently. I agree with @sfirke: I believe that used to be the case in version 4.0 or 4.1, before my team upgraded to 5.0.0rc. I hope this independent sorting feature can be implemented.
Comment From: michael-s-molina
@fzhengTT The new Heatmap was delivered a year ago in 4.1. It's not something new in 5.0.
Hope this independent sorting feature can be implemented.
Let's hope someone from the community has bandwidth to implement this feature. I would recommend separating the zero bug from the feature request so we can close the bug and open a discussion for the feature.
Comment From: sfirke
I agree that we need to separate these issues. I will close this one, I have redone the title and put in a quick edit in the first post. I will make a new issue and link back here for the y-axis sorting.
Comment From: sfirke
I opened the new issue here: https://github.com/apache/superset/issues/33245 I noted that it is looking for a community member to fix it.
Comment From: michael-s-molina
Thank you @sfirke!
Comment From: fzhengTT
Thank you @sfirke