Bug description
Summary
There is an issue with filtering numeric values in Superset when values have different numbers of digits after the decimal point.
Steps to Reproduce
- Add a filter for a numeric field that contains both integer and decimal values (e.g., 21, 21.8, 25.35, etc.).
- First select an integer value (for example, 21) in the filter, and then fractional values.
- The query returns only the rows where the value is exactly 21 (i.e., without any decimal places), excluding 21.8, 25.35, etc.
- Now, if you select a decimal value first (e.g., 21.8 or 25.35), the filtering works as expected, returning all matching rows regardless of decimal precision.
Expected Behavior
Filtering by numeric value should match all equivalent numeric representations (e.g., 21, 21.8, 25.35) regardless of how many decimal places are used.
Actual Behavior
Filtering is inconsistent depending on whether an integer or decimal value is selected first. Selecting an integer leads to an overly strict match that excludes decimal representations of the same value.
Databases Affected
This issue has been observed across all three databases tested:
- ClickHouse
- Impala
- PostgreSQL
Screenshots/recordings
No response
Superset version
master / latest-dev
Python version
3.9
Node version
I don't know
Browser
Not applicable
Additional context
No response
Checklist
- [ ] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [ ] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
The issue you're experiencing with numeric filtering in Superset, where selecting an integer value first results in an overly strict match that excludes decimal representations, might be related to how Superset handles numeric values and filters.
Superset has a NumericalRangeFilter
component that manages numeric range inputs and ensures that the minimum value is strictly less than the maximum value [1]. Additionally, the getRangeExtraFormData
function constructs filter clauses for numeric ranges, adding conditions for lower and upper bounds [2]. However, there might be a need to review how these components interact with different database backends like ClickHouse, Impala, and PostgreSQL, especially regarding decimal precision.
Furthermore, the handle_single_value
function in superset/models/helpers.py
checks if the value is a float or int and casts the string representation of a numeric value to a number for backward compatibility [3]. This might be relevant to your issue, as it suggests that numeric values are explicitly handled or transformed when applying filters.
Given the behavior you've described, it might be worth investigating whether the numeric filtering logic in Superset is correctly handling decimal precision across different database backends. You may want to explore the code related to numeric filtering and see if adjustments are needed to ensure consistent behavior regardless of the initial value selected.
To continue talking to Dosu, mention @dosu.
Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other
Comment From: geido
Feels like a good first issue to me
Comment From: suraj-mandal
Can I work on this issue? I am working superset and would like to contribute to the product
Comment From: rusackas
@suraj-mandal go for it! I'll happily assign it to you if you want to call dibs on it :D Thank you for contributing!
Comment From: suraj-mandal
Thanks, getting right into it!
Comment From: Sayan199
Hi @suraj-mandal, Can you please include me as well? I would love to collaborate with you on this issue.
Comment From: suraj-mandal
Sure
Comment From: suraj-mandal
@dosu, can you tell me how to reproduce this issue?
Comment From: dosubot[bot]
To reproduce the issue with numeric filtering in Superset, follow these steps:
- Set Up a Filter:
-
Add a filter for a numeric field that contains both integer and decimal values (e.g., 21, 21.8, 25.35).
-
Select an Integer Value First:
- In the filter, first select an integer value (e.g., 21).
-
Observe that the query returns only the rows where the value is exactly 21, excluding decimal values like 21.8 and 25.35.
-
Select a Decimal Value First:
- Now, select a decimal value first (e.g., 21.8 or 25.35).
- Observe that the filtering works as expected, returning all matching rows regardless of decimal precision.
This issue has been observed across multiple databases, including ClickHouse, Impala, and PostgreSQL. The expected behavior is that filtering by numeric value should match all equivalent numeric representations, regardless of the number of decimal places.
To continue talking to Dosu, mention @dosu.
Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other
Comment From: suraj-mandal
@dosu can you tell me the dataset that has been used here?
Comment From: suraj-mandal
@Sayan199 , I tested your file in my local development server, and there are some formatting issues. Moreover, I tested it on my server, but I can see that sometimes inconsistencies are coming. Do check once!
Comment From: suraj-mandal
@rusackas , any update on this issue?
Comment From: Sayan199
@Sayan199 , I tested your file in my local development server, and there are some formatting issues. Moreover, I tested it on my server, but I can see that sometimes inconsistencies are coming. Do check once!
Hi @suraj-mandal I believe there was an indent issue. I checked in my local, and it was working fine. Not sure why it broke in the pull request. But I saw your pull request. I tested it, and looks like it works.
Comment From: rusackas
Seems there are two PRs that might close this one out... hopefully we can get them across the finish line.