Bug description

Summary

There is an issue with filtering numeric values in Superset when values have different numbers of digits after the decimal point.

Steps to Reproduce

  1. Add a filter for a numeric field that contains both integer and decimal values (e.g., 21, 21.8, 25.35, etc.).

Image

  1. First select an integer value (for example, 21) in the filter, and then fractional values.

Image

  1. The query returns only the rows where the value is exactly 21 (i.e., without any decimal places), excluding 21.8, 25.35, etc.

Image

Image

  1. Now, if you select a decimal value first (e.g., 21.8 or 25.35), the filtering works as expected, returning all matching rows regardless of decimal precision.

Image

Image

Expected Behavior

Filtering by numeric value should match all equivalent numeric representations (e.g., 21, 21.8, 25.35) regardless of how many decimal places are used.

Actual Behavior

Filtering is inconsistent depending on whether an integer or decimal value is selected first. Selecting an integer leads to an overly strict match that excludes decimal representations of the same value.

Databases Affected

This issue has been observed across all three databases tested:

  • ClickHouse
  • Impala
  • PostgreSQL

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.9

Node version

I don't know

Browser

Not applicable

Additional context

No response

Checklist

  • [ ] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [ ] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

The issue you're experiencing with numeric filtering in Superset, where selecting an integer value first results in an overly strict match that excludes decimal representations, might be related to how Superset handles numeric values and filters.

Superset has a NumericalRangeFilter component that manages numeric range inputs and ensures that the minimum value is strictly less than the maximum value [1]. Additionally, the getRangeExtraFormData function constructs filter clauses for numeric ranges, adding conditions for lower and upper bounds [2]. However, there might be a need to review how these components interact with different database backends like ClickHouse, Impala, and PostgreSQL, especially regarding decimal precision.

Furthermore, the handle_single_value function in superset/models/helpers.py checks if the value is a float or int and casts the string representation of a numeric value to a number for backward compatibility [3]. This might be relevant to your issue, as it suggests that numeric values are explicitly handled or transformed when applying filters.

Given the behavior you've described, it might be worth investigating whether the numeric filtering logic in Superset is correctly handling decimal precision across different database backends. You may want to explore the code related to numeric filtering and see if adjustments are needed to ensure consistent behavior regardless of the initial value selected.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

Comment From: geido

Feels like a good first issue to me

Comment From: suraj-mandal

Can I work on this issue? I am working superset and would like to contribute to the product

Comment From: rusackas

@suraj-mandal go for it! I'll happily assign it to you if you want to call dibs on it :D Thank you for contributing!

Comment From: suraj-mandal

Thanks, getting right into it!

Comment From: Sayan199

Hi @suraj-mandal, Can you please include me as well? I would love to collaborate with you on this issue.

Comment From: suraj-mandal

Sure

Comment From: suraj-mandal

@dosu, can you tell me how to reproduce this issue?

Comment From: dosubot[bot]

To reproduce the issue with numeric filtering in Superset, follow these steps:

  1. Set Up a Filter:
  2. Add a filter for a numeric field that contains both integer and decimal values (e.g., 21, 21.8, 25.35).

  3. Select an Integer Value First:

  4. In the filter, first select an integer value (e.g., 21).
  5. Observe that the query returns only the rows where the value is exactly 21, excluding decimal values like 21.8 and 25.35.

  6. Select a Decimal Value First:

  7. Now, select a decimal value first (e.g., 21.8 or 25.35).
  8. Observe that the filtering works as expected, returning all matching rows regardless of decimal precision.

This issue has been observed across multiple databases, including ClickHouse, Impala, and PostgreSQL. The expected behavior is that filtering by numeric value should match all equivalent numeric representations, regardless of the number of decimal places.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

Comment From: suraj-mandal

@dosu can you tell me the dataset that has been used here?

Comment From: suraj-mandal

@Sayan199 , I tested your file in my local development server, and there are some formatting issues. Moreover, I tested it on my server, but I can see that sometimes inconsistencies are coming. Do check once!

Comment From: suraj-mandal

@rusackas , any update on this issue?

Comment From: Sayan199

@Sayan199 , I tested your file in my local development server, and there are some formatting issues. Moreover, I tested it on my server, but I can see that sometimes inconsistencies are coming. Do check once!

Hi @suraj-mandal I believe there was an indent issue. I checked in my local, and it was working fine. Not sure why it broke in the pull request. But I saw your pull request. I tested it, and looks like it works.

Comment From: rusackas

Seems there are two PRs that might close this one out... hopefully we can get them across the finish line.