Bug description

When attempting to use a calculated column that uses jinja templating as the metric in a bar chart, the jinja placeholders are included in the ORDER BY clause of the executed query, rather than being replaced with rendered SQL string. The template renders correctly in the SELECT statement of the same query.

In this example, the calculated column is defined as

IIF(MeasureValue < 
    {% if filter_values('fresh_min_bcs')|length %} 
        {{ filter_values('fresh_min_bcs')[0] }} 
    {% else %} 
        2.75 
    {% endif %},
    1,
    0
)

The executed query (trimmed for brevity) is

WITH cteMeasure AS (SELECT ... FROM ...),
     __cte AS (SELECT ... )
SELECT TOP 10000 rpt.DateTrunc('QUARTER' , MeasureDate) AS [MeasureDate_Group],
           IIF(SortNum >= 3 , '3' , CAST(SortNum AS VARCHAR(2))) AS [SortGroup],
-- The template renders correctly on the following line
           sum(IIF(MeasureValue < 2.75, 1, 0)) AS [COUNT(Is Low BCS)]
FROM __cte
WHERE ...
GROUP BY ...
-- The template does not render correctly in the ORDER BY clause on the following line
ORDER BY sum(IIF(MeasureValue < {% if filter_values('fresh_min_bcs')|length %} {{ filter_values('fresh_min_bcs')[0] }} {% else %} 2.75 {% endif %}, 1, 0)) DESC;

How to reproduce the bug

  1. Create a dataset with a calcuated field that uses jinja templating
  2. Create a Bar Chart based on the above dataset
  3. Add the calculated column as a METRIC; the aggregration function does not matter
  4. Click UPDATE CHART
  5. An "Unexpected error" will be returned, citing "Incorrect syntax near '%'"; clicking on ... and "View query" will show the template text rather than rendered text in the ORDER BY clause of the executed query

Screenshots/recordings

No response

Superset version

4.0.1

Python version

3.10

Node version

Not applicable

Browser

Chrome

Additional context

  • Superset is installed via pip, in a virtual environment
  • ENABLE_TEMPLATE_PROCESSING feature flag is enabled
  • The dataset is a virtual dataset from a query against MS SQL Server 2019: Microsoft SQL Server 2019 (RTM-CU27) (KB5037331) - 15.0.4375.4 (X64) Jun 10 2024 18:00:07 Copyright (C) 2019 Microsoft Corporation Standard Edition (64-bit) on Windows Server 2019 Standard 10.0 <X64> (Build 17763: ) (Hypervisor)

Checklist

  • [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

Hey there, @lilotter45! I'm here to help you with any issues you're facing. I can assist with bug fixing, answering questions, and even guide you on becoming a contributor. Let me know how I can assist you!

To resolve the issue where Jinja placeholders in a calculated column are not being replaced with the rendered SQL string in the ORDER BY clause of a query in Superset, you can ensure that the calculated column is properly defined and referenced in the query object. The provided test case test_calculated_column_in_order_by demonstrates how to define a calculated column and use it in the ORDER BY clause.

Here is the relevant part of the test case:

@mock.patch("superset.models.core.Database.db_engine_spec", BigQueryEngineSpec)
@mock.patch("sqlalchemy_bigquery._helpers.create_bigquery_client", mock.Mock)
@pytest.mark.usefixtures("load_birth_names_dashboard_with_slices")
def test_calculated_column_in_order_by(self):
    table = self.get_table(name="birth_names")
    TableColumn(
        column_name="gender_cc",
        type="VARCHAR(255)",
        table=table,
        expression="""
        case
          when gender='boy' then 'male'
          else 'female'
        end
        """,
    )

    table.database.sqlalchemy_uri = "bigquery://"
    query_obj = {
        "groupby": ["gender_cc"],
        "is_timeseries": False,
        "filter": [],
        "orderby": [["gender_cc", True]],
    }
    sql = table.get_query_str(query_obj)
    assert "ORDER BY\n  `gender_cc` ASC" in sql

In this test case: 1. A calculated column gender_cc is defined with a SQL expression. 2. The query_obj includes the groupby and orderby clauses referencing the calculated column gender_cc. 3. The generated SQL string is checked to ensure that the ORDER BY clause correctly includes the calculated column [1].

Additionally, ensure that the SQL string is properly rendered with the Jinja template engine. The get_rendered_sql method in superset/models/helpers.py is responsible for rendering the SQL with the template engine:

def get_rendered_sql(
    self,
    template_processor: Optional[BaseTemplateProcessor] = None,
) -> str:
    """
    Render sql with template engine (Jinja).
    """
    sql = self.sql.strip("\t\r\n; ")
    if template_processor:
        try:
            sql = template_processor.process_template(sql)
        except TemplateError as ex:
            raise QueryObjectValidationError(
                _(
                    "Error while rendering virtual dataset query: %(msg)s",
                    msg=ex.message,
                )
            ) from ex

    script = SQLScript(sql, engine=self.db_engine_spec.engine)
    if len(script.statements) > 1:
        raise QueryObjectValidationError(
            _("Virtual dataset query cannot consist of multiple statements")
        )

    if not sql:
        raise QueryObjectValidationError(_("Virtual dataset query cannot be empty"))
    return sql

Moreover, the make_orderby_compatible method in superset/connectors/sqla/models.py ensures that aliases for selected columns are not used in the ORDER BY clause if the database does not support it:

def make_orderby_compatible(
    self, select_exprs: list[ColumnElement], orderby_exprs: list[ColumnElement]
) -> None:
    """
    If needed, make sure aliases for selected columns are not used in
    `ORDER BY`.

    In some databases (e.g. Presto), `ORDER BY` clause is not able to
    automatically pick the source column if a `SELECT` clause alias is named
    the same as a source column. In this case, we update the SELECT alias to
    another name to avoid the conflict.
    """
    if self.db_engine_spec.allows_alias_to_source_column:
        return

    def is_alias_used_in_orderby(col: ColumnElement) -> bool:
        if not isinstance(col, Label):
            return False
        regexp = re.compile(f"\\(.*\\b{re.escape(col.name)}\\b.*\\)", re.IGNORECASE)
        return any(regexp.search(str(x)) for x in orderby_exprs)

    for col in select_exprs:
        if is_alias_used_in_orderby(col):
            col.name = f"{col.name}__"

By following this approach and ensuring the SQL string is properly rendered and compatible with the database's requirements, you can ensure that Jinja placeholders in calculated columns are properly replaced with the rendered SQL string in the ORDER BY clause [2][3].

To continue talking to Dosu, mention @dosu.

Comment From: rusackas

Are you still facing this issue? It's been quiet for a long time, and may be a better fit for a Help/Q&A Discussion.

Comment From: lilotter45

Yes, I am still facing this issue. I actually just came across it via a web search while facing the same issue in another dataset.

I'm not clear why this would be a better fit for a discussion. Should the jinja template not be rendered consistently between the SELECT, WHERE, GROUP BY, and ORDER BY clauses?

Another example:

CASE
  WHEN '{% if filter_values('arrival_date_field')|length %}{{ filter_values('arrival_date_field')[0] }}{% else %}Birth Date{% endif %}' = 'Arrival Date' THEN 
    arrival_date 
  ELSE 
    birth_date 
  END

evaluates to 'Birth Date' in the SELECT and WHERE clauses, but is interpreted as the raw string in the ORDER BY clause:

ORDER BY MAX(CASE
  WHEN '{% if filter_values('arrival_date_field')|length %}{{ filter_values('arrival_date_field')[0] }}{% else %}Birth Date{% endif %}' = 'Arrival Date' THEN 
    arrival_date 
  ELSE 
    birth_date 
  END) DESC

When manually replacing the jinja template in SQL Lab with the string it would evaluate to, the query executes successfully.

Comment From: rusackas

I wonder if @Vitor-Avila or @villebro have any input here, as they're MUCH more versed with the ins and outs of Jinja than I am. @dosu might be able to help here as well, better than it did a year ago :)

Comment From: dosubot[bot]

This is a nuanced issue with Superset's Jinja templating in calculated columns, especially in the ORDER BY clause. Based on recent code and PRs, Superset is designed to process Jinja templates for calculated columns and metrics by applying a template processor to SQL expressions before query execution, including those used in ORDER BY clauses (superset/connectors/sqla/models.py, superset/models/helpers.py).

However, inconsistencies can occur if the rendered (processed) expression isn't carried through all clause constructions. This is especially true for dynamic calculated columns referenced in ORDER BY, which may explain why you see raw Jinja in ORDER BY but not in SELECT or WHERE. Recent PRs, notably #30735 by Vitor-Avila, improved handling of calculated columns and metrics using Jinja, ensuring their SQL expressions are collected and processed for template rendering in all relevant query parts. Tests in these PRs assert that calculated columns are correctly included in ORDER BY, but the issue may persist in 4.0.1.

If you're able to upgrade, Superset 4.1.0 and later include fixes for extra cache keys and improved Jinja handling in calculated columns and metrics. If not, a practical workaround is to avoid using Jinja in calculated columns that are referenced in ORDER BY, or manually substitute the evaluated value as you did in SQL Lab.

If you need more technical details or want to discuss possible code-level solutions, recent changes and tests in #30735 and #30350 are good references for how this logic is evolving. If the issue persists after upgrading, it may be worth reviewing how calculated columns are defined and referenced in your dataset.

To reply, just mention @dosu.


How did I do? Good | Irrelevant | Incorrect | Verbose | Hallucination | Report 🐛 | Other  Chat with Dosu Join Discord Share on X