Bug description

We met a problem when building the Superset chart. It shows the Column Not Found problem when I select a date without any data. When I select a date with data, the error will not be shown. I have checked the following possible, but I still can not find out the problem. It seems a bug here. - I copy the SQL and run it in the Superset SQL Lab, and it runs fine. - I have synced the dataset columns.

How to reproduce the bug

  1. Create a virtual dataset. It includes the conditions by using the SQL template in its subquery. eg. dt_from, dt_to
  2. Create a chart using the dataset.
  3. Select the time range that the subquery could not get any matched data.
  4. Then, you will get the "Column Not Found" error.

Screenshots/recordings

Error Screen Apache Superset Get “Column Not Found” Error while no matched data from the virtual dataset

Error Log

Traceback (most recent call last):
  File "/app/superset/connectors/sqla/models.py", line 982, in adhoc_column_to_sqla
    raise SupersetGenericDBErrorException("Column not found")
superset.exceptions.SupersetGenericDBErrorException: Column not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/views/base_api.py", line 127, in wraps
    raise ex
  File "/app/superset/views/base_api.py", line 121, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/app/superset/utils/core.py", line 1526, in time_function
    response = func(*args, **kwargs)
  File "/app/superset/utils/log.py", line 255, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/charts/data/api.py", line 256, in data
    return self._get_data_response(
  File "/app/superset/charts/data/api.py", line 415, in _get_data_response
    result = command.run(force_cached=force_cached)
  File "/app/superset/charts/data/commands/get_data_command.py", line 45, in run
    payload = self._query_context.get_payload(
  File "/app/superset/common/query_context.py", line 100, in get_payload
    return self._processor.get_payload(cache_query_context, force_cached)
  File "/app/superset/common/query_context_processor.py", line 590, in get_payload
    query_results = [
  File "/app/superset/common/query_context_processor.py", line 591, in <listcomp>
    get_query_results(
  File "/app/superset/common/query_actions.py", line 225, in get_query_results
    return result_func(query_context, query_obj, force_cached)
  File "/app/superset/common/query_actions.py", line 103, in _get_full
    payload = query_context.get_df_payload(query_obj, force_cached=force_cached)
  File "/app/superset/common/query_context.py", line 121, in get_df_payload
    return self._processor.get_df_payload(
  File "/app/superset/common/query_context_processor.py", line 159, in get_df_payload
    query_result = self.get_query_result(query_obj)
  File "/app/superset/common/query_context_processor.py", line 237, in get_query_result
    result = query_context.datasource.query(query_object.to_dict())
  File "/app/superset/connectors/sqla/models.py", line 1153, in query
    query_str_ext = self.get_query_str_extended(query_obj)
  File "/app/superset/connectors/sqla/models.py", line 813, in get_query_str_extended
    sqlaq = self.get_sqla_query(**query_obj)
  File "/app/superset/models/helpers.py", line 1620, in get_sqla_query
    outer = self.adhoc_column_to_sqla(
  File "/app/superset/connectors/sqla/models.py", line 985, in adhoc_column_to_sqla
    raise ColumnNotFoundException(message=str(ex)) from ex
superset.exceptions.ColumnNotFoundException: Column not found

Superset version

3.0.2

Python version

3.9

Node version

16

Browser

Chrome

Additional context

Checklist

  • [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: vioao

 def adhoc_column_to_sqla(  # pylint: disable=too-many-locals
        self,
        col: AdhocColumn,
        force_type_check: bool = False,
        template_processor: BaseTemplateProcessor | None = None,
    ) -> ColumnElement:
     // .....
     col_desc = get_columns_description(self.database, sql)
                    if not col_desc:
                        raise SupersetGenericDBErrorException("Column not found")
     // .....


def get_columns_description(
    database: Database,
    query: str,
) -> list[ResultSetColumnType]:
    db_engine_spec = database.db_engine_spec
    try:
        with database.get_raw_connection() as conn:
            cursor = conn.cursor()
            query = database.apply_limit_to_sql(query, limit=1)
            cursor.execute(query)
            db_engine_spec.execute(cursor, query)
            // we could not get data if there is not any data for this query
            result = db_engine_spec.fetch_data(cursor, limit=1)
            result_set = SupersetResultSet(result, cursor.description, db_engine_spec)
            return result_set.columns
    except Exception as ex:
        raise SupersetGenericDBErrorException(message=str(ex)) from ex

From the code, it seems only if the query could fetch at least one record, then it could get the result_set.columns. Otherwise, you will get a Column not found exception.

Comment From: Sbhatangar

I am also facing the same issue. Apache Superset Get “Column Not Found” Error while no matched data from the virtual dataset When I replace the Record Type column which is a lookup to another table it works. This is the query I have used: Apache Superset Get “Column Not Found” Error while no matched data from the virtual dataset

Comment From: rusackas

We no longer support Superset 3.0, and this issue has gone stale. If it's still relevant in Superset 3.1 or 4.0, please let us know, otherwise this'll be closed before long. Meanwhile, perhaps @dosu-bot can help.

Comment From: dosubot[bot]

Hello, @rusackas! I'm here to assist you with any bugs, questions, or contributions you may have. I'm Dosu's bot, and I'm ready to help while you wait for a human maintainer.

The "Column Not Found" error you're encountering when no data matches the date range in your virtual dataset is likely due to how the adhoc_column_to_sqla and get_columns_description functions handle queries that return no data.

Key Points:

  1. adhoc_column_to_sqla Function:
  2. This function converts an adhoc column definition into a SQLAlchemy column.
  3. If the column is not found in the metadata, it attempts to probe the column type by executing a limited query.
  4. If the query returns no data, it raises a SupersetGenericDBErrorException with the message "Column not found" [1].

  5. get_columns_description Function:

  6. This function fetches column descriptions by executing a limited query.
  7. If the query returns no data, it still attempts to fetch the column metadata from the cursor's description.
  8. If it fails to fetch the metadata, it raises a SupersetGenericDBErrorException [2].

Potential Solutions:

  1. Ensure Column Exists in All Scenarios:
  2. Modify your SQL template to ensure that the column exists even when no data matches the date range. For example, you can use a COALESCE function or a default value to ensure the column is always present.

  3. Handle Empty Results Gracefully:

  4. Update the get_columns_description function to handle scenarios where no data is returned more gracefully. For example, you can modify the function to return an empty list of columns instead of raising an exception.

  5. Check Database Engine Specifications:

  6. Ensure that your database engine supports fetching metadata for queries that return no data. Some database engines might have specific configurations or limitations that affect this behavior [3][4].

Example Modification:

You can modify the get_columns_description function to handle empty results more gracefully:

def get_columns_description(
    database: Database,
    query: str,
) -> list[ResultSetColumnType]:
    db_engine_spec = database.db_engine_spec
    try:
        with database.get_raw_connection() as conn:
            cursor = conn.cursor()
            query = database.apply_limit_to_sql(query, limit=1)
            cursor.execute(query)
            db_engine_spec.execute(cursor, query)
            result = db_engine_spec.fetch_data(cursor, limit=1)
            if not result:
                return []  # Return an empty list if no data is returned
            result_set = SupersetResultSet(result, cursor.description, db_engine_spec)
            return result_set.columns
    except Exception as ex:
        raise SupersetGenericDBErrorException(message=str(ex)) from ex

This modification ensures that the function returns an empty list of columns if no data is returned, preventing the "Column not found" error.

By implementing these changes, you should be able to avoid the "Column Not Found" error when no data matches the date range in your virtual dataset.

To continue talking to Dosu, mention @dosu.

Comment From: vioao

Same problem in 4.0.2

Comment From: rusackas

I think we need some sort of example dataset and some reproducible steps, so we can try to replicate/troubleshoot this, otherwise it may continue to be stale/inactive and be at risk of closing. Meanwhile, I assume it's still an issue in 4.1.2, or have we gotten lucky?

Comment From: rusackas

OK, so this one still hasn't been corroborated for 4.1.2, or 5.0.0 for that matter, and has essentially been inactive for upward of a year without a reproducible test case, so I'm going to go ahead and deem this one "not planned."

That having been said, I'm happy to reopen it if there's a way we can reproduce it reliably in 5.0! Thanks, y'all :D