Bug description
Observed:
When a physical dataset is created and subsequently used in a chart, the following error is raised when trying to render the chart:
```
SQL compilation error: Object '"my_db.my_schema.my_table"' does not exist or not authorized.
```
Expected:
Chart renders.
Notes:
Connection string, privileges, and table/schema/database names are all set up correctly. Apparently, this is caused by the table reference being incorrectly quoted as a single identifier in the query generated for a chart based on a physical dataset:
```sql
SELECT
  DATE_TRUNC('DAY', "ROW_INSERTED_TIME") AS "ROW_INSERTED_TIME",
  COUNT(*) AS "count"
FROM "my_db.my_schema.my_table"
GROUP BY
  DATE_TRUNC('DAY', "ROW_INSERTED_TIME")
ORDER BY
  "count" DESC
LIMIT 10000
```
If I create a virtual dataset on that exact Snowflake table, with the exact same database connection, it works as expected (since SQL Lab allows me to specify the table reference without any quotes): the chart renders, and the generated query looks like this:
```sql
SELECT
  DATE_TRUNC('DAY', "ROW_INSERTED_TIME") AS "ROW_INSERTED_TIME",
  COUNT(*) AS "count"
FROM (
  SELECT
    *
  FROM my_db.my_schema.my_table
) AS virtual_table
GROUP BY
  DATE_TRUNC('DAY', "ROW_INSERTED_TIME")
ORDER BY
  "count" DESC
LIMIT 10000
```
Screenshots/recordings
No response
Superset version
master / latest-dev
Python version
3.11
Node version
Not applicable
Browser
Chrome
Additional context
Superset 6.0.0rc1; snowflake-sqlalchemy==1.7.6 and 1.7.7 (latest; tested with both versions, same results).
The only somewhat similar issue I found is https://github.com/apache/superset/issues/34996 (a different engine, Doris, whereas this is on Snowflake), hinting that this might be a bug in Superset or one of its dependencies that is not specific to a single engine spec.
Also, a coworker reports that an older version of Superset with Snowflake showed no issues with physical datasets, so this may be a regression introduced in 6.0.0.
```
WARNING:superset.connectors.sqla.models:Query SELECT DATE_TRUNC('DAY', "ROW_INSERTED_TIME") AS "ROW_INSERTED_TIME", COUNT(*) AS "count"
FROM "my_db.my_schema.my_table" GROUP BY DATE_TRUNC('DAY', "ROW_INSERTED_TIME") ORDER BY "count" DESC
LIMIT 10000 on schema my_schema failed
Traceback (most recent call last):
  File "/app/superset/connectors/sqla/models.py", line 1593, in query
    df = self.database.get_df(
         ^^^^^^^^^^^^^^^^^^^^^
  File "/app/superset/models/core.py", line 705, in get_df
    self.db_engine_spec.execute(cursor, sql_, self)
  File "/app/superset/db_engine_specs/base.py", line 1853, in execute
    raise cls.get_dbapi_mapped_exception(ex) from ex
  File "/app/superset/db_engine_specs/base.py", line 1849, in execute
    cursor.execute(query)
  File "/app/.venv/lib/python3.11/site-packages/snowflake/connector/cursor.py", line 1134, in execute
    Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
  File "/app/.venv/lib/python3.11/site-packages/snowflake/connector/errors.py", line 286, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/snowflake/connector/errors.py", line 341, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/app/.venv/lib/python3.11/site-packages/snowflake/connector/errors.py", line 217, in default_errorhandler
    raise error_class(
snowflake.connector.errors.ProgrammingError: 002003 (42S02): 01bf0031-0811-88a3-0002-21017995f1db: SQL compilation error:
Object '"my_db.my_schema.my_table"' does not exist or not authorized.
```
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
Hi @donicek! I'm Dosu and I'm helping the superset team.
This issue happens when the physical dataset's `table_name` field contains the full identifier (like `my_db.my_schema.my_table`) instead of just the table part (like `my_table`). Superset expects `catalog`, `schema`, and `table_name` to be stored separately so it can quote each part correctly for Snowflake. If the full identifier is stored in `table_name`, Superset will quote the whole thing as a single identifier, which Snowflake can't resolve [details and workaround].
To fix this, check your dataset metadata (via YAML export, API, or database) and make sure:
- `catalog` is set to your database name (e.g., `my_db`)
- `schema` is set to your schema (e.g., `my_schema`)
- `table_name` is just the table (e.g., `my_table`)
If you see the full identifier in `table_name`, update it to only the table name and re-import the dataset or patch it via the API. The Superset UI does not expose the catalog field for all databases, so you may need to use the REST API or YAML import/export to correct it.
This is not a config or connection issue, but a metadata integrity problem. Virtual datasets work because you control the SQL directly, bypassing this quoting logic. If your metadata is correct and you still see the issue, it could be a bug in the Snowflake SQLAlchemy dialect, but this is rare.
Let me know if you need a sample API payload or YAML snippet to update your dataset.
To reply, just mention @dosu.
Comment From: donicek
Hi @dosu,
my dataset metadata is stored correctly: `catalog` is `my_db`, `schema` is `my_schema`, and `table_name` is `my_table`. So this is not a metadata integrity problem.
As stated above, this seems to be a bug in Superset that is affecting multiple dialects.
Comment From: dosubot[bot]
Thanks for clarifying your metadata is correct. Based on a deep review of the Superset codebase, there are no recent changes in Superset 6.0.0rc1 that would explain this regression—Superset's logic for quoting catalog, schema, and table parts is correct and unchanged, and SnowflakeEngineSpec does not override this behavior. No merged pull requests or documentation address this issue, and no open issues are tracking it for Snowflake or other dialects.
Given that, this bug is likely caused by an external dependency (such as a change in SQLAlchemy or its Snowflake dialect) or a subtle edge case in how identifiers are handled during query generation, not a core Superset regression. If your metadata is correct and you still see the full identifier quoted as a single string, it may be worth checking for recent upgrades or changes in your SQLAlchemy or Snowflake SQLAlchemy versions, or testing with a different version of those dependencies to isolate the cause.
If you need to escalate or track this, there isn't an official Superset issue for this regression yet, but it does appear to affect multiple dialects and is not engine-specific.
Comment From: donicek
@dosu Actually, there were quite recent changes made to Superset that may be causing this issue: https://github.com/apache/superset/blob/master/superset/connectors/sqla/models.py#L1404-L1412
Both Doris and Snowflake are among the engines that have `supports_cross_catalog_queries` set to `True`.
Comment From: dosubot[bot]
You're right—recent changes in Superset (notably in PR #34360) modified how physical dataset identifiers are constructed when supports_cross_catalog_queries is True for engines like Snowflake and Doris. Now, Superset builds the full identifier as a single string (e.g., catalog.schema.table) and passes it to SQLAlchemy's table() function, which causes the entire identifier to be quoted as one token in the generated SQL. This leads to errors like FROM "my_db.my_schema.my_table", which Snowflake and Doris can't resolve because they expect each part to be quoted separately. This regression was introduced to support cross-catalog queries but has this side effect for these engines. If your metadata is already correct, there isn't a user-level workaround—this will likely need a patch in Superset or the relevant SQLAlchemy dialect to split and quote identifiers properly for these engines. More details and discussion are in PR #34360 and related Doris/Snowflake issues.
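The single-token quoting described above can be reproduced with plain SQLAlchemy, independent of Superset (a minimal sketch using the default dialect; the table and column names are taken from the queries in this report):

```python
from sqlalchemy import column, select, table

# Passing the dotted identifier as one name: because "." is not a legal
# identifier character, SQLAlchemy quotes the whole string as a single
# token, which Snowflake cannot resolve.
broken = select(column("x")).select_from(table("my_db.my_schema.my_table"))
print(broken)  # ... FROM "my_db.my_schema.my_table"

# Passing the schema and table separately: each part is rendered on its
# own, producing a reference the database can resolve.
ok = select(column("x")).select_from(table("my_table", schema="my_schema"))
print(ok)  # ... FROM my_schema.my_table
```

This matches the difference between the two generated queries above: the physical dataset path hands SQLAlchemy one dotted string, while the virtual dataset embeds the unquoted reference verbatim.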
Comment From: donicek
Confirming that if I disable support for cross catalog queries in the Snowflake engine, physical datasets start working as expected and the chart renders.
As a quick-and-dirty workaround, cross-catalog query support for a particular engine (Snowflake) can be disabled by putting something like the following snippet into `superset_config.py`:

```python
def snowflake_disable_quoting(_):
    print("*** Disabling Snowflake cross catalog queries...")
    from superset.db_engine_specs import snowflake

    snowflake.SnowflakeEngineSpec.supports_cross_catalog_queries = False


FLASK_APP_MUTATOR = snowflake_disable_quoting
```