Bug description

When attempting to create ClickHouse cluster tables (Distributed/Replicated engines) in Superset 5.0.rc3, the operation fails. Standard tables work correctly, but cluster tables throw errors during dataset creation.

Screenshots/recordings

CREATE TABLE analytics.moshi_test
-- ON CLUSTER cluster_1S_2R
(
    appid String,
    uuid String,
    time String,
    server_id Int64
)
ENGINE = ReplicatedMergeTree
ORDER BY (appid, uuid, time)

ClickHouse Error: clickhouse error: This may be triggered by: Issue 1002 - The database returned an unexpected error.

Superset version

master / latest-dev

Python version

3.10

Node version

18 or greater

Browser

Chrome

Additional context

No response

Checklist

  • [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [ ] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

This error happens because Superset's SQL parsing layer (using sqlglot) doesn't currently handle the ON CLUSTER clause in ClickHouse DDL. When you try to create a Distributed or Replicated table with ON CLUSTER, Superset encounters an unhandled sqlglot.expressions.OnCluster AST node, which leads to a generic error (issue code 1002) and blocks dataset creation. Standard tables work because they don't use this clause.

Recent updates to sqlglot and Superset's SQL parsing logic have not yet added support for this ClickHouse-specific syntax, so this is a current limitation of Superset's ClickHouse integration.

As a workaround, you can create the cluster table directly in ClickHouse (outside Superset), then point Superset to the existing table when creating the dataset. Superset should be able to read metadata from existing tables even if it can't parse the DDL. If you need Superset to parse and manage DDL that contains ON CLUSTER, Superset's SQL parsing and/or sqlglot would have to be changed to handle or ignore the OnCluster node gracefully.
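To illustrate what "ignoring" the clause could look like, here is a minimal sketch that strips ON CLUSTER from a DDL string before it reaches a parser. The helper name `strip_on_cluster` is hypothetical and is not part of Superset or sqlglot. It also assumes the clause never appears inside a string literal or comment. The stripped statement is only suitable for parsing or validation, since running it would create the table on a single node rather than across the cluster.

```python
import re

# Hypothetical helper for illustration only; not a Superset or sqlglot API.
# Matches "ON CLUSTER <name>", where <name> is a bare identifier, a
# backtick- or double-quoted identifier, or a single-quoted string.
_ON_CLUSTER = re.compile(
    r"""\s+ON\s+CLUSTER\s+(?:`[^`]+`|"[^"]+"|'[^']+'|\w+)""",
    re.IGNORECASE,
)


def strip_on_cluster(ddl: str) -> str:
    """Return the DDL text with any ON CLUSTER clause removed."""
    return _ON_CLUSTER.sub("", ddl)


ddl = (
    "CREATE TABLE analytics.moshi_test ON CLUSTER cluster_1S_2R "
    "(appid String, uuid String, time String, server_id Int64) "
    "ENGINE = ReplicatedMergeTree ORDER BY (appid, uuid, time)"
)

stripped = strip_on_cluster(ddl)
print(stripped)
```

A regular expression is used here only to keep the sketch dependency-free. A real fix would more likely be made in the parsing layer itself.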

There are no existing issues or discussions in the repo that address this specific problem. If you need to automate DDL creation via Superset, you may need to wait for upstream support or consider contributing a patch to handle ON CLUSTER in the parsing layer.


Comment From: zhangdong1015

This worked in version 4.0. Is this a new bug? @dosu