Bug description
Hi All
We are creating a superset dataset and creating a chart from the dataset, we are creating a dataset from the parquet file, using duckdb engine.
When our parquet file is empty (But columns exists) and we are creating the dataset using the empty parquet. We use this dataset to create a chart Once chart is created, chart will show the error saying that columns are missing. Because my parquet file is empty, superset is not able to read the column, if parquet/dataset records are zero.
Please help us how we can fix this issue in superset.
Please find the screenshot for referance
Screenshots/recordings
No response
Superset version
4.1.2
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [ ] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [ ] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
This happens because Superset relies on the database engine (in this case, DuckDB via SQLAlchemy) to provide column metadata. When your Parquet file is empty, DuckDB/SQLAlchemy may not return any columns, so Superset treats the dataset as having no columns, which leads to the error when building charts. This is a known limitation and has been discussed in the community—other users have run into similar issues when syncing columns for datasets that return no data [discussion].
A common workaround is to temporarily add a dummy row to your Parquet file or use a fake query that always returns at least one row, sync the columns in Superset, and then revert to your original (empty) dataset [issue]. Superset currently does not have built-in logic to infer or synthesize columns from an empty table; it simply exposes what the database engine reports [code reference].
If you need to automate this, you could script the addition and removal of a dummy row as part of your data pipeline. Otherwise, keep an eye on updates to DuckDB and Superset, as improvements in schema introspection for empty files may be added in the future.
To reply, just mention @dosu.
How did I do? Good | Irrelevant | Incorrect | Verbose | Hallucination | Report 🐛 | Other