Datasets cannot be updated when they point to different tables that have the same name.
How to reproduce the bug
- Go to Datasets
- Click on + Dataset
- Select a database
- Select a schema
- Select a table (e.g. "report")
- Click on Create Dataset and Create Chart
- Go to Datasets again
- Click on + Dataset
- Select a database
- Select a schema, different from step 4
- Select a table different from the one selected in step 5 but with the same name (e.g. "report")
- Click on Create Dataset and Create Chart (This step only works when the metastore database is not SQLite)
- Go to Datasets
- Click on Edit button (pencil icon) next to any dataset created in the previous steps
- Click on Save and then on Ok
Expected results
A success message is displayed: "The dataset has been saved".
Actual results
An error message is displayed: "Dataset {dataset name} already exists".
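A possible explanation, offered as a guess and not a confirmed diagnosis: the uniqueness check that runs on update may compare only the database and table name and ignore the schema. The sketch below illustrates that hypothesis in plain Python. `Dataset`, `is_unique_ignoring_schema` and `is_unique_with_schema` are hypothetical names made up for this illustration, not Superset code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    # Hypothetical, simplified stand-in for a dataset record.
    id: int
    database_id: int
    schema: Optional[str]
    table_name: str


def is_unique_ignoring_schema(existing: List[Dataset], candidate: Dataset) -> bool:
    """Assumed buggy check: another dataset in the same database with the
    same table name counts as a duplicate, whatever its schema."""
    return not any(
        d.id != candidate.id
        and d.database_id == candidate.database_id
        and d.table_name == candidate.table_name
        for d in existing
    )


def is_unique_with_schema(existing: List[Dataset], candidate: Dataset) -> bool:
    """Expected check: a duplicate must also live in the same schema."""
    return not any(
        d.id != candidate.id
        and d.database_id == candidate.database_id
        and d.schema == candidate.schema
        and d.table_name == candidate.table_name
        for d in existing
    )


# Two datasets named "report" in different schemas of the same database.
datasets = [
    Dataset(id=1, database_id=1, schema="schema_a", table_name="report"),
    Dataset(id=2, database_id=1, schema="schema_b", table_name="report"),
]
```

Under these assumptions, saving dataset 2 unchanged is rejected by the first check and accepted by the second, which matches the behaviour described above.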
Screenshots
Environment
- browser type and version: Google Chrome Version 114.0.5735.133 (Official Build) (arm64)
- superset version: Superset 0.0.0-dev
- python version: Python 3.9.16
- node.js version: (unsure about this, I'm using the docker image https://hub.docker.com/r/apache/superset)
- any feature flags active: None
Checklist
Make sure to follow these steps before submitting your issue - thank you!
- [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
Additional context
The database connection used is a Google BigQuery one.
curl request extracted from the browser network tab:
curl 'http://[REDACTED]/api/v1/dataset/2' \
-X 'PUT' \
-H 'Accept: application/json' \
-H 'Accept-Language: en-US,en;q=0.9' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Cookie: session=[REDACTED]' \
-H 'Origin: http://34.72.116.158' \
-H 'Referer: http://34.72.116.158/tablemodelview/list/?pageIndex=0&sortColumn=schema&sortOrder=asc' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36' \
-H 'X-CSRFToken: [REDACTED]' \
--data-raw '{
"table_name": "combined_report",
"database_id": 1,
"sql": null,
"filter_select_enabled": true,
"fetch_values_predicate": null,
"schema": "[REDACTED]",
"description": null,
"main_dttm_col": "date",
"offset": 0,
"default_endpoint": null,
"cache_timeout": null,
"is_sqllab_view": false,
"template_params": null,
"extra": null,
"is_managed_externally": false,
"metrics": [
{
"expression": "COUNT(*)",
"description": null,
"metric_name": "count",
"metric_type": "count",
"d3format": null,
"verbose_name": "COUNT(*)",
"warning_text": null,
"extra": "{}",
"id": 2
}
],
"columns": [
{
"id": 14,
"column_name": "account_id",
"type": "STRING",
"advanced_data_type": null,
"verbose_name": null,
"description": null,
"expression": null,
"filterable": true,
"groupby": true,
"is_active": true,
"is_dttm": false,
"python_date_format": null,
"uuid": "[REDACTED]",
"extra": "{}"
}
],
"owners": [
1
]
}
' \
--compressed \
--insecure
response extracted from the browser network tab:
422 UNPROCESSABLE ENTITY {"message":{"table_name":["Dataset combined_report already exists"]}}
Comment From: Zlata-Inc
Hi!
I get the same error when I try to save a dataset after syncing a new column.
Comment From: zowen-ch
I have the same issue as well.
Comment From: rusackas
@zowen-ch what version of Superset are you using? The rest of this thread might be using Superset 2.x, which isn't supported any more (we're on 3.1, and almost to 4.0)
Comment From: zowen-ch
Thank you for having a look @rusackas! Our instance is currently 3.0.0. If you think 3.1 would solve the issue, I will work to get it updated. But yes, I'm experiencing this error where repeated table names from different schemas are preventing column refreshes with the same error modal as above.
Comment From: rusackas
I don't currently have access to any databases with identical table names under different schemas. It would be appreciated if anyone can test this on a current 3.1.x or 4.x release, otherwise we might close it as stale. CC @sadpandajoe @michael-s-molina in case they have the means to test this.
Comment From: jelu
I can't even create two datasets where the table name is the same but the tables are in different databases:
sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlite3.IntegrityError) UNIQUE constraint failed: tables.table_name
[SQL: INSERT INTO tables (uuid, created_on, changed_on, description, default_endpoint, is_featured, filter_select_enabled, "offset", cache_timeout, params, perm, schema_perm, is_managed_externally, external_url, table_name, main_dttm_col, database_id, fetch_values_predicate, schema, sql, is_sqllab_view, template_params, extra, normalize_columns, always_filter_main_dttm, created_by_fk, changed_by_fk) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
$ pip list|grep apache-superset
apache-superset 4.0.1
P.S. I would really like to be able to do this, though!
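The `UNIQUE constraint failed: tables.table_name` message above can be reproduced in isolation with the standard-library `sqlite3` module. This is only a minimal sketch: the `tables` schema below is a heavily simplified, assumed stand-in for the real metastore table, and `try_insert_pair` is a hypothetical helper written for this illustration.

```python
import sqlite3


def try_insert_pair(unique_clause: str) -> bool:
    """Create a simplified `tables` table with the given UNIQUE clause and
    insert two rows that share a table name but differ in schema.
    Returns True if both inserts succeed, False on an IntegrityError."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tables ("
            "id INTEGER PRIMARY KEY, "
            "database_id INTEGER, "
            '"schema" TEXT, '
            "table_name TEXT, "
            f"{unique_clause})"
        )
        insert = (
            'INSERT INTO tables (database_id, "schema", table_name) '
            "VALUES (?, ?, ?)"
        )
        conn.execute(insert, (1, "sales", "report"))
        conn.execute(insert, (1, "finance", "report"))
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


# Constraint on the name alone: the second insert is rejected.
name_only_ok = try_insert_pair("UNIQUE (table_name)")
# Constraint scoped to database, schema and name: both inserts pass.
scoped_ok = try_insert_pair('UNIQUE (database_id, "schema", table_name)')
```

With a constraint on `table_name` alone the second insert fails, while a constraint scoped to database, schema and name accepts both rows.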
Comment From: pareis
I'm still having the problem in 4.0.2
Comment From: rusackas
This has been silent for quite a while now. Is it still an issue in 4.1.2/5.0.0?
Comment From: rusackas
We haven't heard back if this is still an issue in current releases. Let us know, or it might be closed as inactive/not-planned. Thanks!