Datasets cannot be updated when they point to different tables that have the same name.

How to reproduce the bug

  1. Go to Datasets
  2. Click on + Dataset
  3. Select a database
  4. Select a schema
  5. Select a table (e.g. "report")
  6. Click on Create Dataset and Create Chart
  7. Go to Datasets again
  8. Click on + Dataset
  9. Select a database
  10. Select a schema, different from step 4
  11. Select a different table from the one selected in step 5, but with the same name (e.g. "report")
  12. Click on Create Dataset and Create Chart (This step only works when the metastore database is not SQLite)
  13. Go to Datasets
  14. Click on the Edit button (pencil icon) next to any dataset created in the previous steps
  15. Click on Save and then on Ok

Expected results

A success message is displayed: "The dataset has been saved".

Actual results

An error message is displayed: "Dataset {dataset name} already exists".

Screenshots

[Screenshot: Apache Superset can't update datasets with duplicated names]

Environment

  • browser type and version: Google Chrome Version 114.0.5735.133 (Official Build) (arm64)
  • superset version: Superset 0.0.0-dev
  • python version: Python 3.9.16
  • node.js version: (unsure about this, I'm using the docker image https://hub.docker.com/r/apache/superset)
  • any feature flags active: None

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

The database connection used is a Google BigQuery one.

curl request extracted from the browser network tab:
curl 'http://[REDACTED]/api/v1/dataset/2' \
  -X 'PUT' \
  -H 'Accept: application/json' \
  -H 'Accept-Language: en-US,en;q=0.9' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: session=[REDACTED]' \
  -H 'Origin: http://34.72.116.158' \
  -H 'Referer: http://34.72.116.158/tablemodelview/list/?pageIndex=0&sortColumn=schema&sortOrder=asc' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36' \
  -H 'X-CSRFToken: [REDACTED]' \
  --data-raw '{
  "table_name": "combined_report",
  "database_id": 1,
  "sql": null,
  "filter_select_enabled": true,
  "fetch_values_predicate": null,
  "schema": "[REDACTED]",
  "description": null,
  "main_dttm_col": "date",
  "offset": 0,
  "default_endpoint": null,
  "cache_timeout": null,
  "is_sqllab_view": false,
  "template_params": null,
  "extra": null,
  "is_managed_externally": false,
  "metrics": [
    {
      "expression": "COUNT(*)",
      "description": null,
      "metric_name": "count",
      "metric_type": "count",
      "d3format": null,
      "verbose_name": "COUNT(*)",
      "warning_text": null,
      "extra": "{}",
      "id": 2
    }
  ],
  "columns": [
    {
      "id": 14,
      "column_name": "account_id",
      "type": "STRING",
      "advanced_data_type": null,
      "verbose_name": null,
      "description": null,
      "expression": null,
      "filterable": true,
      "groupby": true,
      "is_active": true,
      "is_dttm": false,
      "python_date_format": null,
      "uuid": "[REDACTED]",
      "extra": "{}"
    }
  ],
  "owners": [
    1
  ]
}
' \
  --compressed \
  --insecure
Response extracted from the browser network tab: 422 UNPROCESSABLE ENTITY {"message":{"table_name":["Dataset combined_report already exists"]}}
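
One plausible explanation for the 422, sketched below as a toy model: the uniqueness check run on update compares only the database and table name, so the "other" dataset with the same name in a different schema counts as a duplicate. This is not Superset's actual code. The `Dataset` class, both check functions, and the schema names `schema_a`/`schema_b` are hypothetical and exist only to illustrate the difference between a name-only check and a schema-aware one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    # Hypothetical stand-in for a dataset row; not Superset's model.
    id: int
    database_id: int
    schema: Optional[str]
    table_name: str


def name_only_conflict(existing: List[Dataset], candidate: Dataset) -> bool:
    """Assumed buggy behaviour: the schema is ignored when looking for duplicates."""
    return any(
        d.id != candidate.id
        and d.database_id == candidate.database_id
        and d.table_name == candidate.table_name
        for d in existing
    )


def schema_aware_conflict(existing: List[Dataset], candidate: Dataset) -> bool:
    """Expected behaviour: a duplicate must also match on schema."""
    return any(
        d.id != candidate.id
        and d.database_id == candidate.database_id
        and d.schema == candidate.schema
        and d.table_name == candidate.table_name
        for d in existing
    )


existing = [
    Dataset(id=1, database_id=1, schema="schema_a", table_name="combined_report"),
    Dataset(id=2, database_id=1, schema="schema_b", table_name="combined_report"),
]

# Re-save dataset 2 without changes, as in the PUT request above.
update = existing[1]

name_only_result = name_only_conflict(existing, update)
schema_aware_result = schema_aware_conflict(existing, update)
print(name_only_result, schema_aware_result)
```

Under this model the name-only check reports a conflict for an unchanged save, which matches the "already exists" message, while the schema-aware check lets the update through.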

Comment From: Zlata-Inc

Hi!

I get the same error when I want to save a dataset after syncing a new column.

Comment From: zowen-ch

I have the same issue as well.

Comment From: rusackas

@zowen-ch what version of Superset are you using? The rest of this thread might be using Superset 2.x, which isn't supported any more (we're on 3.1, and almost to 4.0)

Comment From: zowen-ch

Thank you for having a look @rusackas! Our instance is currently 3.0.0. If you think 3.1 would solve the issue, I will work to get it updated. But yes, I'm experiencing this error where repeated table names from different schemas are preventing column refreshes with the same error modal as above.

Comment From: rusackas

I don't currently have access to any databases with identical table names under different schemas. It would be appreciated if anyone can test this on a current 3.1.x or 4.x release, otherwise we might close it as stale. CC @sadpandajoe @michael-s-molina in case they have the means to test this.

Comment From: jelu

Can't even create 2 datasets where the table name is the same but the tables are in different databases:

sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlite3.IntegrityError) UNIQUE constraint failed: tables.table_name
[SQL: INSERT INTO tables (uuid, created_on, changed_on, description, default_endpoint, is_featured, filter_select_enabled, "offset", cache_timeout, params, perm, schema_perm, is_managed_externally, external_url, table_name, main_dttm_col, database_id, fetch_values_predicate, schema, sql, is_sqllab_view, template_params, extra, normalize_columns, always_filter_main_dttm, created_by_fk, changed_by_fk) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
$ pip list|grep apache-superset
apache-superset           4.0.1

P.S. I would really like to be able to do this, though!
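
The traceback above is what a UNIQUE constraint on `table_name` alone would produce in SQLite. Here is a minimal, self-contained sketch of that behaviour using Python's built-in `sqlite3` module. The two tables are simplified, hypothetical stand-ins (they are not Superset's real `tables` schema), and the composite constraint is shown only as a contrast, not as the project's actual fix.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")

# Hypothetical simplified table: uniqueness on table_name only.
conn.execute(
    "CREATE TABLE tables_name_only ("
    "database_id INTEGER, schema TEXT, table_name TEXT UNIQUE)"
)

# Hypothetical alternative: uniqueness on (database_id, schema, table_name).
conn.execute(
    "CREATE TABLE tables_composite ("
    "database_id INTEGER, schema TEXT, table_name TEXT, "
    "UNIQUE (database_id, schema, table_name))"
)

# Same table name, different database and schema.
rows = [(1, "schema_a", "report"), (2, "schema_b", "report")]


def try_insert(table: str) -> Optional[str]:
    """Insert both rows; return the IntegrityError message, or None on success."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None


name_only_error = try_insert("tables_name_only")
composite_error = try_insert("tables_composite")
print(name_only_error)
print(composite_error)
```

With the name-only constraint the second insert is rejected even though the database and schema differ, while the composite constraint accepts both rows.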

Comment From: pareis

I'm still having the problem in 4.0.2

Comment From: rusackas

This has been silent for quite a while now. Is it still an issue in 4.1.2/5.0.0?

Comment From: rusackas

We haven't heard back if this is still an issue in current releases. Let us know, or it might be closed as inactive/not-planned. Thanks!