Bug description

Hello everyone. I'm working on a new db_engine_spec and I've found it very difficult. The dialect I'm trying to integrate with Superset does not support schemas. AFAIK this is not a rare scenario, but it is impossible to create a physical dataset from a table without selecting a schema (only virtual datasets based on queries are available). What can I do to bypass this issue? If there is a good and easy hack, I'd like to use it!

It would be nice if db_engine_spec had a supports_schemas = True field that we could override in our custom spec. SQLAlchemy already has this field (https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_4/lib/sqlalchemy/engine/default.py#L72), so you could probably rely on it.
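As a rough illustration of the request, here is a minimal sketch of how such a flag could behave. The classes and the list_schemas method are hypothetical stand-ins, not existing Superset API; only the attribute name supports_schemas is borrowed from SQLAlchemy's dialect.

```python
from typing import Optional


class BaseSpecSketch:
    """Stand-in for a DB engine spec; NOT Superset's real BaseEngineSpec."""

    # Hypothetical flag, named after SQLAlchemy's dialect attribute.
    supports_schemas = True

    @classmethod
    def list_schemas(cls, reported: list[str]) -> list[Optional[str]]:
        # Engines without schemas expose a single "no schema" entry (None),
        # so callers can still iterate without special-casing.
        if not cls.supports_schemas:
            return [None]
        return sorted(reported)


class SchemalessSpecSketch(BaseSpecSketch):
    # A custom spec would only need to flip the flag.
    supports_schemas = False
```

With this shape, a schemaless engine would skip the schema picker entirely instead of blocking physical dataset creation.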

Screenshots/recordings

[Screenshot: Apache Superset. It is impossible to select a table without selecting a schema.]

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

  • [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

Hello @vgvoleg! I'm here to help you with any bugs, questions, or contributions you have in mind. Let's tackle this issue together!

To address the issue of creating a dataset without selecting a schema in Superset, you can consider the following approach:

  1. Custom DB Engine Spec: You can create a custom db_engine_spec for your database engine. In this custom spec, you can override methods to handle the absence of schemas. While Superset does not currently have a supports_schemas attribute in its db_engine_spec, you can implement logic in your custom spec to bypass schema requirements.

  2. Adjust Engine Parameters: Use the adjust_engine_params method in your custom db_engine_spec to modify the SQLAlchemy URI and connection arguments. This can help ensure that queries run correctly without requiring a schema.

  3. Modify Frontend Logic: If you have control over the frontend, you might consider modifying the logic in the DatabaseSelector component to handle cases where schemas are not applicable. This could involve conditionally rendering the schema selection UI based on the database engine's capabilities.

  4. SQLAlchemy's supports_schemas: Although Superset does not directly use SQLAlchemy's supports_schemas attribute, you can leverage this attribute in your custom spec to determine whether to enforce schema selection.

These steps should help you integrate a database engine that does not support schemas into Superset. If you need further customization, you might need to dive deeper into the Superset codebase to adjust how datasets are created and managed.


Comment From: vgvoleg

More details were discussed in this thread: https://apache-superset.slack.com/archives/C014LS99C1K/p1732111768591139

Comment From: rusackas

Closing this, since you were kind enough to sort it out and add YDB for 5.0 in this PR! Thank you!

Comment From: vgvoleg

The problem still exists: we can only create virtual datasets because of this.

Comment From: rusackas

Thanks for the clarification. I don't think any of the usual-suspect committers have YDB access, so we'd welcome any support/PRs that you (or anyone!) can provide.

Comment From: betodealmeida

One workaround I've used in the past (for the GSheets DB engine spec) was to simply have a dummy schema called "main". Your DB engine spec can define:

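from sqlalchemy.engine.reflection import Inspector
from superset.db_engine_specs.base import BaseEngineSpec

class MyDBEngineSpec(BaseEngineSpec):
    @classmethod
    def get_schema_names(cls, inspector: Inspector) -> set[str]:
        # Single placeholder schema so the UI has something to select.
        return {"main"}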

Then just ignore the schema as it's passed around. Of course, ideally we wouldn't need this and would support supports_schemas = False. But from the start Superset was written assuming there is a schema and that it's a string, so making it optional requires a lot of refactoring:

% git grep 'schema: str'  | grep -v None | grep -v test | nl
     1  superset-frontend/packages/superset-ui-core/src/ui-overrides/types.ts:  schema: string;
     2  superset-frontend/src/SqlLab/components/SqlEditorLeftBar/index.tsx:    (schema: string) => {
     3  superset-frontend/src/SqlLab/types.ts:  schema: string;
     4  superset-frontend/src/database/types.ts:  schema: string;
     5  superset-frontend/src/explore/components/controls/FilterControl/utils/translateToSQL.ts:    datasource: { schema: string; datasource_name: string };
     6  superset-frontend/src/features/databases/UploadDataModel/index.tsx:  schema: string;
     7  superset-frontend/src/features/datasets/AddDataset/DatasetPanel/index.tsx:  schema: string;
     8  superset-frontend/src/features/datasets/AddDataset/LeftPanel/index.tsx:  const setSchema = (schema: string) => {
     9  superset-frontend/src/features/datasets/DatasetSelectLabel/index.tsx:  schema: string;
    10  superset-frontend/src/features/datasets/hooks/useDatasetLists.ts:  schema: string | null | undefined,
    11  superset-frontend/src/hooks/apiResources/tables.ts:  schema: string;
    12  superset-frontend/src/pages/DatasetCreation/index.tsx:  schema: string;
    13  superset-frontend/src/pages/DatasetList/index.tsx:  schema: string;
    14  superset-frontend/src/types/Dataset.ts:  schema: string;
    15  superset-frontend/src/views/CRUD/types.ts:  schema: string;
    16  superset-frontend/src/views/CRUD/types.ts:  sql_tables?: { catalog?: string; schema: string; table: string }[];
    17  superset-frontend/src/views/CRUD/types.ts:  schema: string;
    18  superset-frontend/src/views/CRUD/types.ts:  sql_tables?: { catalog?: string; schema: string; table: string }[];
    19  superset/commands/sql_lab/estimate.py:    _schema: str
    20  superset/config.py:# def compute_schema_name(database: Database, user: User, schema: str, sql: str) -> str:
    21  superset/connectors/sqla/models.py:        schema: str,
    22  superset/databases/schemas.py:    referred_schema: str
    23  superset/db_engine_specs/base.py:        schema: str,
    24  superset/db_engine_specs/bigquery.py:        schema: str,
    25  superset/db_engine_specs/databricks.py:    default_schema: str
    26  superset/models/core.py:        schema: str,
    27  superset/models/core.py:        schema: str,
    28  superset/models/helpers.py:    default_schema: str,
    29  superset/models/helpers.py:        schema: str,
    30  superset/security/manager.py:    schema: str
    31  superset/sqllab/sqllab_execution_context.py:    schema: str
    32  superset/utils/core.py:    schema: str
    33  superset/utils/rls.py:    schema: str,
    34  superset/views/utils.py:    database: Database, user: ab_models.User, schema: str, sql: str
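
Going back to the dummy-schema workaround, "ignoring the schema as it's passed around" could look roughly like the sketch below. The helper names (real_schema, qualified_name) and the DUMMY_SCHEMA constant are hypothetical and not part of Superset; the only thing taken from the thread is the placeholder value "main".

```python
from typing import Optional

# Placeholder value reported by get_schema_names in the workaround.
DUMMY_SCHEMA = "main"


def real_schema(schema: Optional[str]) -> Optional[str]:
    """Map the placeholder schema back to "no schema"."""
    return None if schema == DUMMY_SCHEMA else schema


def qualified_name(table: str, schema: Optional[str]) -> str:
    """Build the table reference, dropping the placeholder schema."""
    schema = real_schema(schema)
    return f"{schema}.{table}" if schema else table
```

Any place in a custom spec that builds SQL or reflects tables would pass the incoming schema through a helper like real_schema first, so the placeholder never reaches the database.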