Bug description

When upgrading Superset on Kubernetes from version 1.5.3 to 4.1.1, I run the superset db upgrade command in an init job and receive an "ERROR [flask_migrate] Error: maximum recursion depth exceeded while calling a Python object" during migration a9422eeaae74, new_dataset_models_take_2. This appears to happen when the number of datasets stored in the metadata database is around 200; it worked with just a few datasets.

Reproduction Steps:

  1. Deploy Superset version 1.5.3 on Kubernetes
  2. Import 200 datasets
  3. Deploy upgraded Docker image version 4.1.1
  4. Run superset db upgrade

Expected Result: Database migrations run successfully.
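
For anyone hitting the same RecursionError, one hedged workaround (an assumption on my part, not a documented fix: it presumes the recursion comes from parsing an unusually deep dataset SQL expression, which later comments in this thread support) is to raise Python's recursion limit in superset_config.py before running the migration:

```python
# Hypothetical snippet for superset_config.py -- an assumption, not the
# documented fix. If the migration recurses while parsing a complex
# dataset's SQL, a higher interpreter limit may let it complete.
import sys

MIGRATION_RECURSION_LIMIT = 10_000  # CPython's default is 1000

if sys.getrecursionlimit() < MIGRATION_RECURSION_LIMIT:
    sys.setrecursionlimit(MIGRATION_RECURSION_LIMIT)
```

Remove the override after the upgrade succeeds; a very high limit can turn a clean RecursionError into a hard interpreter crash if the recursion is genuinely unbounded.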

Screenshots/recordings

No response

Superset version

4.1.1

Python version

3.11

Node version

Not applicable

Browser

Not applicable

Additional context

Logs from init script:

[notice] A new release of pip is available: 25.0 -> 25.0.1
[notice] To update, run: pip install --upgrade pip
ERROR: You must give at least one requirement to install (see "pip help install")
Upgrading DB schema...
Loaded your LOCAL configuration at [/app/pythonpath/superset_config.py]
2025-02-21 11:39:15,227:INFO:superset.initialization:Setting database isolation level to READ COMMITTED
/usr/local/lib/python3.10/site-packages/flask_limiter/extension.py:333: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
  warnings.warn(
INFO  [alembic.env] Starting the migration scripts.
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade 58df9d617f14 -> 2ed890b36b94, rm_time_range_endpoints_from_qc
INFO  [alembic.runtime.migration] Running upgrade 2ed890b36b94 -> b0d0249074e4, deprecate time_range_endpoints v2
INFO  [alembic.runtime.migration] Running upgrade 2ed890b36b94 -> 8b841273bec3, sql_lab_models_database_constraint_updates
INFO  [alembic.runtime.migration] Running upgrade 8b841273bec3, b0d0249074e4 -> 9d8a8d575284, merge point
INFO  [alembic.runtime.migration] Running upgrade 9d8a8d575284 -> cecc6bf46990, rm_time_range_endpoints_2
INFO  [alembic.runtime.migration] Running upgrade cecc6bf46990 -> ad07e4fdbaba, rm_time_range_endpoints_from_qc_3
INFO  [alembic.runtime.migration] Running upgrade ad07e4fdbaba -> a9422eeaae74, new_dataset_models_take_2
ERROR [flask_migrate] Error: maximum recursion depth exceeded while calling a Python object
slices updated with no time_range_endpoints: 0
>> Copy 10 physical tables to sl_tables...
>> Copy 202 SqlaTable to sl_datasets...
   Copy dataset owners...
   Link physical datasets with tables...
>> Copy 2,491 table columns to sl_columns...
   Link all columns to sl_datasets...
>> Copy 207 metrics to sl_columns...
   Link metric columns to datasets...
>> Run postprocessing on 2,698 columns
   [Column 1 to 2,698] 2,493 may be updated
   Assign table column relations...
>> Run postprocessing on 202 datasets
   Process dataset 1~202...                              
Initializing roles...
Loaded your LOCAL configuration at [/app/pythonpath/superset_config.py]
2025-02-21 11:39:25,436:INFO:superset.initialization:Setting database isolation level to READ COMMITTED
/usr/local/lib/python3.10/site-packages/flask_limiter/extension.py:333: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
  warnings.warn(
2025-02-21 11:39:29,676:INFO:superset.security.manager:Syncing role definition
2025-02-21 11:39:29,811:INFO:superset.security.manager:Syncing Admin perms
2025-02-21 11:39:29,821:INFO:superset.security.manager:Syncing Alpha perms
2025-02-21 11:39:29,841:INFO:superset.security.manager:Syncing Gamma perms
2025-02-21 11:39:29,878:INFO:superset.security.manager:Syncing sql_lab perms
2025-02-21 11:39:29,928:INFO:superset.security.manager:Fetching a set of all perms to lookup which ones are missing
2025-02-21 11:39:29,953:INFO:superset.security.manager:Creating missing datasource permissions.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedColumn: column tables.catalog_perm does not exist
LINE 1: ...s_perm, tables.schema_perm AS tables_schema_perm, tables.cat...
                                                             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/utils/decorators.py", line 259, in wrapped
    result = func(*args, **kwargs)
  File "/app/superset/cli/main.py", line 68, in init
    security_manager.sync_role_definitions()
  File "/app/superset/security/manager.py", line 1070, in sync_role_definitions
    self.create_missing_perms()
  File "/app/superset/security/manager.py", line 1018, in create_missing_perms
    datasources = SqlaTable.get_all_datasources()
  File "/app/superset/connectors/sqla/models.py", line 1955, in get_all_datasources
    return qry.all()
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2773, in all
    return self._iter().all()
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2916, in _iter
    result = self.session.execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1717, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1710, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1577, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1953, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2134, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column tables.catalog_perm does not exist
LINE 1: ...s_perm, tables.schema_perm AS tables_schema_perm, tables.cat...
                                                             ^

[SQL: SELECT tables.uuid AS tables_uuid, tables.created_on AS tables_created_on, tables.changed_on AS tables_changed_on, tables.id AS tables_id, tables.description AS tables_description, tables.default_endpoint AS tables_default_endpoint, tables.is_featured AS tables_is_featured, tables.filter_select_enabled AS tables_filter_select_enabled, tables."offset" AS tables_offset, tables.cache_timeout AS tables_cache_timeout, tables.params AS tables_params, tables.perm AS tables_perm, tables.schema_perm AS tables_schema_perm, tables.catalog_perm AS tables_catalog_perm, tables.is_managed_externally AS tables_is_managed_externally, tables.external_url AS tables_external_url, tables.table_name AS tables_table_name, tables.main_dttm_col AS tables_main_dttm_col, tables.database_id AS tables_database_id, tables.fetch_values_predicate AS tables_fetch_values_predicate, tables.schema AS tables_schema, tables.catalog AS tables_catalog, tables.sql AS tables_sql, tables.is_sqllab_view AS tables_is_sqllab_view, tables.template_params AS tables_template_params, tables.extra AS tables_extra, tables.normalize_columns AS tables_normalize_columns, tables.always_filter_main_dttm AS tables_always_filter_main_dttm, tables.created_by_fk AS tables_created_by_fk, tables.changed_by_fk AS tables_changed_by_fk 
FROM tables 
WHERE tables.is_sqllab_view = false]
(Background on this error at: https://sqlalche.me/e/14/f405)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/superset", line 8, in <module>
    sys.exit(superset())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flask/cli.py", line 358, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flask/cli.py", line 358, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/app/superset/utils/decorators.py", line 266, in wrapped
    return on_error(ex)
  File "/app/superset/utils/decorators.py", line 234, in on_error
    raise reraise() from ex
sqlalchemy.exc.SQLAlchemyError: ()
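
The second traceback looks like a follow-on failure rather than a separate bug: superset init kept running after db upgrade aborted, so the 4.1.1 ORM selects tables.catalog_perm, a column added by a migration that was never applied. A quick way to confirm which columns actually exist is to inspect the tables table directly. The sketch below is a hedged stand-in that uses in-memory SQLite to simulate a partially migrated schema; against the real PostgreSQL metadata DB you would query information_schema.columns instead:

```python
# Hedged diagnostic sketch using an in-memory SQLite stand-in for the
# real metadata DB. On PostgreSQL the equivalent check is roughly:
#   SELECT column_name FROM information_schema.columns
#   WHERE table_name = 'tables';
import sqlite3

conn = sqlite3.connect(":memory:")
# Simulate a partially migrated schema: catalog_perm was never added.
conn.execute("CREATE TABLE tables (id INTEGER, perm TEXT, schema_perm TEXT)")

existing = {row[1] for row in conn.execute("PRAGMA table_info(tables)")}
missing = {"catalog_perm", "catalog"} - existing
print(missing)  # non-empty means the migration adding them never ran
```

If the column really is missing, the fix is to get superset db upgrade to complete (past a9422eeaae74) before superset init runs, rather than to patch the schema by hand.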

Checklist

  • [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

Hey @alexburgoon! I'm here to help you with any bugs, questions, or contributions. Let me know how I can assist you!

A similar issue was reported in the past regarding a failure during database migration, specifically related to the new_dataset_models_take_2 migration step. The issue was titled "superset-init-db pod failed on upgrade" and has been closed. It was resolved by a pull request that addressed an error on PostgreSQL during the migration process [1].

Given the similarity, it might be helpful to review the changes made in that pull request to see if they can be applied or adapted to your situation. Additionally, ensure that your environment and dependencies are correctly set up, as discrepancies in configurations or versions might also contribute to such issues.


Comment From: sfirke

As a starting point, could you try migrating to 2.0.0, then 3.0.0, etc.? Perhaps that would succeed, and if not, it might at least narrow down the problem. Just an idea.

Comment From: alexburgoon

It turns out there was one dataset causing the recursion error. Its SQL wasn't that complicated, but once we deleted that dataset the upgrade went through. However, we did run into another issue upgrading a different environment: this time the SQL parser threw a syntax error on a Jinja block {% ... %}, which I thought was allowed. As a workaround, I temporarily installed sqloxide before running the database migrations. That caused it to regex-sub out the Jinja blocks, and everything went through.
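
The fallback behavior described above (blanking out Jinja blocks so a strict SQL parser doesn't choke on them) can be approximated with a small regex. This is a hypothetical stand-in to illustrate the idea, not the actual migration code, and the patterns and placeholder are assumptions:

```python
import re

# Hypothetical stand-in for the described fallback: replace Jinja
# statement blocks {% ... %} and expression blocks {{ ... }} with a
# harmless placeholder before handing the SQL to a strict parser.
JINJA_BLOCK = re.compile(r"\{%.*?%\}|\{\{.*?\}\}", re.DOTALL)

def strip_jinja(sql: str) -> str:
    """Blank out Jinja blocks so plain-SQL tooling can parse the rest."""
    return JINJA_BLOCK.sub("NULL", sql)

sql = "SELECT * FROM t WHERE ds = {{ ds }} {% if x %}AND x = 1{% endif %}"
print(strip_jinja(sql))
```

This is lossy by design: the substituted SQL is only good for parsing or migration bookkeeping, not for execution.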

Comment From: rusackas

Cool! Welcome to the modern Superset world, then! Is there anything you think we should add to the docs site to cover upgrade snafus like this? We'd welcome suggestions/edits.

Otherwise, should we consider this issue "case closed?"

Comment From: rusackas

We haven't heard back, so "case closed" it is!