[SIP] Proposal for Enhanced OAuth2 Access Token Management UI and Performance Optimization in Apache Superset

Motivation

While attempting to ingest Apache Superset metadata using a third-party tool, I observed that access tokens can only be generated via the Swagger API and not through the Superset UI. Adding a UI component for token generation would significantly improve usability and accessibility.

Additionally, when running Superset using the Docker Compose deployment, I noticed excessive memory usage. This could be investigated and optimized to improve overall performance and reduce resource consumption.

As you can see here, the Superset services alone are using nearly 3 GB of RAM:

Image

Proposed Change

Add a UI for managing access tokens so that users can easily generate and manage tokens and use the existing API endpoints that require them. If possible, allow the creation of multiple tokens with different permissions to support flexible access control.
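For context, this is roughly the manual flow the UI would replace. It is a minimal sketch, assuming database authentication and Superset's REST login endpoint `/api/v1/security/login`. The helper names `build_login_request` and `build_authenticated_request` are hypothetical, and the requests are only built here, not sent.

```python
import json
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks Superset for an access token."""
    payload = {
        "username": username,
        "password": password,
        "provider": "db",  # assumption: database authentication is in use
        "refresh": True,   # also ask for a refresh token
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_authenticated_request(base_url: str, path: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that carries the access token as a Bearer header."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}{path}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Passing either request to `urllib.request.urlopen` against a running instance would perform the call. A token management page would let users skip the first step entirely.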

Regarding the official Superset images, reducing their resource usage could be part of a broader plan to improve performance. It may be a significant undertaking, but it is worth pursuing.

Two possible steps: implement a monitoring service that tracks Superset's performance, and slim down certain images, for example by switching to a lighter Linux base image.

New or Changed Public Interfaces

As far as I know, no change is needed.

New dependencies

cryptography Python library: this is essential for securely encrypting and decrypting OAuth2 access and refresh tokens stored in the new database table, ensuring tokens are protected at rest.
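As an illustration of encryption at rest, here is a minimal sketch using the Fernet recipe from the cryptography library. Fernet is an assumption on my part, since the proposal does not name a specific scheme. The helpers `encrypt_token` and `decrypt_token` are hypothetical, and generating the key inline is for demonstration only.

```python
from cryptography.fernet import Fernet


def encrypt_token(key: bytes, token: str) -> bytes:
    """Encrypt a token string before it is written to the database."""
    return Fernet(key).encrypt(token.encode("utf-8"))


def decrypt_token(key: bytes, ciphertext: bytes) -> str:
    """Decrypt a stored token back into its original string."""
    return Fernet(key).decrypt(ciphertext).decode("utf-8")


# Assumption: a real deployment would load the key from its secret
# configuration instead of generating a fresh one on every run.
key = Fernet.generate_key()
ciphertext = encrypt_token(key, "example-access-token")
plaintext = decrypt_token(key, ciphertext)
```

Fernet also authenticates the ciphertext, so a tampered value raises an error on decryption instead of returning garbage.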

Migration Plan and Compatibility

Add a new table, database_user_oauth2_tokens, with the following columns:

| Column Name | Data Type | Description |
| --- | --- | --- |
| id | INTEGER | primary key (auto-incrementing) |
| user_id | INTEGER | foreign key linking to the user |
| database_id | INTEGER | foreign key linking to the database |
| access_token | TEXT | encrypted OAuth2 access token |
| refresh_token | TEXT | encrypted OAuth2 refresh token |
| expires_at | TIMESTAMP | expiry time of the access token |
| created_at | TIMESTAMP | time when the token entry was created |
| updated_at | TIMESTAMP | time when the token entry was last updated |
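The column list above could translate into DDL along these lines. This is only a sketch against an in-memory SQLite database, not the actual migration. The NOT NULL constraints and timestamp defaults are assumptions, and the foreign key constraints are left out because the referenced tables are not defined here.

```python
import sqlite3

DDL = """
CREATE TABLE database_user_oauth2_tokens (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id       INTEGER NOT NULL,   -- would reference the user table
    database_id   INTEGER NOT NULL,   -- would reference the database table
    access_token  TEXT,               -- stored encrypted
    refresh_token TEXT,               -- stored encrypted
    expires_at    TIMESTAMP,
    created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# Read the column names back to confirm the table matches the proposal.
columns = [row[1] for row in conn.execute("PRAGMA table_info(database_user_oauth2_tokens)")]
```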

Notes:
- both access_token and refresh_token must be encrypted at rest...
- the expires_at column enables Superset to refresh tokens before they expire.
- created_at and updated_at assist with auditing and tracking changes over time.
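To illustrate how expires_at could drive an early refresh, here is a minimal sketch. The function `needs_refresh` and the five-minute `REFRESH_MARGIN` are hypothetical choices for illustration, not part of the proposal.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical safety margin: refresh a token this long before it expires.
REFRESH_MARGIN = timedelta(minutes=5)


def needs_refresh(
    expires_at: datetime,
    now: Optional[datetime] = None,
    margin: timedelta = REFRESH_MARGIN,
) -> bool:
    """Return True once the current time is within `margin` of the expiry time."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at - margin
```

Refreshing slightly early avoids the case where a token expires between the check and the request that uses it.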

Rejected Alternatives

Storing tokens only in memory or in an ephemeral cache: this approach was rejected because it does not persist tokens across Superset restarts, which forces repeated OAuth2 authorization flows and leads to a poor user experience. Persistent storage in a dedicated, encrypted database table is necessary to manage the token lifecycle securely and enable seamless access.