[SIP] Proposal for Enhanced OAuth2 Access Token Management UI and Performance Optimization in Apache Superset
Motivation
While attempting to ingest Apache Superset metadata using a third-party tool, I observed that access tokens can only be generated via the Swagger API and not through the Superset UI. Adding a UI component for token generation would significantly improve usability and accessibility.
Additionally, when running Superset using the Docker Compose deployment, I noticed excessive memory usage. This could be investigated and optimized to improve overall performance and reduce resource consumption.
In my deployment, the Superset services alone use nearly 3 GB of RAM.
Proposed Change
Add a UI for managing access tokens so that users can easily generate and manage tokens and use the existing endpoints that require them. If possible, allow the creation of multiple tokens with different permissions to support flexible access control.
For the official Superset images, reducing memory and system usage could be part of a broader performance plan. This may be a significant undertaking, but it is worth pursuing.
For example, implement a monitoring service to track Superset's performance, and reduce the footprint of certain images by switching to a lighter Linux base image.
New or Changed Public Interfaces
As far as I know, no change is needed.
New dependencies
cryptography Python library: this is essential for securely encrypting and decrypting OAuth2 access and refresh tokens stored in the new database table, ensuring tokens are protected at rest.
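As a rough illustration of what this dependency would be used for, the sketch below encrypts and decrypts a token string with Fernet, the symmetric recipe shipped in the cryptography library. The helper names `encrypt_token` and `decrypt_token` and the way the key is obtained are assumptions made for this example, not part of the proposal or of Superset's existing code.

```python
from cryptography.fernet import Fernet


def encrypt_token(plaintext: str, key: bytes) -> str:
    """Encrypt a token; the result is ASCII text that fits a TEXT column."""
    return Fernet(key).encrypt(plaintext.encode("utf-8")).decode("ascii")


def decrypt_token(ciphertext: str, key: bytes) -> str:
    """Recover the original token; raises InvalidToken if the key is wrong."""
    return Fernet(key).decrypt(ciphertext.encode("ascii")).decode("utf-8")


# Hypothetical key handling: a real deployment would load the key from
# configuration or a secret store instead of generating it on every run.
key = Fernet.generate_key()
stored = encrypt_token("example-access-token", key)
print(decrypt_token(stored, key))
```

Only the ciphertext in `stored` would be written to the database; the key would have to live outside the table for the encryption to protect tokens at rest.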
Migration Plan and Compatibility
Add new table
table: database_user_oauth2_tokens
Column Name | Data Type | Description |
---|---|---|
id | INTEGER | primary key (auto-incrementing) |
user_id | INTEGER | foreign key linking to the user |
database_id | INTEGER | foreign key linking to the database |
access_token | TEXT | encrypted OAuth2 access token |
refresh_token | TEXT | encrypted OAuth2 refresh token |
expires_at | TIMESTAMP | expiry time of the access token |
created_at | TIMESTAMP | time when the token entry was created |
updated_at | TIMESTAMP | time when the token entry was last updated |
Notes:
- both access_token and refresh_token must be encrypted at rest...
- the expires_at column enables Superset to refresh tokens before they expire.
- created_at and updated_at assist with auditing and tracking changes over time.
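The table and the role of expires_at can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not the actual migration: SQLite stands in for the metadata database, the foreign key targets are left out because the proposal does not name them, and the `needs_refresh` helper and its five-minute margin are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# DDL mirroring the proposed columns. Foreign key constraints are omitted
# because the referenced tables are not named in the proposal.
DDL = """
CREATE TABLE database_user_oauth2_tokens (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    database_id INTEGER NOT NULL,
    access_token TEXT,
    refresh_token TEXT,
    expires_at TIMESTAMP,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
)
"""

REFRESH_MARGIN = timedelta(minutes=5)  # assumed safety margin


def needs_refresh(expires_at: datetime, now: datetime) -> bool:
    """Return True once the access token is within the margin of expiring."""
    return now >= expires_at - REFRESH_MARGIN


conn = sqlite3.connect(":memory:")
conn.execute(DDL)

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expires = now + timedelta(minutes=3)

# Token values are placeholders; real rows would hold encrypted ciphertext.
conn.execute(
    "INSERT INTO database_user_oauth2_tokens "
    "(user_id, database_id, access_token, refresh_token, "
    "expires_at, created_at, updated_at) VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1, 1, "<encrypted>", "<encrypted>",
     expires.isoformat(), now.isoformat(), now.isoformat()),
)

row = conn.execute(
    "SELECT expires_at FROM database_user_oauth2_tokens "
    "WHERE user_id = ? AND database_id = ?",
    (1, 1),
).fetchone()
stored_expiry = datetime.fromisoformat(row[0])
print(needs_refresh(stored_expiry, now))
```

Checking the expiry with a margin, rather than waiting for the exact expiry time, leaves room for the refresh request to complete before a query needs the token.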
Rejected Alternatives
Storing tokens only in memory or an ephemeral cache: this approach was rejected because it does not persist tokens across Superset restarts, which forces repeated OAuth2 authorization flows and leads to a poor user experience. Persistent storage in a dedicated encrypted database table is necessary to manage the token lifecycle securely and enable seamless access.