Bug description
## Bug Report: 401 Unauthorized for GuestToken Users When Using Celery Async Queries
### Description
Guest users accessing chart data via GuestToken encounter 401 Unauthorized errors when Celery asynchronous queries are enabled. The issue disappears when Celery async is disabled, confirming a conflict between Celery task processing and GuestToken authentication.
### Reproduction Steps
1. Configure Superset 5.0 with Celery for async queries
2. Generate a valid GuestToken with dashboard access permissions
3. Access a dashboard with uncached chart data using the GuestToken
4. Observe 401 errors in network requests to `/api/v1/chart/data`
5. Disable Celery async queries and verify the issue is resolved
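For step 1, a minimal `superset_config.py` fragment along these lines should be enough to turn on async queries for an embedded setup. It is only a sketch: the Redis URLs and secrets are placeholders, the `imports` tuple is an assumption about which task modules the worker needs, and a real deployment also needs the cache and results backends configured.

```python
# superset_config.py (illustrative fragment; every value below is a placeholder)

FEATURE_FLAGS = {
    "GLOBAL_ASYNC_QUERIES": True,  # send chart data requests through Celery
    "EMBEDDED_SUPERSET": True,     # allow GuestToken-based embedding
}

# Secret for the async-queries JWT; Superset rejects values shorter than 32 bytes.
GLOBAL_ASYNC_QUERIES_JWT_SECRET = "replace-with-a-random-secret-of-32+-bytes"

# Secret used to sign GuestTokens.
GUEST_TOKEN_JWT_SECRET = "replace-with-another-random-secret"


class CeleryConfig:
    # Assumed local Redis instance acting as broker and result backend.
    broker_url = "redis://localhost:6379/0"
    result_backend = "redis://localhost:6379/0"
    # Assumption: adjust to the task modules your deployment uses.
    imports = ("superset.sql_lab", "superset.tasks.scheduler")


CELERY_CONFIG = CeleryConfig
```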
### Expected Behavior
GuestToken users should be able to load both cached and uncached chart data when Celery async is enabled, maintaining the same access as in synchronous mode.
### Actual Behavior
- Cached data loads successfully
- Uncached data returns 401 Unauthorized when using Celery async
- All data loads correctly with Celery async disabled
### Root Cause Analysis
The issue stems from incomplete user context propagation in Celery worker processes:
1. **Celery Task User Loading**: The `_load_user_from_job_metadata` function in `superset/tasks/async_queries.py` only loads users via `user_id`, which isn't available for GuestToken users:
```python
def _load_user_from_job_metadata(job_metadata: dict[str, Any]) -> User:
    if user_id := job_metadata.get("user_id"):
        user = security_manager.find_user(user_id)
        if user:
            return user
    # GuestToken users get treated as anonymous here
    user = security_manager.get_anonymous_user()
    return user
```
2. **Permission Check Failure**: When GuestToken context is lost, the `query_context_modified` security check in `superset/security/manager.py` incorrectly identifies the user as unauthorized:
```python
if (
    query_context
    and self.is_guest_user()
    and query_context_modified(query_context)
):
    raise SupersetSecurityException(...)
```
3. **Synchronous vs Asynchronous Handling**: Synchronous requests maintain complete GuestToken context, passing all security checks, while asynchronous requests lose this context.
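The context loss described in this analysis can be reproduced outside Superset with a small simulation. The sketch below is not Superset code: `StubUser`, `USERS`, and `load_user_by_id_only` are hypothetical stand-ins that mimic a lookup keyed on `user_id` alone.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StubUser:
    """Hypothetical stand-in for a Superset user object."""

    username: str
    is_guest: bool = False
    is_anonymous: bool = False


# Hypothetical in-memory user table standing in for the metadata database.
USERS = {1: StubUser("admin")}


def load_user_by_id_only(job_metadata: dict[str, Any]) -> StubUser:
    """Resolve the user from `user_id` only, ignoring any guest token."""
    user_id = job_metadata.get("user_id")
    if user_id is not None and user_id in USERS:
        return USERS[user_id]
    # No usable user_id: the guest token is never consulted.
    return StubUser("anonymous", is_anonymous=True)


regular = load_user_by_id_only({"user_id": 1})
guest = load_user_by_id_only({"user_id": None, "guest_token": "opaque-token"})

print(regular.username)
print(guest.username, guest.is_guest)
```

The second call carries a guest token in its metadata, yet the resolved user is anonymous and not flagged as a guest, which is the state in which the worker then runs its permission checks.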
### Proposed Fix
Modify Celery task processing to properly handle GuestTokens:
1. Extend `_load_user_from_job_metadata` to handle GuestTokens:
```python
def _load_user_from_job_metadata(job_metadata: dict[str, Any]) -> User:
    if user_id := job_metadata.get("user_id"):
        user = security_manager.find_user(user_id)
        if user:
            return user
    # Add GuestToken handling
    if guest_token := job_metadata.get("guest_token"):
        try:
            token = security_manager.parse_jwt_guest_token(guest_token)
            return security_manager.get_guest_user_from_token(token)
        except Exception:
            # Invalid or expired token: fall through to the anonymous user
            pass
    return security_manager.get_anonymous_user()
```
2. Ensure the GuestToken is included in the async job metadata built from API requests
3. Add a configuration option to disable async queries for guest users:
```python
def is_async_enabled_for_user(user):
    if security_manager.is_guest_user(user):
        return False
    return is_feature_enabled("GLOBAL_ASYNC_QUERIES")
```
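Step 2 of the proposed fix has no code of its own, so here is a rough, self-contained sketch of the round trip: the request side stores the guest token in the job metadata, and the worker side tries `user_id` first, then the token, then falls back to anonymous. The names `build_job_metadata`, `resolve_user`, and `demo_parse` are hypothetical, and users are represented as plain strings; the real lookups would go through `security_manager`.

```python
from typing import Any, Callable, Optional


def build_job_metadata(
    channel_id: str,
    job_id: str,
    user_id: Optional[int],
    guest_token: Optional[str],
) -> dict[str, Any]:
    """Collect what the worker needs to rebuild the requesting user."""
    return {
        "channel_id": channel_id,
        "job_id": job_id,
        "user_id": user_id,
        "guest_token": guest_token,  # None for regular logged-in users
    }


def resolve_user(
    job_metadata: dict[str, Any],
    find_user: Callable[[int], Optional[str]],
    parse_guest: Callable[[str], str],
    anonymous: Callable[[], str],
) -> str:
    """Try user_id, then guest_token, then fall back to anonymous."""
    user_id = job_metadata.get("user_id")
    if user_id is not None:
        user = find_user(user_id)
        if user is not None:
            return user
    guest_token = job_metadata.get("guest_token")
    if guest_token:
        try:
            return parse_guest(guest_token)
        except ValueError:
            pass  # invalid token: treat the caller as anonymous
    return anonymous()


def demo_parse(token: str) -> str:
    """Hypothetical token parser that accepts a single known value."""
    if token != "valid-token":
        raise ValueError("bad token")
    return "guest"


metadata = build_job_metadata("chan-1", "job-1", None, "valid-token")
resolved = resolve_user(metadata, lambda uid: None, demo_parse, lambda: "anonymous")
print(resolved)
```

Passing the lookups in as callables keeps the sketch runnable without a Superset install and makes the fallback order easy to unit test.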
### Environment
- Superset Version: 5.0
- Celery Version: [Your Celery version]
- Authentication: GuestToken/JWT
Screenshots/recordings
No response
Superset version
5.0.0
Python version
3.11
Node version
18 or greater
Browser
Chrome
Additional context
No response
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
This bug has been confirmed and was recently fixed in a merged pull request (#30224). The root cause was that the Celery async job sometimes lacked the guest_token in its metadata, causing the worker to treat GuestToken users as anonymous and fail permission checks. The fix ensures that when a GuestToken user triggers an async task, the guest_token is included in the job metadata and used to reconstruct the correct user context in the Celery worker.
The current implementation of `_load_user_from_job_metadata` already supports loading users by both `user_id` and `guest_token`, so if the `guest_token` is present, GuestToken users are handled correctly (source). The async query manager also ensures the `guest_token` is passed when needed (source).
There is no configuration option or feature flag to disable async queries specifically for GuestToken users as of Superset 5.0.
To resolve this, upgrade to a Superset version that includes the fix from PR #30224, or backport the relevant changes if you cannot upgrade. For reference, a similar issue and workaround for embedded dashboards with async queries is discussed in this issue.