Bug description
We have a Snowflake connection that works fine in SQL Lab (queries succeed). When creating a new dataset or running “Test Connection,” Superset fails with:
An Error Occurred
Unable to load columns for the selected table. Please select a different table.
Logs show underlying errors from the Snowflake connector when trying to fetch staged results:
HTTPSConnectionPool(host='<Snowflake staging S3 bucket in us-west-2>', port=443):
Max retries exceeded … Remote end closed connection without response
Repro steps
- Go to Datasets
- Click + Dataset
- Pick Snowflake DB connection
- Choose schema + table
- Observe error
Expected
Columns load, Test Connection succeeds.
Actual
- Dataset creation fails.
- Test Connection fails.
- SQL Lab queries continue to work (likely because small results don’t hit S3 staging).
Environment
- Superset 4.1.1 (dockerized, AWS ECS Fargate)
- Python 3.9 (default in base image)
- Snowflake connector: 3.16.0
- Snowflake SQLAlchemy: 1.7.6
- Snowflake region: AWS us-west-2
Troubleshooting performed
- Confirmed Snowflake network policy allows our NAT egress IPs.
- Verified no failed logins in Snowflake login history (issue occurs before auth).
- Increased Superset/Gunicorn/ALB timeouts.
- Disabled proxy variables, set NO_PROXY for Snowflake/AWS domains.
- Tried connector options: ocsp_fail_open, insecure_mode, session params (CLIENT_PREFETCH_THREADS, CLIENT_RESULT_CHUNK_SIZE, USE_S3_REGIONAL_URL, etc.).
- Added curl sidecar: Snowflake account host + generic S3 reachable, but staging S3 bucket sometimes fails.
- Tested with minimal 40-row tables; still fails.
- Confirmed the issue is specific to metadata/reflection queries, not result size (see the repro sketch after this list).
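To isolate the reflection path outside Superset, here is a minimal sketch of the same SQLAlchemy call Superset makes when loading columns; all credentials and identifiers below are placeholders:

```python
# Hypothetical standalone repro: run the same SQLAlchemy reflection path
# Superset uses when you pick a table for a new dataset. USER, PASSWORD,
# ACCOUNT, DATABASE, SCHEMA, WAREHOUSE, ROLE, and MY_TABLE are placeholders.
from sqlalchemy import create_engine, inspect

engine = create_engine(
    "snowflake://USER:PASSWORD@ACCOUNT/DATABASE/SCHEMA"
    "?warehouse=WAREHOUSE&role=ROLE"
)

inspector = inspect(engine)
# Superset calls get_columns() via the inspector; if the staging S3 bucket
# is the problem, this should fail the same way outside Superset.
for column in inspector.get_columns("MY_TABLE", schema="SCHEMA"):
    print(column["name"], column["type"])
```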
Why this might be Superset-related
- SQL Lab queries work, but inspector/metadata queries consistently fail.
- Looks like a difference in how Superset uses the connector for reflection/metadata (larger results, staging) vs. SQL Lab.
- Want to confirm whether this is a known issue with staging downloads in Superset's Snowflake integration, and whether there are recommended config flags, retries, or version pins.
Screenshots/recordings
No response
Superset version
master / latest-dev
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
Hi @oliverbell-klv! I'm Dosu and I'm helping the Superset team.
This is a known challenge with Superset's Snowflake integration: metadata reflection and dataset creation use SQLAlchemy's inspector methods (like `get_columns`), which can trigger result sets that Snowflake stages in S3. If your infrastructure intermittently fails to reach the staging S3 bucket, these operations will fail even though SQL Lab queries work, since small results are returned inline and avoid S3 staging.
Superset does not have built-in retry logic or self-healing for Snowflake-specific network failures. Mitigation relies on tuning pool and timeout settings in the connector's configuration or your infrastructure setup. You can adjust these via the database's "Extra" field (`engine_params` and `connect_args`). Enabling the `CLIENT_SESSION_KEEP_ALIVE` parameter in Snowflake may also improve connection stability.
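As a sketch, that Extra payload (a JSON text field in the database settings) might look like the following; the timeout values are arbitrary, and `client_session_keep_alive`, `login_timeout`, and `network_timeout` are standard snowflake-connector-python arguments rather than Superset flags:

```json
{
  "engine_params": {
    "connect_args": {
      "client_session_keep_alive": true,
      "login_timeout": 60,
      "network_timeout": 120
    }
  }
}
```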
If the issue persists, review your AWS/container networking and resource limits, and enable detailed logging (`LOG_LEVEL = logging.DEBUG`) to capture more diagnostic info. There are no Superset-specific config flags or version pins that directly address S3 staging download errors in metadata reflection. This is primarily an infrastructure/networking issue surfaced by how Superset interacts with the Snowflake connector for metadata queries.
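For the logging suggestion, a minimal `superset_config.py` sketch; the connector-logger line is an assumption on my part, not something Superset sets for you:

```python
# superset_config.py -- sketch only
import logging

# Raise Superset's own log level to DEBUG.
LOG_LEVEL = logging.DEBUG

# Assumption: also surface the Snowflake connector's internal logging,
# which includes staged/chunked result download activity.
logging.getLogger("snowflake.connector").setLevel(logging.DEBUG)
```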
Comment From: rusackas
Tempted to move this to a Help discussion. @dosu do you think there's a Superset bug here, or is this an infra/config issue?
Comment From: dosubot[bot]
Based on a deep review of Superset's code and recent changes, this isn't a Superset bug; it's an infrastructure or connector config issue. Superset uses SQLAlchemy inspector methods for metadata reflection, which can trigger Snowflake to stage results in S3. If your infra can't reliably access the staging S3 bucket, metadata queries (like dataset creation and column introspection) will fail, while SQL Lab queries may work because they often avoid S3 staging.

Superset doesn't implement custom error handling or retries for these S3 staging errors; failures are surfaced directly from the Snowflake connector or network stack, not introduced by Superset itself. No recent PRs or docs indicate a regression or bug in Superset related to this behavior. Troubleshooting should focus on network reliability, connector options, and Snowflake session parameters. If new evidence emerges, it's worth revisiting, but right now this is infra/config, not a Superset bug.