Bug description
Queries run in SQL Lab have a limit added to them. However, when a query uses a Common Table Expression (CTE), no limit is added. For example:
select * from some_db.some_table;
When you select a 10-row limit from the dropdown, 10 rows are fetched.
But when you run
with data as (select * from some_db.some_table)
select * from data;
no limit is applied and the entire data set is fetched.
This can be problematic. In our case, someone ran a query on a big dataset, which loaded more than 20 GB of data into memory and crashed the Superset worker.
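For illustration, here is a minimal sketch of a driver-side cap that does not depend on rewriting the SQL text. It is not Superset's implementation: `fetch_capped` is a hypothetical helper, and an in-memory SQLite table named `some_table` stands in for `some_db.some_table`. It only shows that capping at fetch time behaves the same for plain and CTE queries. Depending on the database driver, the server may still compute or buffer the full result set, so this is a safety net at best, not a substitute for a real `LIMIT`.

```python
import sqlite3


def fetch_capped(conn, sql, row_limit):
    """Hypothetical helper: run `sql` and return at most `row_limit` rows,
    whether or not the SQL text itself contains a LIMIT clause."""
    cursor = conn.execute(sql)
    try:
        # fetchmany pulls at most row_limit rows from the cursor
        return cursor.fetchmany(row_limit)
    finally:
        cursor.close()


# Stand-in data: 100 rows in an in-memory SQLite table (assumption for the demo)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER)")
conn.executemany("INSERT INTO some_table VALUES (?)", [(i,) for i in range(100)])

plain_query = "SELECT * FROM some_table"
cte_query = "WITH data AS (SELECT * FROM some_table) SELECT * FROM data"

plain_rows = fetch_capped(conn, plain_query, 10)
cte_rows = fetch_capped(conn, cte_query, 10)

print(len(plain_rows), len(cte_rows))
```

Both queries return 10 rows here even though neither contains a `LIMIT`, because the cap is enforced when rows are read from the cursor rather than in the query text.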
Screenshots/recordings
No response
Superset version
4.1.1
Python version
3.11
Node version
I don't know
Browser
Chrome
Additional context
No response
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: sadpandajoe
@betodealmeida do you know if the CTE works on the first portion (`with data as (select * from some_db.some_table)`) of the query?
Comment From: betodealmeida
I'm rewriting this logic to use `sqlglot`. I'll make sure to add a unit test covering it.
Comment From: rusackas
@betodealmeida did the sqlglot project fix this one?
Comment From: sadpandajoe
@rusackas looks like it could have: https://github.com/apache/superset/pull/33518