Bug description

Queries run in SQL Lab have a limit added to them. However, when a query uses a Common Table Expression (CTE), no limit is added. E.g.

select * from some_db.some_table;

When you select a 10-row limit from the dropdown, 10 rows are fetched.

When you run

with data as (select * from some_db.some_table)
select * from data;

Then no limit is applied and the entire data set is fetched.
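
For illustration, here is a minimal sketch of the expected behaviour, using Python's built-in sqlite3 module as a stand-in for the real database. It does not call Superset or reproduce its limit logic; the table contents, ROW_LIMIT, and count_rows are hypothetical names chosen for the example. It only shows that a limit on the outer select of the CTE query caps the result the same way it does for the plain query.

```python
import sqlite3

# In-memory stand-in for the reporter's database; 100 rows is an arbitrary size.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS some_db")
conn.execute("CREATE TABLE some_db.some_table (id INTEGER)")
conn.executemany(
    "INSERT INTO some_db.some_table VALUES (?)", [(i,) for i in range(100)]
)

ROW_LIMIT = 10  # mirrors the 10-row choice in the dropdown (assumed value)


def count_rows(sql: str) -> int:
    """Run a query and return how many rows come back."""
    return len(conn.execute(sql).fetchall())


# Plain query with the limit applied: the behaviour that already works.
plain_limited = f"select * from some_db.some_table limit {ROW_LIMIT}"

# CTE query as submitted: with no limit, every row is fetched.
cte_query = "with data as (select * from some_db.some_table) select * from data"

# CTE query with the limit on the outer select: the expected behaviour.
cte_limited = f"{cte_query} limit {ROW_LIMIT}"

plain_rows = count_rows(plain_limited)
cte_unlimited_rows = count_rows(cte_query)
cte_limited_rows = count_rows(cte_limited)
```

Appending the limit as a string is only acceptable here because the query is fixed and known; a real fix would need to parse the statement to find the outer select.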

This can be problematic: in our case, someone ran a query on a big dataset, which resulted in more than 20 GB of data being loaded into memory and crashed the Superset worker.

Screenshots/recordings

No response

Superset version

4.1.1

Python version

3.11

Node version

I don't know

Browser

Chrome

Additional context

No response

Checklist

  • [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: sadpandajoe

@betodealmeida do you know if the CTE works on the first portion (with data as (select * from some_db.some_table)) of the query?

Comment From: betodealmeida

I'm rewriting this logic to use sqlglot, and I'll make sure to add a unit test covering it.

Comment From: rusackas

@betodealmeida did the sqlglot project fix this one?

Comment From: sadpandajoe

@rusackas looks like it could have: https://github.com/apache/superset/pull/33518