Superset SQL Lab Editor cannot run Trino SQL routines which have multiple ; statements.

How to reproduce the bug

  1. Go to 'SQL Lab' (http://localhost:8088/superset/sqllab)
  2. Run a Trino SQL Routine with multiple ; characters, such as this one from example docs
WITH FUNCTION simple_case(a bigint)
RETURNS varchar
BEGIN
  CASE a
    WHEN 0 THEN RETURN 'zero';
    WHEN 1 THEN RETURN 'one';
    WHEN 10 THEN RETURN 'ten';
    WHEN 20 THEN RETURN 'twenty';
    ELSE RETURN 'other';
  END CASE;
  RETURN NULL;
END
SELECT simple_case(0);
  1. Sees error emssage

Expected results

SQL Routine to run successfully and return 0

Actual results

Website error message:

Trino Error
trino error: line 5:30: mismatched input '<EOF>'. Expecting: '%', '*', '+', '-', '.', '/', ';', 'AT', '[', '||'

This may be triggered by:
Issue 1002 - The database returned an unexpected error. 

Superset Error message:

2023-12-01 12:52:33 Triggering query_id: 41
2023-12-01 12:52:33 2023-12-01 20:52:33,077:INFO:superset.sqllab.commands.execute:Triggering query_id: 41
2023-12-01 12:52:33 SQLite Database support for metadata databases will be removed             in a future version of Superset.
2023-12-01 12:52:33 2023-12-01 20:52:33,098:WARNING:superset.utils.celery:SQLite Database support for metadata databases will be removed             in a future version of Superset.
2023-12-01 12:52:33 Query 41: Executing 8 statement(s)
2023-12-01 12:52:33 2023-12-01 20:52:33,113:INFO:superset.sql_lab:Query 41: Executing 8 statement(s)
2023-12-01 12:52:33 Query 41: Set query to 'running'
2023-12-01 12:52:33 2023-12-01 20:52:33,113:INFO:superset.sql_lab:Query 41: Set query to 'running'
2023-12-01 12:52:33 Query 41: Running statement 1 out of 8
2023-12-01 12:52:33 2023-12-01 20:52:33,126:INFO:superset.sql_lab:Query 41: Running statement 1 out of 8
2023-12-01 12:52:33 2023-12-01 20:52:33,152:DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): host.docker.internal:8080
2023-12-01 12:52:33 2023-12-01 20:52:33,163:DEBUG:urllib3.connectionpool:http://host.docker.internal:8080 "POST /v1/statement HTTP/1.1" 200 328
2023-12-01 12:52:33 2023-12-01 20:52:33,176:DEBUG:urllib3.connectionpool:http://host.docker.internal:8080 "GET /v1/statement/queued/20231201_205233_00062_fcdsq/y76761f7eb08c441e8c30f713ad964721e8eb9a87/1 HTTP/1.1" 200 1063
2023-12-01 12:52:33 SupersetErrorsException
2023-12-01 12:52:33 Traceback (most recent call last):
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
2023-12-01 12:52:33     rv = self.dispatch_request()
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
2023-12-01 12:52:33     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 127, in wraps
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 121, in wraps
2023-12-01 12:52:33     duration, response = time_function(f, self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/core.py", line 1526, in time_function
2023-12-01 12:52:33     response = func(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 93, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/log.py", line 255, in wrapper
2023-12-01 12:52:33     value = f(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/sqllab/api.py", line 310, in execute_sql_query
2023-12-01 12:52:33     command_result: CommandResult = command.run()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 121, in run
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 103, in run
2023-12-01 12:52:33     status = self._run_sql_json_exec_from_scratch()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 161, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 156, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     return self._sql_json_executor.execute(
2023-12-01 12:52:33   File "/app/superset/sqllab/sql_json_executer.py", line 111, in execute
2023-12-01 12:52:33     raise SupersetErrorsException(
2023-12-01 12:52:33 superset.exceptions.SupersetErrorsException: [SupersetError(message="trino error: line 5:30: mismatched input '<EOF>'. Expecting: '%', '*', '+', '-', '.', '/', ';', 'AT', '[', '||'", error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Trino', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
2023-12-01 12:52:33 2023-12-01 20:52:33,191:WARNING:superset.views.base:SupersetErrorsException
2023-12-01 12:52:33 Traceback (most recent call last):
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
2023-12-01 12:52:33     rv = self.dispatch_request()
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
2023-12-01 12:52:33     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 127, in wraps
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 121, in wraps
2023-12-01 12:52:33     duration, response = time_function(f, self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/core.py", line 1526, in time_function
2023-12-01 12:52:33     response = func(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 93, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/log.py", line 255, in wrapper
2023-12-01 12:52:33     value = f(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/sqllab/api.py", line 310, in execute_sql_query
2023-12-01 12:52:33     command_result: CommandResult = command.run()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 121, in run
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 103, in run
2023-12-01 12:52:33     status = self._run_sql_json_exec_from_scratch()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 161, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 156, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     return self._sql_json_executor.execute(
2023-12-01 12:52:33   File "/app/superset/sqllab/sql_json_executer.py", line 111, in execute
2023-12-01 12:52:33     raise SupersetErrorsException(
2023-12-01 12:52:33 superset.exceptions.SupersetErrorsException: [SupersetError(message="trino error: line 5:30: mismatched input '<EOF>'. Expecting: '%', '*', '+', '-', '.', '/', ';', 'AT', '[', '||'", error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Trino', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
2023-12-01 12:52:33 192.168.65.1 - - [01/Dec/2023:20:52:33 +0000] "POST /api/v1/sqllab/execute/ HTTP/1.1" 500 337 "http://localhost:8088/superset/sqllab" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
2023-12-01 12:52:33 192.168.65.1 - - [01/Dec/2023:20:52:33 +0000] "PUT /tabstateview/4 HTTP/1.1" 200 1 "http://localhost:8088/superset/sqllab" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

Screenshots

Screenshot 2023-12-01 at 12 51 39 PM

Environment

(please complete the following information):

  • browser type and version: Chrome Version 119.0.6045.159 (Official Build) (arm64)
  • superset version: 3.0.1
  • python version: Python 3.9.18
  • node.js version: N/A
  • any feature flags active: N/A

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

I was able to track this down to the SQL statement parsing logic. Specifically, the db_engine_spec.run_multiple_statements_as_one variable is used to determine whether to parse SQL with ; characters into multiple SQL statements. The TrinoEngineSpec and PrestoBaseEngineSpec both do not include this variable. So run_multiple_statements_as_one is set to False by the BaseEngineSpec.

I was able to set run_multiple_statements_as_one to True in TrinoEngineSpec and got the SQL above to work. However, this changes the default behavior of SQL Lab for Trino which allows running multiple SQL statements separated by ;.

Comment From: mosabua

Thank you for filing this @kevinjqliu !

DBeaver has a similar problem. I expect multiple clients will fall over this problem since they assume ; to be the end of a statement..

For functions however it would be the END or the function for inline function (see your example

or for CREATE FUNCTION usage the END;

If anyone needs help with this logic.. ping me

https://github.com/dbeaver/dbeaver/issues/21808

Comment From: mosabua

Btw here is the logic to figure out the details from the Trino CLI https://github.com/trinodb/trino/blob/master/client/trino-cli/src/main/java/io/trino/cli/lexer/StatementSplitter.java

Comment From: rusackas

@betodealmeida do you think the sqlglot migration would help at all with this sort of issue?

Comment From: kevinjqliu

Seems like sqlglot can help solve this issue.

It doesn't work out of the box though.

from sqlglot import parse

sql = """FUNCTION simple_case(a bigint)
  RETURNS varchar
  BEGIN
    CASE a
      WHEN 0 THEN RETURN 'zero';
      WHEN 1 THEN RETURN 'one';
      ELSE RETURN 'more than one or negative';
    END CASE;
    RETURN NULL;
  END
""" 

parse(sql, dialect='trino')

Returns an error

sqlglot.errors.ParseError: Invalid expression / Unexpected token. Line 1, Col: 21.
  FUNCTION simple_case(a bigint)
  RETURNS varchar
  BEGIN
    CASE a
      WHEN 0 THEN RETURN 'zero';
      WHEN 1 THEN RE

Comment From: betodealmeida

Even if this is not supported by sqlglot it should work in Superset as is — at least if DML is enabled. Let me take a look.

Comment From: rusackas

We're no longer supporting 3.x, and this has been quiet for upward of a year. Tempted to close this as stale, unless it can be reproduced in 4.x or newer.

Comment From: kevinjqliu

not sure if this is still relevant, feel free to close it

Comment From: mosabua

Somebody would have to test - here are the docs https://trino.io/docs/current/udf/sql.html

Ideally you could then also test the newer Python UDF support - https://trino.io/docs/current/udf/python.html

Comment From: FurcyPin

Yes, this issue is still relevant.

Here is an example of SQL query that is valid in the latest version of Trino :

WITH FUNCTION meaning_of_life()
  RETURNS tinyint
  BEGIN
    DECLARE a tinyint DEFAULT CAST(6 as tinyint);
    DECLARE b tinyint DEFAULT CAST(7 as tinyint);
    RETURN a * b;
  END
SELECT meaning_of_life() 

(I took it from the official documentation, with a slight modification because I had to cast 6 and 7 into tinyints for it to work...)

As can be seen in the screenshots below, it works in Trino version 475 Image

But it does not work in Superset 5.0.0rc1

Image

The crash happens at this line, when sqlparse.parse is called, because sqlparse does not know how to parse procedural SQL (BEGIN ... END statements) correctly.

I think the right solution would be to fix the sqlparse library to prevent it from splitting BEGIN ... END statements.

I created a new issue in sqlparse's repo (https://github.com/andialbrecht/sqlparse/issues/809) and I can start working on a MR on the sqlparse repo next week:

P.S: While investigating, I also took a look at sqlglot, but sqlglot does not have a split method like sqlparse does. I started working on adding one, but if the issue get fixed in sqlparse it won't be necessary... unless if you want to rely solely on sqlglot and stop using sqlparse ?

The advantage of switching to sqlglot would be if some SQL dialects ever require an even more exotic way of splitting, it could be added there.

Comment From: betodealmeida

P.S: While investigating, I also took a look at sqlglot, but sqlglot does not have a split method like sqlparse does. I started working on adding one, but if the issue get fixed in sqlparse it won't be necessary... unless if you want to rely solely on sqlglot and stop using sqlparse ?

The advantage of switching to sqlglot would be if some SQL dialects ever require an even more exotic way of splitting, it could be added there.

We've moved on to sqlglot, and sqlparse is no longer used. See https://github.com/apache/superset/issues/26786.

I tested this with the latest sqlglot and it stlil doesn't work:

>>> import sqlglot
>>> sqlglot.__version__
'26.25.3'
>>> sqlglot.parse("""
... WITH FUNCTION meaning_of_life()
...   RETURNS tinyint
...   BEGIN
...     DECLARE a tinyint DEFAULT CAST(6 as tinyint);
...     DECLARE b tinyint DEFAULT CAST(7 as tinyint);
...     RETURN a * b;
...   END
... SELECT meaning_of_life()
... """, "trino")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/__init__.py", line 102, in parse
    return Dialect.get_or_raise(read or dialect).parse(sql, **opts)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/dialects/dialect.py", line 1010, in parse
    return self.parser(**opts).parse(self.tokenize(sql), sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 1584, in parse
    return self._parse(
           ^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 1653, in _parse
    expressions.append(parse_method(self))
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 1894, in _parse_statement
    expression = self._parse_set_operations(expression) if expression else self._parse_select()
                                                                           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 3180, in _parse_select
    cte = self._parse_with()
          ^^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 3313, in _parse_with
    cte = self._parse_cte()
          ^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 3356, in _parse_cte
    this=self._parse_wrapped(self._parse_statement),
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 7189, in _parse_wrapped
    self.raise_error("Expecting (")
  File "/Users/beto/.pyenv/versions/3.12.4/envs/shillelagh-3.12/lib/python3.12/site-packages/sqlglot/parser.py", line 1697, in raise_error
    raise error
sqlglot.errors.ParseError: Expecting (. Line 2, Col: 29.

WITH FUNCTION meaning_of_life()
  RETURNS tinyint
  BEGIN
    DECLARE a tinyint DEFAULT CAST(6 as tinyint);
    DECLARE b tinyint

Comment From: betodealmeida

https://github.com/tobymao/sqlglot/issues/5178