Bug description
This bug describes the cause for a bug previously described in #22467 and later closed without explicit resolution. It offers a possible patch, but not a full fix.
This is the issue I encountered in superset v. 4.1.1 using Duckdb.
In the superset database connection (edit database --> Advanced --> Other additional settings), I wanted to pass some engine parameters, however this was failing.
What I found is that the parameters weren't being decoded correctly as part of the get_extra_params()
method of the base database engine class. This meant that when duckdb.py calls that method to get the extra parameters, you don't get a properly formed nested dictionary, instead you get a dictionary with the first key and then all other info as a string for the value.
So, if my input was:
{'connect_args': {'read_only': True}}
superset is passing back:
{"engine_params":"{'connect_args': {'read_only': True}}"}
(Note the extra quotes around the value)
It should be passing:
{"engine_params":{'connect_args': {'read_only': True}}}
This means that duckdb.py is failing on its get_extra_params()
method because it's trying to operate on a string instead of a dictionary.
One solution that worked for me was to fix the immediate problem - this worked for my use case:
# /superset/db_engine_specs/duckdb.py
from ast import literal_eval
@staticmethod
def get_extra_params(
database: Database, source: QuerySource | None = None
) -> dict[str, Any]:
"""
Add a user agent to be used in the requests.
"""
extra: dict[str, Any] = BaseEngineSpec.get_extra_params(database)
# >>> Start new code
for key in extra.key():
try:
extra[key] = literal_eval(extra[key])
except ValueError:
pass
# >>> End new code
engine_params: dict[str, Any] = extra.setdefault("engine_params", {})
connect_args: dict[str, Any] = engine_params.setdefault("connect_args", {})
config: dict[str, Any] = connect_args.setdefault("config", {})
custom_user_agent = config.pop("custom_user_agent", "")
delim = " " if custom_user_agent else ""
user_agent = get_user_agent(database, source)
user_agent = user_agent.replace(" ", "-").lower()
user_agent = f"{user_agent}/{VERSION_STRING}{delim}{custom_user_agent}"
config.setdefault("custom_user_agent", user_agent)
return extra
However, as literal_eval
is just turning a presumed string into its representation (or passing if not a string) the best solution is probably to fix the root cause problem of the malformed json being passed in the first place. Unfortunately, I don't know the code base well enough to know where to make that change.
Also, I assume that this could be a problem for databases other than duckdb. Given that it appears to be a problem from the Database object and base database engine class.
Screenshots/recordings
No response
Superset version
master / latest-dev
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [ ] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: danlewis85
@sapcode - might be of interest to you if still relevant.
Comment From: dosubot[bot]
Hey @danlewis85! I'm here to assist you with any bugs, questions, or contributions while you wait for a human maintainer. Let me know how I can help! Dosu
To address the issue of malformed JSON/dictionary in the get_extra_params()
method affecting the DuckDB connection, you should ensure that the extra
field of the database
object is correctly parsed as a JSON object before accessing it. The test case for the DuckDBEngineSpec
class demonstrates how the get_extra_params
method should handle the extra
field:
from superset.db_engine_specs.duckdb import DuckDBEngineSpec
database = mocker.MagicMock()
# Case when extra is an empty dictionary
database.extra = {}
assert DuckDBEngineSpec.get_extra_params(database) == {
"engine_params": {
"connect_args": {
"config": {"custom_user_agent": f"apache-superset/{VERSION_STRING}"}
}
}
}
# Case when extra is a JSON string
database.extra = json.dumps(
{"engine_params": {"connect_args": {"config": {"custom_user_agent": "my-app"}}}}
)
assert DuckDBEngineSpec.get_extra_params(database) == {
"engine_params": {
"connect_args": {
"config": {
"custom_user_agent": f"apache-superset/{VERSION_STRING} my-app"
}
}
}
}
Ensure that the extra
field is correctly parsed using json.loads()
if it is a JSON string. This will prevent issues with malformed JSON/dictionary structures [1].
Unfortunately, I didn't find any similar threads that might provide additional context or solutions.
To continue talking to Dosu, mention @dosu.
Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other
Comment From: rusackas
@betodealmeida @guenp might have interest/advice here too.
Comment From: rusackas
Is this still an issue as of 5.0.0 or on current master
? We're certainly open to a PR and/or further investigation.