Bug description
I have installed the latest version of Superset (4.0.1). However, when exporting query results to a CSV file, the Chinese characters and numbers in the file are always garbled. I have tried changing the encoding to utf-8, utf-8-sig, and gbk, but the issue persists.
How to reproduce the bug
Install Superset 4.0.1. Run a query that includes Chinese characters and numbers. Export the query results to a CSV file. Open the CSV file and observe that the Chinese characters and numbers are garbled.
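One way to narrow this down is to look at the raw bytes of the exported file rather than at how a spreadsheet renders it. The sketch below is an illustration only: `probe_encodings`, `starts_with_utf8_bom`, and the sample data are hypothetical names, not part of Superset, and it simply tries the three encodings mentioned in this report.

```python
import codecs

# The encodings tried in this report.
CANDIDATES = ("utf-8-sig", "utf-8", "gbk")


def probe_encodings(data: bytes, candidates=CANDIDATES) -> dict:
    """Map each candidate codec to the decoded text, or None if decoding fails."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def starts_with_utf8_bom(data: bytes) -> bool:
    """Report whether the bytes begin with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


# Stand-in for the bytes of an exported file (assumed content, not real output).
sample = "城市,数量\r\n北京,42\r\n".encode("utf-8")
report = probe_encodings(sample)
print("BOM present:", starts_with_utf8_bom(sample))
for codec_name, text in report.items():
    print(codec_name, "->", repr(text))
```

To inspect a real export, read it in binary mode (for example `open("export.csv", "rb").read()`, where the file name is a placeholder) and pass the bytes to `probe_encodings`. A codec that decodes without error is not necessarily the right one, so compare the decoded text against the query result.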
Screenshots/recordings
No response
Superset version
4.0.1
Python version
3.11
Node version
20.12.2
Browser
Chrome
Additional context
Expected Behavior: The Chinese characters and numbers should be displayed correctly in the exported CSV file.
Actual Behavior: The Chinese characters and numbers are garbled in the exported CSV file.
Environment:
Superset version: 4.0.1
Encoding attempted: utf-8, utf-8-sig, gbk
Additional Context: Please provide any additional information that might help in diagnosing the
Checklist
- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
Hey @TommyLeung-gj! I'm here to help you with the issue you're experiencing with Superset 4.0.1. I'm a bot that can assist you with bug fixes, answer your questions, and guide you on becoming a contributor. Let's dive into resolving the problem together!
To resolve the issue of Chinese characters and numbers being garbled when exporting query results to a CSV file in Superset 4.0.1, you need to ensure that the CSV file is encoded in UTF-8. The df_to_escaped_csv function in the superset/utils/csv.py file should be called with the appropriate encoding settings.
Here is the relevant part of the code with the necessary modification:
csv_data = df_to_escaped_csv(df, index=False, encoding='utf-8', **config["CSV_EXPORT"])
Ensure that the encoding='utf-8' parameter is included in the call to df_to_escaped_csv. This will help in correctly encoding Chinese characters and numbers in the exported CSV file [1][2].
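As a rough illustration of what the encoding setting is meant to control, the standard-library sketch below serializes rows to CSV text and only then encodes the text to bytes. It does not use Superset's code; `rows_to_csv_bytes` and the sample rows are made up for this example.

```python
import codecs
import csv
import io


def rows_to_csv_bytes(rows, encoding="utf-8-sig"):
    """Serialize rows to CSV text, then encode that text to bytes."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode(encoding)


rows = [["name", "amount"], ["张三", "123"]]
plain = rows_to_csv_bytes(rows, "utf-8")
with_bom = rows_to_csv_bytes(rows, "utf-8-sig")

# utf-8-sig prepends the UTF-8 byte order mark; the payload is otherwise identical.
has_bom = with_bom.startswith(codecs.BOM_UTF8)
same_payload = with_bom[len(codecs.BOM_UTF8):] == plain

# Reading UTF-8 bytes with a different codec is one way garbled text can appear.
misread = plain.decode("gbk", errors="replace")
print(has_bom, same_payload, repr(misread))
```

The point of the sketch is that the choice of codec only matters at the step where text becomes bytes, and again when a reader turns those bytes back into text. If either side uses a different codec, the characters will not match, regardless of what the other side was configured with.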
Comment From: rusackas
Related issue linked above ^^^
Might even be considered a duplicate, really, but they seem kind of unique.
Comment From: glenn-gj-liang
No, it still doesn't work after I changed the setting to encoding = "utf-8".
Comment From: glenn-gj-liang
Maybe something is going wrong in the async query process: Celery, or the Redis results backend cache.
Comment From: bionexit
+1
Comment From: Habeeb556
+2
Having the same issue with version 4.0.1, even though version 2.1.3 works fine with this parameter config.
Comment From: Habeeb556
The issue has been resolved by downgrading the package with the following command: pip install Werkzeug==2.3.8
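To confirm which Werkzeug version the running environment actually picks up (for example inside the web and worker processes), something like the following could be used. This is a minimal sketch using only `importlib.metadata`; the helper names are hypothetical, and it does not explain why the downgrade helps.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Extract the leading major number from a version such as '2.3.8'."""
    return int(version.split(".")[0])


werkzeug_version = installed_version("Werkzeug")
if werkzeug_version is None:
    print("Werkzeug is not installed in this environment")
else:
    print(f"Werkzeug {werkzeug_version} (major {major_version(werkzeug_version)})")
```

Running this in each process's environment helps rule out the case where the package was downgraded in one place while another process still imports the newer version.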
Comment From: bionexit
I downgraded Werkzeug to 2.3.8 but had no luck. What's your encoding option?
Mine is the following:
CSV_EXPORT = {"encoding": "utf-8-sig"}
Comment From: Habeeb556
Yes, this encoding. But did you face the same problem with version 2.1.3 or 3.1.3? Also, I got Chinese characters, but not with the English text.
Comment From: bionexit
It worked after I reloaded the Celery service. Thanks a lot, bro.
Comment From: Habeeb556
++ @TommyLeung-gj, could you confirm if this downgrade solves your issue or not? Also, what language are you using?
++ @bionexit, we would appreciate your feedback on which language's characters you encountered, so it can be reported to the Werkzeug team.
Comment From: foretony5211
I use Docker. How do I reload the Celery service? Thanks.
Comment From: wuqicyber
I tried downgrading to Werkzeug==2.3.8, and it works, thanks.
Comment From: ruifpedro
@glenn-gj-liang are you by chance downloading a CSV from a Table type chart which has server-side paging enabled? I found that server-side pagination caused Chinese characters / garbled characters to appear in the CSV file (will report this as a bug later).
Comment From: rusackas
Related, I think: https://github.com/apache/superset/pull/33720
Is anyone able to reproduce this on the 5.0.0 release candidates or on the master branch?