Bug description
The bug: Superset emails a broken CSV report when the chart in question is a table with server pagination enabled.
Steps to recreate:
1. Create a chart of the table type with server pagination enabled.
2. Save the chart.
3. Schedule a report: pick the newly created chart, choose the content format "send as CSV", configure the email notification method, and save.
4. Wait for the report to trigger.
5. Find a single "broken" CSV in your email.
6. Note that not all tools can open the CSV (Google Drive and Google Sheets manage to open it).
The CSV looks like this (preview from Google Drive):
What I was able to figure out:
- This only happens when you get a CSV from Superset through the reporting functionality
- It does not happen when you click Download -> Export as CSV on a chart, whether in the chart editor or in a dashboard; there you get a .zip file with two .csv files, one with the table data and a second one with a row count
- If you change the extension of the file attached to the email report from .csv to .zip, it turns out to be a .zip with two .csv files in it (the same .zip file you would get from Download -> Export as CSV)
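The last finding can be checked without renaming the file. Below is a minimal Python sketch, not Superset code: `inspect_report_attachment` and `build_sample_attachment` are hypothetical names, and the sample archive only imitates the two-file layout described in this report.

```python
import io
import zipfile


def inspect_report_attachment(payload: bytes) -> dict[str, str]:
    """Return {member_name: text} if payload is a ZIP archive, else an empty dict."""
    buffer = io.BytesIO(payload)
    # is_zipfile accepts a file-like object and looks for the ZIP end-of-archive record.
    if not zipfile.is_zipfile(buffer):
        return {}
    with zipfile.ZipFile(buffer) as archive:
        return {
            name: archive.read(name).decode("utf-8")
            for name in archive.namelist()
        }


def build_sample_attachment() -> bytes:
    """Build a stand-in for the emailed file (assumed layout: data plus row count)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("query_1.csv", ",partition_date\n0,2025-01-23\n")
        archive.writestr("query_2.csv", ",rowcount\n0,27820\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for name, text in inspect_report_attachment(build_sample_attachment()).items():
        print(name)
        print(text)
```

To run it against the real attachment, read the downloaded `.csv` file in binary mode and pass its bytes to `inspect_report_attachment`; an empty result means the file is a plain CSV.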
Proof: https://github.com/user-attachments/assets/5537ab8d-217c-4bf9-af30-f831b4e8c76c
rpedro@NULL:~/Downloads$ cat test\ bug_\ test\ -\ bug\ -\ broken\ csv\ report.csv
PK!"���
query_1.csv,partition_date,metadata.service_version
0,2025-01-23,6.1.3-rc.2
1,2025-01-29,6.1.4-rc.6
2,2025-01-29,6.1.4-rc.6
3,2025-01-29,6.1.4-rc.6
4,2025-01-29,6.1.4-rc.6
5,2025-01-29,6.1.4-rc.6
6,2025-01-29,6.1.4-rc.6
7,2025-01-29,6.1.4-rc.6
8,2025-01-29,6.1.4-rc.6
9,2025-01-27,6.1.4-rc.1
PK!ҢK
query_2.csv,rowcount
0,27820
PK!"���
�query_1.csvPK!ҢK
�Bquery_2.csvPKr}
Is this the intended behavior for this particular use of reports? Should reports offer more export options (e.g. a flag to include the row count, or sending as a .zip file)?
Notes:
- I've only tested the email notification method.
- To mitigate the problem, I've removed server pagination and set a high LIMIT value (but I would like to run this with no LIMIT).
Screenshots/recordings
No response
Superset version
4.1.2
Python version
3.10
Node version
I don't know
Browser
Firefox
Additional context
No response
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
Hey @ruifpedro! 👋 I'm here to help you with any bugs, questions, or contributions you have for Superset. Let's squash those bugs together!
I found a similar open issue that might be related to the problem you're experiencing. When server pagination is enabled, Superset can execute multiple queries, and the results are bundled into a ZIP file. However, the email report attachment is still labeled with a .csv extension instead of .zip. This is likely why you're seeing a "broken" CSV file that actually contains a ZIP archive with multiple CSV files inside.
The suggested solution involves modifying the logic that assigns the file extension in the Content-Disposition header of the email attachment. The relevant code for setting the file extension is located in the superset/utils/core.py file, where the filename parameter is set. You may need to adjust the code to correctly set the file extension based on whether the results are being compressed into a ZIP file or not [1].
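As a rough illustration of choosing the extension from the payload itself, here is a minimal Python sketch. It is not taken from superset/utils/core.py; `attachment_filename` is a hypothetical helper, and checking the bytes with `zipfile.is_zipfile` is an assumption about how the decision could be made.

```python
import io
import zipfile


def attachment_filename(base_name: str, payload: bytes) -> str:
    """Pick .zip or .csv for an attachment based on what the bytes actually contain."""
    # Hypothetical rule: if the payload parses as a ZIP archive, label it as one.
    extension = "zip" if zipfile.is_zipfile(io.BytesIO(payload)) else "csv"
    return f"{base_name}.{extension}"
```

With a rule like this, a single-query result would keep its .csv name, while a multi-query bundle would be named .zip so that mail clients and spreadsheet tools open it correctly.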
Comment From: rusackas
Closing in favor of https://github.com/apache/superset/issues/32858 - this seems to be a duplicate - but correct me if I'm mistaken!