Bug description
Hi everyone, I need your help.
When I try to import a CSV dataset via the Superset API endpoint `/api/v1/dataset/import/`, the response is `{"message": "OK"}`, which suggests that everything went well. The problem is that the dataset is not actually imported, and the following messages appear in the logs:
```
/app/superset/commands/importers/v1/utils.py:113: SAWarning: TypeDecorator EncryptedType() will not produce a cache key because the ``cache_ok`` attribute is not set to True. This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions. Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)
  for uuid, password in db.session.query(Database.uuid, Database.password).all()
/app/superset/commands/importers/v1/utils.py:118: SAWarning: TypeDecorator EncryptedType() ... (same warning as above)
  for uuid, password in db.session.query(SSHTunnel.uuid, SSHTunnel.password).all()
/app/superset/commands/importers/v1/utils.py:125: SAWarning: TypeDecorator EncryptedType() ... (same warning as above)
  ).all()
/app/superset/commands/importers/v1/utils.py:132: SAWarning: TypeDecorator EncryptedType() ... (same warning as above)
  ).all()
```
I'm importing without using an SSH tunnel. Is the use of an SSH tunnel mandatory? If not, how can I fix this problem?
Thank you in advance for your help.
How to reproduce the bug
- Call the endpoint `/api/v1/dataset/import/` with the `formData` and `passwords` parameters, without providing any SSH tunnel information.
- Once authenticated, add the bearer token and the CSRF token, then send the request (see the sketch below).
- You will receive `{"message": "OK"}` even though the dataset is not imported.
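For illustration, here is a minimal sketch of the reproduction in Python. The host, credentials, bundle name, and the YAML path inside the bundle are assumptions for the example, not values from this report:

```python
# Hypothetical reproduction script; host, credentials, and file names are
# placeholders, not values from this report.
import requests

BASE = "http://localhost:8088"  # assumed local Superset instance
session = requests.Session()

# Authenticate to obtain a bearer token.
access_token = session.post(
    f"{BASE}/api/v1/security/login",
    json={"username": "admin", "password": "admin", "provider": "db", "refresh": True},
).json()["access_token"]
headers = {"Authorization": f"Bearer {access_token}"}

# Fetch a CSRF token; it must accompany mutating requests.
headers["X-CSRFToken"] = session.get(
    f"{BASE}/api/v1/security/csrf_token/", headers=headers
).json()["result"]

# Upload the export bundle: `formData` carries the zip file and `passwords`
# maps database YAML paths inside the bundle to their passwords.
with open("example_import_dataset_superset.zip", "rb") as bundle:
    resp = session.post(
        f"{BASE}/api/v1/dataset/import/",
        headers=headers,
        files={"formData": bundle},
        data={"passwords": '{"databases/MyDatabase.yaml": "db_password"}'},
    )

# Prints 200 {"message": "OK"} even when the import silently fails.
print(resp.status_code, resp.json())
```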
Screenshots/recordings
example_import_dataset_superset.zip
Superset version
4.0.1
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Comment From: dosubot[bot]
Hey @kdaouda 👋, I'm here to help you with your Superset issue. I'm a bot that can assist you with bugs, questions, and becoming a contributor. Let's squash this bug together! 🤖
To fix the issue where the Superset API endpoint `/api/v1/dataset/import/` returns `{"message": "OK"}` but does not actually import the dataset, while the logs show warnings related to `TypeDecorator EncryptedType()`, you should ensure that the `ImportDatasetsCommand` is correctly processing the dataset contents and handling any encryption-related issues.

Here is a possible solution that adds error handling and logging around the `ImportDatasetsCommand` execution:
```python
# Inside the dataset import endpoint handler, after the upload has been
# read into `contents`:
if not contents:
    raise NoValidFilesFoundError()

# Optional secrets accompanying the bundle; any of them may be absent.
passwords = (
    json.loads(request.form["passwords"])
    if "passwords" in request.form
    else None
)
overwrite = request.form.get("overwrite") == "true"
sync_columns = request.form.get("sync_columns") == "true"
sync_metrics = request.form.get("sync_metrics") == "true"
ssh_tunnel_passwords = (
    json.loads(request.form["ssh_tunnel_passwords"])
    if "ssh_tunnel_passwords" in request.form
    else None
)
ssh_tunnel_private_keys = (
    json.loads(request.form["ssh_tunnel_private_keys"])
    if "ssh_tunnel_private_keys" in request.form
    else None
)
ssh_tunnel_priv_key_passwords = (
    json.loads(request.form["ssh_tunnel_private_key_passwords"])
    if "ssh_tunnel_private_key_passwords" in request.form
    else None
)

try:
    command = ImportDatasetsCommand(
        contents,
        passwords=passwords,
        overwrite=overwrite,
        sync_columns=sync_columns,
        sync_metrics=sync_metrics,
        ssh_tunnel_passwords=ssh_tunnel_passwords,
        ssh_tunnel_private_keys=ssh_tunnel_private_keys,
        ssh_tunnel_priv_key_passwords=ssh_tunnel_priv_key_passwords,
    )
    command.run()
    return self.response(200, message="OK")
except Exception as ex:
    # Log the full stack trace and surface the failure to the client
    # instead of silently returning 200.
    logger.error("Error importing datasets: %s", str(ex), exc_info=True)
    return self.response_500(message="Failed to import datasets")
```
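With this guard in place, a failed import returns HTTP 500 with "Failed to import datasets" instead of the misleading 200, and `exc_info=True` writes the full stack trace to the logs, so failures like the one reported above are no longer silent.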
Additionally, ensure that the encryption configuration is correctly set up in your application. The `EncryptedType` requires a `SECRET_KEY` to be specified in the app configuration. Here is an example of how the encryption setup should be initialized:
```python
from flask import Flask

# Assumed import path: EncryptedFieldFactory and SQLAlchemyUtilsAdapter
# live in superset.utils.encrypt in current Superset versions.
from superset.utils.encrypt import EncryptedFieldFactory, SQLAlchemyUtilsAdapter

app = Flask(__name__)
app.config["SECRET_KEY"] = "your_secret_key"  # placeholder value
app.config["SQLALCHEMY_ENCRYPTED_FIELD_TYPE_ADAPTER"] = SQLAlchemyUtilsAdapter

encrypted_field_factory = EncryptedFieldFactory()
encrypted_field_factory.init_app(app)
```
Make sure that the `SECRET_KEY` is correctly set in your application configuration. This key is essential for `EncryptedType` to function properly. If the `SECRET_KEY` is missing or incorrect, it could lead to issues with the encryption and decryption processes, which might be causing the dataset import to fail silently [1][2][3][4].
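As an aside, the SAWarning lines quoted in the report are SQLAlchemy statement-cache performance warnings, not errors, so they are a symptom rather than the cause of the silent failure. For the `SECRET_KEY` itself, here is a minimal sketch of how it is typically supplied in `superset_config.py`; the value shown is a placeholder, and `PREVIOUS_SECRET_KEY` together with `superset re-encrypt-secrets` applies only when rotating an existing key:

```python
# superset_config.py -- sketch only; replace the placeholder with a long
# random value, e.g. the output of `openssl rand -base64 42`.
SECRET_KEY = "CHANGE_ME_TO_A_LONG_RANDOM_STRING"

# Only when rotating an existing key: keep the old value here and run
# `superset re-encrypt-secrets` so stored credentials are re-encrypted.
# PREVIOUS_SECRET_KEY = "the_previous_key"
```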
To continue talking to Dosu, mention @dosu.
Comment From: rusackas
@kdaouda are you still facing this? I don't know the details personally, but it's been silent for a long time, so I wonder if others are facing this, or if it's still an issue in 4.1.1 or newer at all.
Comment From: rusackas
Assuming this is no longer an issue, I'll close it as not-planned. If anyone's facing it in the newly released 5.0.0, we'll happily reopen it.