I tried reading an HDF5 file with the latest pandas and got this import error:
ImportError: Missing optional dependency 'pytables'. Use pip or conda to install pytables.
So I tried `pip install pytables` and got this error:
ERROR: Could not find a version that satisfies the requirement pytables (from versions: none)
ERROR: No matching distribution found for pytables
So then I searched PyPI, and apparently there is no package named `pytables`: https://pypi.org/search/?q=pytables
I did find the PyTables project on GitHub though, which says to use `pip install tables` to install it. After installing `tables`, the HDF5 read operation worked.
So, we need to install tables, not pytables, which is definitely confusing and not obvious. I think it would be very helpful if the error message indicated this to avoid having to go through the search process above.
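For anyone hitting the same error, here is a minimal sketch of a check for whether PyTables is importable. The helper name `pytables_hint` is made up for illustration. The install names in the hint are the ones reported in this thread (`tables` on PyPI, `pytables` on conda).

```python
import importlib.util


def pytables_hint() -> str:
    """Return a short hint about whether PyTables can be imported.

    PyTables is imported as `tables`, whatever name was used to install it.
    """
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec("tables") is not None:
        return "PyTables is importable as `tables`."
    return (
        "PyTables is not installed. Try `pip install tables` "
        "or `conda install pytables`."
    )


print(pytables_hint())
```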
Comment From: user27182
Replacing these lines: https://github.com/pandas-dev/pandas/blob/c708e152c42f81b17cf6e47f7939a86e4c3fc77f/pandas/compat/_optional.py#L151-L157 with:
    package_name = INSTALL_MAPPING.get(name, name)
    msg = (
        f"Missing optional dependency {package_name}. {extra} "
        f"Use pip or conda to install package `{name}`."
    )
and also changing this line: https://github.com/pandas-dev/pandas/blob/c708e152c42f81b17cf6e47f7939a86e4c3fc77f/pandas/compat/_optional.py#L73 to:
    "tables": "PyTables",
produces a much clearer and more helpful error:
ImportError: Missing optional dependency PyTables. Use pip or conda to install package `tables`.
Comment From: KevsterAmp
Take
Comment From: user27182
Just a heads up that I don't think the fix I suggested will work as-is... the install mapping has the form {"import name" : "PyPI package name"} https://github.com/pandas-dev/pandas/blob/c708e152c42f81b17cf6e47f7939a86e4c3fc77f/pandas/compat/_optional.py#L62-L74
but for PyTables, we do `import tables` and `pip install tables`. So according to the mapping convention, this should be `tables : tables` for PyTables... which is redundant, and so PyTables should be removed from the mapping altogether if this convention is used.
Instead, perhaps the mapping should be modified to include three names, with the import name as a key, and install name and colloquial name as a tuple for its value, e.g.:
    # A mapping from import name to package name (on PyPI) and colloquial
    # name for packages where these names are different.
    INSTALL_MAPPING = {
        "bs4": ("beautifulsoup4", "Beautiful Soup"),
        "bottleneck": ("Bottleneck", "Bottleneck"),
        "jinja2": ("Jinja2", "Jinja2"),
        "lxml.etree": ("lxml", "lxml"),
        "odf": ("odfpy", "odfpy"),
        "python_calamine": ("python-calamine", "python-calamine"),
        "sqlalchemy": ("SQLAlchemy", "SQLAlchemy"),
        "tables": ("tables", "PyTables"),
    }
And then inside of `import_optional_dependency`, do:
    other_names = INSTALL_MAPPING.get(name)
    if other_names:
        import_name = name
        pypi_name = other_names[0]
        colloquial_name = other_names[1]
    else:
        import_name = name
        pypi_name = name
        colloquial_name = name
    msg = (
        f"Missing optional dependency {colloquial_name}. "
        f"Unable to import {import_name}. {extra} "
        f"Use pip or conda to install package `{pypi_name}`."
    )
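As a rough, self-contained sketch of this proposal: the function name `build_missing_dependency_message` is hypothetical, the mapping is trimmed to two entries, and none of this is pandas' actual code. An empty `extra` is dropped so the message has no doubled spaces.

```python
# Hypothetical stand-alone version of the three-name mapping idea.
# Keys are import names; values are (PyPI name, colloquial name).
INSTALL_MAPPING = {
    "bs4": ("beautifulsoup4", "Beautiful Soup"),
    "tables": ("tables", "PyTables"),
}


def build_missing_dependency_message(name: str, extra: str = "") -> str:
    # Fall back to the import name when a package has no special names.
    pypi_name, colloquial_name = INSTALL_MAPPING.get(name, (name, name))
    parts = [
        f"Missing optional dependency {colloquial_name}.",
        f"Unable to import {name}.",
        extra,
        f"Use pip or conda to install package `{pypi_name}`.",
    ]
    # Skip empty parts (such as a blank `extra`) before joining.
    return " ".join(part for part in parts if part)


print(build_missing_dependency_message("tables"))
# Missing optional dependency PyTables. Unable to import tables. Use pip or conda to install package `tables`.
```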
Comment From: rhshadrach
PyTables uses `pytables` on conda and `tables` on PyPI. If the current wording is confusing, I would propose we merely say `Use pip or conda to install the package` and leave it at that.
Comment From: user27182
> PyTables uses `pytables` on conda and `tables` on PyPI.
It seems the issue is perhaps more with the naming conventions adopted by PyTables. I think this is the source of the confusion... and so perhaps it's not reasonable to expect pandas to make sure its error messages account for poor naming choices in third-party packages.
> If the current wording is confusing, I would propose we merely say `Use pip or conda to install the package` and leave it at that.
The import error in this case is from `import tables`, so are you proposing changing this
ImportError: Missing optional dependency 'pytables'. Use pip or conda to install pytables.
To this?
ImportError: Missing optional dependency 'pytables'. Use pip or conda to install the package.
With no mention of the import name `tables`?
To me the confusing part was that the message made no mention of `tables`, yet that's both what I need to install (with pip) and import (with python), so this omission is part of the issue.
Why not include both the key and value from the mapping dict in the message? That way, both `tables` and `pytables` will be mentioned. That should give a hint to the user to try installing using one name, but if that fails then try installing using the other name.
Comment From: rhshadrach
I'd be fine with "`import tables` failed, use pip or conda to install the pytables package". But we shouldn't suggest that users look for either the import name or the package name because, as this example shows, it doesn't need to be either one.
Comment From: user27182
> I'd be fine with "`import tables` failed, use pip or conda to install the pytables package". But we shouldn't suggest that users look for either the import name or the package name because, as this example shows, it doesn't need to be either one.
+1. This message is much clearer, and I think would resolve this issue.
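For reference, the agreed wording could look roughly like the sketch below. The helper `import_or_explain` and the `DISPLAY_NAMES` mapping are hypothetical names chosen for illustration, not part of pandas.

```python
import importlib

# Hypothetical mapping from import name to the package name shown in the message.
DISPLAY_NAMES = {"tables": "pytables", "bs4": "beautifulsoup4"}


def import_or_explain(import_name: str):
    """Import a module, or raise an ImportError using the proposed wording."""
    try:
        return importlib.import_module(import_name)
    except ImportError as err:
        # Use the import name itself when no display name is registered.
        package = DISPLAY_NAMES.get(import_name, import_name)
        raise ImportError(
            f"`import {import_name}` failed, "
            f"use pip or conda to install the {package} package."
        ) from err


try:
    import_or_explain("tables")
except ImportError as err:
    print(err)
```

The message names the failing import first, so the reader sees `tables` even though the package is referred to as `pytables`.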