Pandas version checks
-
[X] I have checked that this issue has not already been reported.
-
[X] I have confirmed this bug exists on the latest version of pandas.
-
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('mysql+pymysql://user:pwd@host/Data', echo=True)
with db.engine.begin() as conn:
df = pd.read_sql_table('LocationIDs', conn.conection)
Traceback (most recent call last):
File "example.py", line 148, in <module>
df = pd.read_sql_table('LocationIDs', conn.connection)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 385, in read_sql_table
if not pandas_sql.has_table(table_name):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 2865, in has_table
return len(self.execute(query, [name]).fetchall()) > 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 2686, in execute
raise ex from exc
pandas.errors.DatabaseError: Execution failed on sql '
SELECT
name
FROM
sqlite_master
WHERE
type IN ('table', 'view')
AND name=?;
': not all arguments converted during string formatting
python-BaseException
Issue Description
-
The connection to the database does not work. Moreover, this error is similar for all SQL methods that use the internal method
pandasSQL_builder()
. https://github.com/pandas-dev/pandas/blob/d96646219618e007f64ee49e0a6e20f4aea761b5/pandas/io/sql.py#L870 -
Although I explicitly specify a working MySQL connection, the internal
pandasSQL_builder()
method changes it to an SQLite connection without any notification. Only after several hours of research and googling I noticed on stackoverflow a post about this problem and guessed that the problem refers with SQLite. -
pandasSQL_builder()
requirescon
to be asqlalchemy.engine.Connectable
instance . Such instances in sqlalchemy are theengine
andconn
objects (with db.engine.begin() as conn:
), but they do not have acursor
object, which is required. Whereasconn.connection
has acursor
object, but it is rejected because it is not aConnectable
instance. As a result, connecting via sqlalchemy is impossible. https://github.com/pandas-dev/pandas/blob/d96646219618e007f64ee49e0a6e20f4aea761b5/pandas/io/sql.py#L890 -
Also
pandasSQL_builder()
does not detect thesqlalchemy
module I have installed. This detection is necessary for the above check and for the ability to specify the database URL in the form of a literal (likepd.read_sql_table(tablename, con='mysql+pymysql://user:pwd@host/Data' )
). https://github.com/pandas-dev/pandas/blob/d96646219618e007f64ee49e0a6e20f4aea761b5/pandas/io/sql.py#L885
As a result, no connection via sqlalchemy is possible.
I have a suspicion that the problem may be partitial linked that I'm using SQLAlchemy v1.4.49, as required by some of my code's dependencies, since many Python modules stlll not support SQLAlchemy v2. But then I would like Pandas support for the previous version of Sqlalchemy, or locking it as a dependency on a specific version up to which Pandas has support.
- Also I found issue in the method that looks for installed SQLAlchemy, but it doesn't find it. This detects the SQLAlchemy installed. But while the minimum version is v2 set in it (
VERSIONS = {"sqlalchemy": "2.0.0",}
) and a notification is assigned there that Pandas requires a version no lower than this (msg
at line 175). But since the detection is made with theignore
argument (sqlalchemy = import_optional_dependency("sqlalchemy", errors="ignore")
andmod = import_optional_dependency(modname, errors="ignore")
inpd.show_versions()
), the message is not shown and is returned the result isNone
, as if no installed SQLAlchemy was found. https://github.com/pandas-dev/pandas/blob/d96646219618e007f64ee49e0a6e20f4aea761b5/pandas/compat/_optional.py#L171
Expected Behavior
Should work: * Connect via connection SQLAlchemy. * Connect via literal database URL. * There should be no implicit silent connection to non-existent SQLite, when a MySQL (or other) connection is explicitly specified. * Should show SQLAlchemy installed, whereas currently it shows no installation when SQLAlchemy is installed. * If Pandas requires SQLAlchemy to be a "minimum version" of 2.0.0, this must be reported.
Installed Versions
Comment From: vladiscripts
Curiously, another method works: df = pd.read_sql('SELECT * FROM LocationIDs', conn.connection)
. Although it also does not define the connection correctly, PandasSQL_builder()
gives the MySQL connection the wrong sqlite class SQLiteDatabase
.
https://github.com/pandas-dev/pandas/blob/d96646219618e007f64ee49e0a6e20f4aea761b5/pandas/io/sql.py#L690
Comment From: Animesh-Shukla
take