Pandas version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [x] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

Issue 1
import pyarrow as pa
array = pa.array([1.5, 2.5], type=pa.float64())
array.to_pandas(types_mapper={pa.float64(): pa.int64()}.get)

ArrowInvalid: Float value 1.5 was truncated converting to int64


Issue 2 
import pandas as pd
import pyarrow as pa
from decimal import Decimal

df = pd.DataFrame({"a": [Decimal("123.00")]}, dtype="string[pyarrow]")
df.to_parquet("decimal.pq", schema=pa.schema([("a", pa.decimal128(5))]))
result = pd.read_parquet("decimal.pq")
expected = pd.DataFrame({"a": ["123"]}, dtype="string[python]")

pd.testing.assert_frame_equal(result, expected)

AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="a") are different
Attribute "dtype" are different
[left]:  object
[right]: string[python]

Issue Description

Two issues have been observed when using pandas 2.2.3 with pyarrow >= 18.0.0. The failing test cases are pandas/tests/extension/test_arrow.py::test_from_arrow_respecting_given_dtype_unsafe and pandas/tests/io/test_parquet.py::TestParquetPyArrow::test_roundtrip_decimal:

  • Stricter float-to-int casting causes ArrowInvalid in tests like test_from_arrow_respecting_given_dtype_unsafe.

  • Decimal roundtrip mismatch: test_roundtrip_decimal fails due to dtype mismatches (object vs. string[python]) when reading back a decimal column written with a specified pyarrow schema.

These issues were not present with pyarrow==17.x.

Expected Behavior

  • Float-to-int casting should either handle truncation more gracefully (as in older versions), or the affected tests should be updated to skip or adjust for the new behavior.

  • Decimal roundtrips to parquet should preserve the pandas dtype, or the expected type coercion should be clearly documented.

Installed Versions

python  : 3.11.11
pandas  : 2.2.3
pyarrow : 19.0.1

Comment From: phoebecd

In newer versions of PyArrow, type identity is stricter which is why this code is now causing errors.

Issue 1: The issue is that types_mapper={pa.float64(): pa.int64()}.get is not reliable in newer versions of PyArrow. Each call to pa.float64() creates a new object, so the key in your dictionary does not match the instance passed internally by PyArrow. I worked around the issue by converting the array to pandas and then casting the floats to ints with truncation.

array = pa.array([1.5, 2.5], type=pa.float64())
s = array.to_pandas()  # float64 Series
s_int = s.astype(int)  # truncates toward zero: 1.5 -> 1, 2.5 -> 2

Issue 2: The issue is that both the values and the dtypes of result and expected differ. The result column is an object column holding a decimal value of 123.00, while the expected column is a string column holding "123". I worked around this by converting the result column to strings and stripping the trailing decimal places so that it matches the expected column.

df = pd.DataFrame({"a": [Decimal("123.00")]}, dtype="object")
df.to_parquet("decimal.pq", schema=pa.schema([("a", pa.decimal128(5))]))
result = pd.read_parquet("decimal.pq")
result["a"] = result["a"].apply(lambda x: str(x).split(".")[0] if isinstance(x, Decimal) else str(x))
result = result.astype({"a": "string"})

expected = pd.DataFrame({"a": ["123"]}, dtype="string[python]")
pd.testing.assert_frame_equal(result, expected)

Comment From: bhavya2109sharma

Thanks @phoebecd for the suggestions, but I am running the test cases implemented by pandas itself. AFAIK, pandas needs to fix these test cases in a newer version so that the errors caused by pyarrow's stricter type handling are resolved on the pandas side.

Comment From: rhshadrach

~@bhavya2109sharma~ @phoebecd - Unfortunately your response does not help with this issue and only adds noise that maintainers spend time going through. I suspect it was generated by AI. If that is the case, please do not merely post the AI response to an issue. Using it as an aid when crafting a response is okay, but you should first feel confident that what you post is likely to be helpful.

Comment From: rhshadrach

I haven't checked whether these failures still happen on 2.2.x, but I believe they do not occur on 2.3.0. Can you confirm, @bhavya2109sharma? If that's the case, we can close this, as the 2.2 series will not see any more patches.