The same DataFrame.apply call is >4x slower when the data is stored in Arrow dtypes instead of NumPy dtypes.

Pandas version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this issue exists on the latest version of pandas.

  • [x] I have confirmed this issue exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np
import pyarrow as pa
import time

NUM_ROWS = 500_000

# NumPy-backed frame: A cycles through 0..29, B is always positive
df = pd.DataFrame({"A": np.arange(NUM_ROWS) % 30, "B": np.arange(NUM_ROWS) + 1.0})
print(df.dtypes)

# Same data converted to Arrow-backed dtypes
df2 = df.astype({"A": pd.ArrowDtype(pa.int64()), "B": pd.ArrowDtype(pa.float64())})
print(df2.dtypes)

t0 = time.time()
df.apply(lambda r: 0 if r.A == 0 else (r.B // r.A), axis=1)
print(f"Non-Arrow time: {time.time() - t0:.2f} seconds")

t0 = time.time()
df2.apply(lambda r: 0 if r.A == 0 else (r.B // r.A), axis=1)
print(f"Arrow time: {time.time() - t0:.2f} seconds")

Output with pandas 2.3.0 on a local M1 Mac (the same slowdown reproduces on the main branch):

A      int64
B    float64
dtype: object
A     int64[pyarrow]
B    double[pyarrow]
dtype: object
Non-Arrow time: 3.21 seconds
Arrow time: 16.66 seconds

Installed Versions

INSTALLED VERSIONS
------------------
commit                : 2cc37625532045f4ac55b27176454bbbc9baf213
python                : 3.13.5
python-bits           : 64
OS                    : Darwin
OS-release            : 24.3.0
Version               : Darwin Kernel Version 24.3.0: Thu Jan 2 20:24:16 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6000
machine               : arm64
processor             : arm
byteorder             : little
LC_ALL                : None
LANG                  : en_US.UTF-8
LOCALE                : en_US.UTF-8
pandas                : 2.3.0
numpy                 : 2.2.6
pytz                  : 2025.2
dateutil              : 2.9.0.post0
pip                   : 25.1.1
Cython                : 3.1.2
sphinx                : None
IPython               : 9.3.0
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : 4.13.4
blosc                 : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : 2025.5.1
html5lib              : None
hypothesis            : None
gcsfs                 : 2025.5.1
jinja2                : None
lxml.etree            : None
matplotlib            : 3.10.3
numba                 : 0.61.2
numexpr               : None
odfpy                 : None
openpyxl              : 3.1.5
pandas_gbq            : None
psycopg2              : 2.9.10
pymysql               : 1.4.6
pyarrow               : 19.0.0
pyreadstat            : None
pytest                : 8.4.1
python-calamine       : None
pyxlsb                : None
s3fs                  : 2025.5.1
scipy                 : 1.15.2
sqlalchemy            : 2.0.41
tables                : None
tabulate              : None
xarray                : None
xlrd                  : 2.0.1
xlsxwriter            : 3.2.5
zstandard             : 0.23.0
tzdata                : 2025.2
qtpy                  : None
pyqt5                 : None

Prior Performance

No response

Comment From: jbrockmendel

Any hotspots show up in profiling?

Comment From: ehsantn

Profiler output is a bit hard to read, as usual. Here are some snakeviz screenshots. fast_xs takes a different code path for ExtensionDtype, which looks suspicious to me; find_common_type and _from_sequence stand out in particular.

[snakeviz screenshot]

[snakeviz screenshot]
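
For anyone who wants to regenerate the profile, a minimal sketch using cProfile (df2 comes from the repro above; "arrow_apply.prof" is an arbitrary filename, and snakeviz can then open the dump):

import cProfile
import pstats

# Profile the Arrow-dtype apply and dump the stats to a file
cProfile.run(
    'df2.apply(lambda r: 0 if r.A == 0 else (r.B // r.A), axis=1)',
    "arrow_apply.prof",
)
# Quick text summary; for the visual view run: snakeviz arrow_apply.prof
pstats.Stats("arrow_apply.prof").sort_stats("cumulative").print_stats(15)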

Comment From: jbrockmendel

That's a tough one. In core.apply.FrameColumnApply.series_generator we have a fastpath that only works with numpy dtypes.
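
To confirm that per-row materialization is the expensive part, a quick micro-benchmark (using df and df2 from the repro above; numbers are illustrative and will vary):

import timeit

# Cost of building a single row Series via iloc, which goes through fast_xs
print(timeit.timeit(lambda: df.iloc[1000], number=10_000))   # NumPy-backed
print(timeit.timeit(lambda: df2.iloc[1000], number=10_000))  # Arrow-backed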

We might be able to get some mileage for EA Dtypes by changing

            for i in range(len(obj)):
                yield obj._ixs(i, axis=0)

to something like

dtype = ser.dtype
for i in range(len(obj)):
    # placeholder: the row-slicing part of fast_xs, skipping the per-row
    # interleaved_dtype/find_common_type call
    new_vals = the_part_of_fast_xs_after_interleaved_dtype_is_called()
    new_arr = type(ser.array)._from_sequence(new_vals, dtype=dtype)
    yield obj._constructor(new_arr, name=name)

That would save about 20% of the runtime by avoiding the per-row interleaved_dtype calls.
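
As a rough standalone illustration of that idea (iter_rows_single_dtype is a hypothetical name, and this assumes every column already shares one Arrow dtype, unlike the general fix, which would live in pandas.core.apply):

import pandas as pd

def iter_rows_single_dtype(df):
    # Hypothetical helper: resolve the common dtype once up front
    # instead of calling interleaved_dtype for every row.
    dtype = df.dtypes.iloc[0]
    arrays = [df[col].array for col in df.columns]
    arr_type = type(arrays[0])
    columns = df.columns
    for i in range(len(df)):
        vals = [arr[i] for arr in arrays]  # raw row values, bypassing fast_xs
        new_arr = arr_type._from_sequence(vals, dtype=dtype)
        yield pd.Series(new_arr, index=columns)

# e.g. with the repro's frame cast to a single Arrow dtype:
# rows = iter_rows_single_dtype(df2.astype("double[pyarrow]"))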