Feature Type

  • [ ] Adding new functionality to pandas

  • [X] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

This isn't a problem right now, but I just wanted to call it out ahead of 3.0:

This behavior:

https://github.com/pandas-dev/pandas/blob/8b77271393efc678b9897e009c3201e6eeef57fc/pandas/io/parquet.py#L433-L437

Won't really make sense in 3.0 and onwards, because:

https://github.com/pandas-dev/pandas/blob/8b77271393efc678b9897e009c3201e6eeef57fc/doc/source/whatsnew/v2.2.0.rst?plain=1#L87

Feature Description

I guess just always use pyarrow unless explicitly told to use fastparquet?
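To make the difference concrete, here is a minimal sketch contrasting the documented fallback with the suggestion above. It is not pandas' actual implementation: the function names `resolve_engine_fallback` and `resolve_engine_pyarrow_default` and the `installed` callback are hypothetical, chosen only for illustration.

```python
import importlib.util
from typing import Callable


def _installed(module: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module) is not None


def resolve_engine_fallback(
    engine: str = "auto",
    installed: Callable[[str], bool] = _installed,
) -> str:
    """Hypothetical model of the documented behaviour: try pyarrow, then fastparquet."""
    if engine != "auto":
        # An explicit choice is always respected.
        return engine
    for candidate in ("pyarrow", "fastparquet"):
        if installed(candidate):
            return candidate
    raise ImportError("no parquet engine found: install pyarrow or fastparquet")


def resolve_engine_pyarrow_default(engine: str = "auto") -> str:
    """Hypothetical model of the suggestion: 'auto' always means pyarrow."""
    return "pyarrow" if engine == "auto" else engine
```

If pyarrow is always installed, both functions return the same engine for every input, so the fallback branch is never reached.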

Alternative Solutions

N/A

Additional Context

No response

Comment From: martindurant

"I guess just always use pyarrow unless explicitly told to use fastparquet?"

This would, I suppose, be functionally equivalent to the current behaviour. Already, many more things are tested with arrow only, which I would say is quite a shame. I may be biased, however.

Comment From: rhshadrach

Thanks for the report. I think we can just change the line

The default io.parquet.engine behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable.

to something like

The default io.parquet.engine behavior is to use 'pyarrow'.

Does that cover everything @davetapley?
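For anyone who wants fastparquet regardless of how the default ends up being documented, the engine can be pinned explicitly through the same option. A small sketch, assuming pandas is installed; `option_context` is used so the setting does not leak beyond the block:

```python
import pandas as pd

# Temporarily pin the parquet engine instead of relying on the default.
with pd.option_context("io.parquet.engine", "fastparquet"):
    inside = pd.get_option("io.parquet.engine")

# Outside the block the option returns to whatever it was before.
print(inside)
```

Setting the option only records the choice; fastparquet itself is not needed until a parquet file is actually read or written. The same choice can be made per call with the `engine="fastparquet"` argument to `read_parquet` or `to_parquet`.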

Comment From: phofl

I think we should deprecate fastparquet altogether when 3.0 is here. There isn't really much use for it, and it doesn't support some things that arrow supports and that we use.

Reducing code complexity is another benefit of doing this.

Comment From: davetapley

@rhshadrach yeah, I might have been overthinking it; perhaps it is just a docs change.

I have no opinion on fastparquet support.