Feature Type
- [ ] Adding new functionality to pandas
- [X] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
This isn't a problem right now, but I just wanted to call it out ahead of 3.0:
https://github.com/pandas-dev/pandas/blob/8b77271393efc678b9897e009c3201e6eeef57fc/pandas/io/parquet.py#L433-L437
This won't really make sense in 3.0 and onwards, because:
https://github.com/pandas-dev/pandas/blob/8b77271393efc678b9897e009c3201e6eeef57fc/doc/source/whatsnew/v2.2.0.rst?plain=1#L87
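For readers who don't want to follow the links, here is a rough sketch of the kind of "auto" fallback being discussed (try 'pyarrow', fall back to 'fastparquet' if it is unavailable). The helper name `resolve_parquet_engine` and its `is_installed` parameter are made up for illustration and are not pandas API; the real logic lives in `pandas/io/parquet.py` and may differ in detail.

```python
from importlib.util import find_spec


def _default_is_installed(name):
    # True if a top-level module with this name can be found.
    return find_spec(name) is not None


def resolve_parquet_engine(engine="auto", is_installed=_default_is_installed):
    """Hypothetical helper, not pandas API: pick a parquet engine name."""
    valid = ("pyarrow", "fastparquet")
    if engine != "auto":
        # An explicit choice is honoured as-is (after validation).
        if engine not in valid:
            raise ValueError(f"engine must be one of {valid} or 'auto'")
        return engine
    # "auto": prefer pyarrow, fall back to fastparquet.
    for candidate in valid:
        if is_installed(candidate):
            return candidate
    raise ImportError("no parquet engine found; install pyarrow or fastparquet")
```

If pyarrow is always installed, the second iteration of the loop is never reached, which is why the fallback wording stops making sense.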
Feature Description
I guess just always use pyarrow unless explicitly told to use fastparquet?
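As a sketch of what "explicitly told to use fastparquet" can look like today: the engine can be chosen through the `io.parquet.engine` option or through the `engine=` argument. The file name in the commented-out line is a placeholder, and that line is left commented so the snippet runs without a parquet file or fastparquet installed.

```python
import pandas as pd

# Remember whatever the option is currently set to.
default = pd.get_option("io.parquet.engine")

# Temporarily select fastparquet for everything inside the block.
with pd.option_context("io.parquet.engine", "fastparquet"):
    inside = pd.get_option("io.parquet.engine")

# The previous value is restored once the block exits.
restored = pd.get_option("io.parquet.engine")

# Per-call selection (placeholder path, needs fastparquet installed):
# df = pd.read_parquet("data.parquet", engine="fastparquet")
```

Either form keeps working regardless of how the default is resolved, so changing the default would not affect users who already name the engine.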
Alternative Solutions
N/A
Additional Context
No response
Comment From: martindurant
I guess just always use pyarrow unless explicitly told to use fastparquet?
This would, I suppose, be functionally equivalent to the current behaviour. Already, many more things are tested with arrow only, which I would say is quite a shame. I may be biased, however.
Comment From: rhshadrach
Thanks for the report. I think we can just change the line
The default io.parquet.engine behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable.
to something like
The default io.parquet.engine behavior is to use 'pyarrow'.
Does that cover everything @davetapley?
Comment From: phofl
I think we should deprecate fastparquet altogether when 3.0 is here. There isn't really much use for it, and it doesn't support some things that arrow supports and that we use.
Reducing code complexity is another benefit of doing this.
Comment From: davetapley
@rhshadrach yeah, I might have been overthinking it; perhaps it is just a docs change.
I have no opinion on fastparquet support.