Research

  • [x] I have searched the [pandas] tag on StackOverflow for similar questions.

  • [x] I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

None

Question about pandas

Issue Description

TL;DR: Since pandas 2.0+, .tolist() and similar methods return NumPy types instead of native Python types, severely impacting user experience and data readability.

Problem Example

Before (pandas 1.x):

    df.index.tolist()
    # Returns: [0, 1, 2, 3, 4]  # Clean, readable

Now (pandas 2.x):

    df.index.tolist()
    # Returns: [np.int64(0), np.int64(1), np.int64(2), np.int64(3), np.int64(4)]  # Verbose, confusing

Impact on User Experience

  • Poor Readability: Results are cluttered with np.int64(), np.float64() wrappers

  • Debugging Nightmare: Harder to quickly scan and understand data

  • Display Issues: When printing or logging, output is unnecessarily verbose

  • User Confusion: Many users don't understand why they're seeing NumPy types

  • Breaking Change: Existing code expectations broken without clear migration path

Current Workarounds Are Painful

Users now need to write additional code for basic operations:

    # Instead of simple:
    indices = df.index.tolist()

    # We need:
    indices = [int(x) for x in df.index.tolist()]

The Core Problem

DataFrames are meant for data analysis and exploration. The primary use case is human-readable data inspection, not performance-critical numerical computation at the .tolist() level.

Suggested Solutions

  • Add a parameter: .tolist(native_types=True) (default True for user-facing methods)

  • Separate methods: Keep .tolist() for NumPy types, add .tolist_clean() for Python types

  • Configuration option: Allow users to set pandas behavior globally

  • Revert the change: Prioritize user experience over marginal performance gains

Why This Matters

Pandas' strength has always been its ease of use and intuitive behavior. This change sacrifices user experience for performance gains that most users don't need when calling .tolist(). The goal of data analysis is insight, not fighting with data types.

Request

Please consider reverting this behavior or providing a simple, built-in solution. The current situation forces every pandas user to write boilerplate code for basic data inspection.

Thank you for maintaining this incredible library. I hope we can find a solution that balances performance with the user-friendly experience that makes pandas great.

Environment:

pandas: 2.2.3
numpy: 1.26.4
Impact: All DataFrame operations returning lists

Comment From: simonjayhawkins

Thanks @COderHop for the report.

It appears that you have a good grasp of the issue. IIRC this has been reported/discussed before but I can't find it at this time.

> Breaking Change: Existing code expectations broken without clear migration path

I do not agree that this is true from the pandas perspective. NumPy made a change to its repr, and pandas continues to return NumPy types as before. Only the repr has changed, and that should not really be considered a pandas issue.
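To illustrate the distinction drawn here, the following is a minimal sketch that assumes only that NumPy is installed. It shows that np.int64(0) is a NumPy scalar under any NumPy version, and that what differs between major versions is the string produced by repr(), which is what a list uses to display its elements. The check on np.__version__ is an illustrative addition, not something taken from the thread.

```python
import numpy as np

value = np.int64(0)

# The object is a NumPy scalar (and not a Python int) in NumPy 1.x and 2.x alike.
is_numpy_scalar = isinstance(value, np.generic)
is_python_int = isinstance(value, int)

# str() gives the bare number; repr() is what changed in NumPy 2.0,
# and repr() is what a list uses when it displays its elements.
as_str = str(value)
as_repr = repr(value)

# Work out which repr to expect from the installed NumPy major version.
numpy_major = int(np.__version__.split(".")[0])
expected_repr = "np.int64(0)" if numpy_major >= 2 else "0"

print(is_numpy_scalar, is_python_int)
print(as_str, as_repr, as_repr == expected_repr)
```

Because print() and string formatting of a single scalar go through str(), the longer form mainly shows up when a container holding such scalars is displayed.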

However, to be fair, many users were probably unaware before that their lists contained NumPy types and not Python types, which would perhaps have been a more logical design choice. If pandas had changed the return type, however, that would have been a breaking change.

> Please consider reverting this behavior or providing a simple, built-in solution.

IIRC other discussions have suggested making this breaking change in a future release: changing the return type of some operations for which returning standard Python objects would be appropriate. This seems reasonable to me.

Even though I'm sure this is a duplicate issue, I'll leave it open until I can find the other issues or until someone else points us in the right direction.

@mroeschke IIRC you did some PRs at some point related to this to fix CI?

Comment From: mroeschke

Since this issue is focusing on tolist conversion: there has been discussion that, in general, conversion to Python collections should also convert scalars to Python scalars, but pandas is inconsistent about this conversion in other APIs as well (e.g. __iter__).
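The conversion described above, where producing a Python collection also converts each element to a Python scalar, can be approximated in user code. The sketch below is one way to do it, assuming NumPy is installed. The name to_native is hypothetical and is not a pandas or NumPy API. It relies on np.generic being the base class of NumPy scalar types and on their .item() method returning the corresponding built-in Python value.

```python
import numpy as np


def to_native(values):
    """Return a list in which NumPy scalars are replaced by Python builtins.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    # np.generic is the base class of all NumPy scalar types, and .item()
    # returns the equivalent built-in Python object. Other values pass through.
    return [v.item() if isinstance(v, np.generic) else v for v in values]


mixed = [np.int64(1), np.float64(2.5), np.bool_(True), "a", 3]
native = to_native(mixed)

print(native)                              # [1, 2.5, True, 'a', 3]
print([type(v).__name__ for v in native])  # ['int', 'float', 'bool', 'str', 'int']
```

Unlike the [int(x) for x in ...] workaround quoted in the issue, this does not force every element to int, so floats, booleans, and strings keep their natural Python types.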