Feature Type

  • [X] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

Each tool has its own structure for processing multidimensional data, with the following consequences:

  • interfaces dedicated to each tool,
  • partially processed data,
  • no unified representation of data structures

Feature Description

The proposed format (see the Jupyter notebook, GitHub repository, and PyPI package) is based on the following principles:

  • neutral format usable by tabular or multidimensional tools (e.g. NumPy, pandas, xarray, scipp, astropy),
  • support for a wide variety of data types, as defined in the NTV format,
  • high interoperability: reversible (lossless round-trip) interface with tabular or multidimensional tools,
  • reversible and compact JSON format,
  • ease of sharing and exchanging multidimensional and tabular data.
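
To make the "reversible (lossless round-trip)" property concrete, here is a minimal sketch that uses pandas' existing Table Schema JSON (`orient="table"`), not the NTV format proposed here. The column names and values are made up for illustration. It only shows the property a reversible JSON interface is expected to satisfy: values and dtypes survive a write followed by a read.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data with integer, float and boolean columns.
df = pd.DataFrame(
    {"count": [1, 2, 3], "ratio": [0.5, 1.5, 2.5], "flag": [True, False, True]},
    index=pd.Index([10, 20, 30], name="id"),
)

# Serialize with the Table Schema orientation, which embeds the column types.
text = df.to_json(orient="table")

# Read the JSON text back using the embedded schema.
restored = pd.read_json(StringIO(text), orient="table")

# A reversible interface should preserve both the values and the dtypes.
same_values = restored.equals(df)
same_dtypes = list(restored.dtypes) == list(df.dtypes)
print(same_values, same_dtypes)
```

The Table Schema orientation is limited to tabular data, so it does not cover the multidimensional case that this proposal targets.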

Alternative Solutions

A first tool (NTV-pandas) exists, but it only handles the JSON interface and the analysis of tabular structures. It would therefore be extended to cover multidimensional structures.

Additional Context

https://github.com/numpy/numpy/issues/12481#issuecomment-2049179803
https://github.com/astropy/astropy/issues/16286
https://github.com/pydata/xarray/issues/8927
https://github.com/scipp/scipp/issues/3422

Comment From: jbrockmendel

Can you expand on what you're asking for here? I'm guessing you want a pd.read_this_new_thing and DataFrame.to_this_thing?

Comment From: loco-philippe

Currently, there is a first level of support:

  • in Pandas: reference in the 'NTV-pandas' ecosystem (custom accessors)
  • in Xarray: integration in the Xarray user guide

The desired integration can indeed include the IO functions: pd.read_this_new_thing and DataFrame.to_this_thing, as well as a DataFrame analysis function to identify the multidimensional structures underlying the DataFrame.

The level of integration via custom accessors may be sufficient unless you specifically want to integrate this structure analysis function (added value for Pandas) or if you want to integrate an export function into Pandas for multidimensional tools like Xarray (which does not currently exist).
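
As a rough illustration of what such a structure analysis could look like, here is a minimal pandas-only sketch. It is not the NTV-pandas implementation; the helper names (`is_derived_from`, `is_full_grid`) and the sample data are hypothetical. The idea is that a column whose values are fully determined by a single dimension behaves like a non-dimension coordinate, while a column that varies over the whole grid behaves like a variable.

```python
import pandas as pd

# Hypothetical data: a 2 x 3 grid over (x, y); 'x_label' depends only on 'x'.
df = pd.DataFrame(
    {
        "x": [1, 1, 1, 2, 2, 2],
        "y": [10, 20, 30, 10, 20, 30],
        "x_label": ["a", "a", "a", "b", "b", "b"],
        "value": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    }
)


def is_derived_from(frame, parent, child):
    """Return True if each value of `parent` maps to exactly one value of `child`."""
    return bool((frame.groupby(parent)[child].nunique() == 1).all())


def is_full_grid(frame, dims):
    """Return True if the `dims` columns form a complete Cartesian product without duplicates."""
    expected = 1
    for dim in dims:
        expected *= frame[dim].nunique()
    return len(frame) == expected and not frame.duplicated(subset=dims).any()


dims = ["x", "y"]  # assumed to be known here; a real analysis would also infer them
grid = is_full_grid(df, dims)

# Classify every remaining column by the dimensions that fully determine it.
roles = {}
for col in df.columns.difference(dims):
    parents = [d for d in dims if is_derived_from(df, d, col)]
    roles[col] = ("coordinate", parents) if parents else ("variable", parents)

print(grid, roles)
```

In this sketch the dimensions are given up front. Identifying them automatically from the relationships between columns is the part that the proposed analysis function would add.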

Comment From: jbrockmendel

or if you want to integrate an export function into Pandas for multidimensional tools like Xarray (which does not currently exist).

DataFrame.to_xarray exists. Do you mean something different?

Comment From: loco-philippe

Yes, DataFrame.to_xarray exists, but as explained in the Xarray user guide (Lossless and reversible conversion), it does not analyze the tabular structure: it does not identify a non-dimension coordinate and simply converts it to a variable.