Feature Type
- [ ] Adding new functionality to pandas
- [X] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
ExtensionScalarOpsMixin is convenient for automatically adding arithmetic and comparison operator support to custom Extension Type ExtensionArrays, using scalar type implementations.
However, element-wise operations are very inefficient, so it is often desirable to implement custom arithmetic or comparison methods on the ExtensionArray class that use vectorised approaches for much better performance.
Currently, MyExtensionArray._add_arithmetic_ops() and MyExtensionArray._add_comparison_ops() override any operator overload methods already defined on MyExtensionArray. A developer who wants a custom implementation of even one operator must therefore abandon ExtensionScalarOpsMixin entirely and define ALL arithmetic and comparison operators themselves; it is not possible to combine manually defined operators with the element-wise fallback for the others.
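The override behaviour described above can be reproduced without pandas. The sketch below is only an illustration: ToyScalarOpsMixin, MyArray and the from_mixin marker are hypothetical names, and the mixin simply mirrors the unconditional setattr pattern quoted later in this issue rather than the real pandas code.

```python
import operator


class ToyScalarOpsMixin:
    """Hypothetical stand-in for the mixin; not the pandas implementation."""

    @classmethod
    def _create_comparison_method(cls, op):
        def method(self, other):
            # Element-wise fallback: apply the scalar op to each value.
            return [op(value, other) for value in self.values]

        method.from_mixin = True  # marker used only for this demo
        return method

    @classmethod
    def _add_comparison_ops(cls):
        # Unconditional setattr: anything already on the class is replaced.
        for name, op in [("__eq__", operator.eq), ("__lt__", operator.lt)]:
            setattr(cls, name, cls._create_comparison_method(op))


class MyArray(ToyScalarOpsMixin):
    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Imagine a fast vectorised implementation here.
        return "vectorised"


MyArray._add_comparison_ops()

# The hand-written __eq__ has been replaced by the element-wise fallback.
overridden = getattr(MyArray.__eq__, "from_mixin", False)
result = MyArray([1, 2, 3]) == 2
```

After the call, MyArray.__eq__ is the generated element-wise method, so the custom implementation is never reached.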
Feature Description
I think it would be very useful for the _add_*_ops() methods of ExtensionOpsMixin to first check whether each operator overload method is already defined on the class before adding it. This would allow the developer to define particular custom operator methods on the custom ExtensionArray for enhanced performance, with a fallback to the element-wise implementation from ExtensionScalarOpsMixin for those that are not defined.
An example implementation could look like:
Original:
    @classmethod
    def _add_comparison_ops(cls) -> None:
        setattr(cls, "__eq__", cls._create_comparison_method(operator.eq))
        setattr(cls, "__ne__", cls._create_comparison_method(operator.ne))
        setattr(cls, "__lt__", cls._create_comparison_method(operator.lt))
        setattr(cls, "__gt__", cls._create_comparison_method(operator.gt))
        setattr(cls, "__le__", cls._create_comparison_method(operator.le))
        setattr(cls, "__ge__", cls._create_comparison_method(operator.ge))
New:
    @classmethod
    def _add_comparison_ops(cls) -> None:
        for name, op in [
            ("__eq__", operator.eq),
            ("__ne__", operator.ne),
            ("__lt__", operator.lt),
            ("__gt__", operator.gt),
            ("__le__", operator.le),
            ("__ge__", operator.ge),
        ]:
            # Only add the element-wise fallback if no method exists yet.
            if not hasattr(cls, name):
                setattr(cls, name, cls._create_comparison_method(op))
This check could also be controlled by an optional flag argument to the _add_*_ops() methods to maintain backwards-compatible behaviour if desired.
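One detail worth checking in any PR: in Python 3, object already defines all six rich comparison methods, so hasattr(cls, "__eq__") is True for every class and a plain hasattr test would never add anything. A possible alternative, sketched here with hypothetical names (ToyScalarOpsMixin, MyArray, COMPARISON_OPS) and no pandas dependency, is to look only at names defined directly on the class. That is an assumption about the desired semantics; it ignores overrides inherited from intermediate base classes.

```python
import operator

COMPARISON_OPS = [
    ("__eq__", operator.eq),
    ("__ne__", operator.ne),
    ("__lt__", operator.lt),
    ("__gt__", operator.gt),
    ("__le__", operator.le),
    ("__ge__", operator.ge),
]


class ToyScalarOpsMixin:
    """Hypothetical stand-in for the mixin; not the pandas implementation."""

    @classmethod
    def _create_comparison_method(cls, op):
        def method(self, other):
            # Element-wise fallback: apply the scalar op to each value.
            return [op(value, other) for value in self.values]

        return method

    @classmethod
    def _add_comparison_ops(cls):
        for name, op in COMPARISON_OPS:
            # vars(cls) holds only what the class body itself defined,
            # so methods inherited from `object` do not count.
            if name not in vars(cls):
                setattr(cls, name, cls._create_comparison_method(op))


class MyArray(ToyScalarOpsMixin):
    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Stand-in for a fast vectorised implementation.
        return "vectorised"


MyArray._add_comparison_ops()

custom_result = MyArray([1, 2, 3]) == 2    # uses the hand-written __eq__
fallback_result = MyArray([1, 2, 3]) < 2   # uses the generated fallback
```

With this check the hand-written __eq__ survives, while __lt__ and the other undefined operators still receive the element-wise fallback.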
Alternative Solutions
None that I know of
Additional Context
No response
Comment From: topper-123
I think this feature is very reasonable. We could also discuss whether the conditional approach is too implicit and whether this should be more explicit:
    @classmethod
    def _add_comparison_ops(cls, exclude=None) -> None:
        for name, op in [
            ("__eq__", operator.eq),
            ("__ne__", operator.ne),
            ("__lt__", operator.lt),
            ("__gt__", operator.gt),
            ("__le__", operator.le),
            ("__ge__", operator.ge),
        ]:
            # Skip operators the caller explicitly asked to keep.
            if exclude is not None and name in exclude:
                continue
            setattr(cls, name, cls._create_comparison_method(op))
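For comparison, here is how the explicit variant might be used. This is a sketch under assumptions: exclude is the parameter proposed in this comment, not an existing pandas argument, and ToyScalarOpsMixin and MyArray are hypothetical stand-ins rather than pandas classes.

```python
import operator


class ToyScalarOpsMixin:
    """Hypothetical stand-in for the mixin; not the pandas implementation."""

    @classmethod
    def _create_comparison_method(cls, op):
        def method(self, other):
            # Element-wise fallback: apply the scalar op to each value.
            return [op(value, other) for value in self.values]

        return method

    @classmethod
    def _add_comparison_ops(cls, exclude=None):
        skipped = set(exclude or ())
        for name, op in [
            ("__lt__", operator.lt),
            ("__gt__", operator.gt),
            ("__le__", operator.le),
            ("__ge__", operator.ge),
        ]:
            # Leave explicitly excluded operators untouched.
            if name in skipped:
                continue
            setattr(cls, name, cls._create_comparison_method(op))


class MyArray(ToyScalarOpsMixin):
    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        # Stand-in for a fast vectorised implementation.
        return "vectorised"


# The caller names the operators it implements itself.
MyArray._add_comparison_ops(exclude=["__lt__"])

kept = MyArray([1, 2, 3]) < 2        # hand-written method is preserved
generated = MyArray([1, 2, 3]) > 2   # fallback added by the mixin
```

The trade-off is that the caller must keep the exclude list in sync with the methods actually defined on the class, whereas automatic detection needs no extra bookkeeping.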
Comment From: Finndersen
I can't think of any case where someone would define their own overload methods on an ExtensionArray but then want ExtensionOpsMixin to override them. It seems pretty safe for it to just detect them itself and skip any that already exist?
Comment From: topper-123
That probably sounds reasonable also. Interested in making a PR on this? I can support both versions, and we can let other core devs think about it also.
Comment From: jbrockmendel
This seems reasonable. PR would be welcome.
Comment From: harshdesh7
take