[SIP] Proposal for Enhanced Aggregation Customization in Totals/Subtotals

Motivation

Currently, Apache Superset applies the same aggregation function across all fields when rendering row and column totals and subtotals in pivot tables and similar visualizations. This creates a limitation when users need different aggregation logic for different fields.

For instance:

  • A user may want SUM to be applied on a metric like "Total Disbursed Amount" and AVERAGE or custom % contribution on another metric like "% Booked".

  • Currently, this is not supported — once a total/subtotal aggregation function is selected, it applies uniformly to all fields in both row and column aggregations.

This constraint reduces the flexibility and usability of dashboards with multi-metric visualizations where different fields inherently require different aggregation strategies.

Proposed Change

Introduce support for field-level aggregation customization in the totals and subtotals logic. This could be achieved by:

  1. Allowing users to configure aggregation function per metric for subtotals and totals in the Explore view.

  2. Providing a UI interface to specify different aggregation functions for:

    • Each metric

    • Row totals/subtotals vs. column totals/subtotals

  3. Optionally, allowing users to write custom SQL or expressions for aggregations of totals/subtotals per metric.

Example UI Configuration Table:

Metric Name | Row Subtotal | Column Subtotal | Total -- | -- | -- | -- Total Disbursed Amt | SUM | SUM | SUM % Booked | AVERAGE | CUSTOM_SQL | AVG

This would allow power users to better control how summaries are presented for different KPIs in a single view.

New or Changed Public Interfaces

  • Changes in Explore UI to configure aggregation per metric (backend support via metadata or control panel enhancements).

  • Updates in the pivot table chart plugin to support this flexibility.

  • Changes to backend logic (viz.py, data.py, or applicable endpoint handlers) to apply selected aggregations per field for subtotal/total calculations.

  • Potential update to how query_object or post_processing is constructed to allow passing per-metric aggregation.

New Dependencies

No new dependencies are anticipated at this point. If a more advanced expression evaluation engine is required, the use of a library like NumExpr or enhancement of existing Pandas logic may be explored.

Migration Plan and Compatibility

  • Backward-compatible by default: if no custom aggregations are defined per metric, the current behavior (uniform aggregation) remains intact.

  • Optional migration for dashboards wanting the new functionality.

  • Minor metadata schema change might be needed to store aggregation preferences per metric.

Rejected Alternatives

  • Hardcoding aggregation behavior for specific metrics: Too inflexible and not user-configurable.

  • Allowing only one function at a time: Maintains the current limitation and doesn't meet the requirement for differentiated metric aggregation.

Comment From: rusackas

I think we'd all support having controls for this in the UI, and are open to proposals for how that might work from a design/UI standpoint (CC @kasiazjc for input/feedback)

It might also be worth noting that we've been trying to kill viz.py forever (which requires migrating some more visualizations away from using it), so we're happy to modifications there, but are looking to reduce/remove complexity in this area as much as possible.