Bug description

Hello,

After enabling "SERVER PAGINATION" in the "Table" chart, I noticed sorting the table will only sort the page that's currently loaded. For example: Screenshot 2024-09-10 at 6 02 03 PM We can see the first page is sorted in descending order by the date, and 2:43:53 on September 10 is the latest date shown.

However, when we go to the last page: Screenshot 2024-09-10 at 6 02 49 PM

we can see a few rows with 21:56:57 on September 10, which is obviously later than the above date. I'm assuming the issue is that only the page that's loaded is sorted (on the client), rather than being done through the database which would be ideal.

How to reproduce the bug

  1. Create a "Table" chart with a significant number of rows (enough so there are two or more pages of server side results). Ensure at least one column, which I'll refer to as test_column, contains only unique values. To ensure reproducibility, the natural order of test_column in the database shouldn't be based off its value.
  2. Sort the table by test_column
  3. Observe how test_column is only sorted properly on each respective page.

Screenshots/recordings

See bug description

Superset version

4.0.2

Python version

3.10

Node version

18 or greater

Browser

Firefox

Additional context

I've only tested this with "Raw Records" on the "Table" chart. I doubt this would differ with "Aggregate", but I'll mention that just in case.

Checklist

  • [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Comment From: dosubot[bot]

Hello, @sam-hieken! I'm here to help you with your Superset issue. I can assist with bug solving, answering questions, and becoming a contributor.

Server pagination in the Table chart in Superset version 4.0.2 only sorts the currently loaded rows instead of sorting through the database because the server_pagination feature is experimental and not fully integrated to handle sorting at the database level. The configuration for server pagination is defined as follows:

{
  name: 'server_pagination',
  config: {
    type: 'CheckboxControl',
    label: t('Server pagination'),
    description: t('Enable server side pagination of results (experimental feature)'),
    default: false,
  },
}

This indicates that server pagination is an experimental feature, and its current implementation does not support sorting through the entire database. Instead, it only sorts the rows that are currently loaded in the client. The DataTable component's implementation reflects this by filtering and sorting data on the client side based on the currently loaded rows, rather than querying the database for sorted data. The relevant code in DataTable.tsx shows that sorting is handled by the useSortBy hook from react-table, which operates on the data available in the client:

const tableHooks: PluginHook<D>[] = [
  useGlobalFilter,
  useSortBy,
  usePagination,
  useColumnOrder,
  doSticky ? useSticky : [],
  hooks || [],
].flat();

This setup confirms that sorting is performed on the client side, and the server pagination feature does not yet support database-level sorting [1][2][3].

To continue talking to Dosu, mention @dosu.

Comment From: sam-hieken

This also appears to be the same issue when using the search bar on tables with server pagination - it'll only search the currently loaded page.

Comment From: mboujaha

+1 any updates on this

Comment From: mboujaha

It seems that the DataTable component in superset is lacking some important features, it is by far the component we use the most,

  1. with server pagination, sorting and searching are only applied to the loaded items which is very unintuitive.
  2. with server pagination, the export to csv and excel requires a workaround in order the export the entire datatable, by default it only exports the loaded items.
  3. conditional formatting can only be applied to numeric fields and not strings or booleans.
  4. Can't show or hide columns by default or by the end user.
  5. Limited options for cell customization.

please feel free to add to the list.

Comment From: sam-hieken

Hey @mboujaha, I totally agree with the suggestions, though I feel they'd be better suited for a separate issue (except for numbers 1 & 2).

For number 3 and 5, I actually just found a workaround a few days ago I posted here. Not super ideal since it involves creating a calculated column, but it does work.

Comment From: paulobitar

This would be very helpful! My clients always complain about this.

Comment From: rusackas

@sadpandajoe (or anyone we can relay this to, perhaps @msyavuz ) - I think there are a whole host of issues here that happen with server pagination (particularly when the "Show X Entries" dropdown is added) around whether server pagination works at all, or sorting works as expected. I wonder if we can extend our test suite here and open/fix some issues to iron out the kinks here.

Comment From: msyavuz

@rusackas I think these issues might be already fixed by #33357 in master since i can't reproduce the bug. That pr also introduces e2e tests for exactly this here.