Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [x] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
multi_index = pd.MultiIndex.from_tuples([
    ("A", "1"),
    ("A", "2"),
    ("B", "1"),
    ("B", "2"),
    ("C", "1"),
    ("C", "2"),
    ("A", "3"),
    ("B", "3"),
    ("C", "3"),
], names=['level_1', 'level_2'])
values = [f"{l1} {l2}" for l1, l2 in multi_index]
series = pd.Series(values, index=multi_index, name='Example')
series.groupby(['level_1', 'level_2'], sort=False).sum().unstack('level_2', sort=False)
Issue Description
The code above generates this output:
| level_1 | 1 | 2 | 3 |
|:----------|:----|:----|:----|
| A | A 1 | A 2 | B 1 |
| B | B 2 | C 1 | C 2 |
| C | A 3 | B 3 | C 3 |
The values are not placed in the rows and columns that match their labels. For example, the value B 1 can only belong to row B and column 1, but here it appears in row A, column 3.
Expected Behavior
The issue disappears if any of the sort flags is set to True. Namely, replacing the last line with any of these 3 lines returns the expected output:
series.groupby(['level_1', 'level_2'], sort=True).sum().unstack('level_2', sort=False)
series.groupby(['level_1', 'level_2'], sort=True).sum().unstack('level_2', sort=True)
series.groupby(['level_1', 'level_2'], sort=False).sum().unstack('level_2', sort=True)
The issue only seems to appear when both groupby and unstack have sort=False.
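To make it easier to tell on a given pandas version whether the values land in the right cells, here is a minimal sketch of a check. The helper name `misplaced_cells` is hypothetical and not part of pandas. It relies on the fact that every value in this example is built as "<level_1> <level_2>", so each cell can be compared against its own row and column labels. The sketch runs the check on a sorted variant (`sort=True` in `groupby`, default sorting in `unstack`), which is expected to be correct.

```python
import pandas as pd


def misplaced_cells(frame):
    """Return (row, column, value) triples whose value does not match its labels.

    Hypothetical helper: assumes each value was built as "<row label> <column label>".
    """
    bad = []
    for row_label, row in frame.iterrows():
        for col_label, value in row.items():
            if pd.notna(value) and value != f"{row_label} {col_label}":
                bad.append((row_label, col_label, value))
    return bad


multi_index = pd.MultiIndex.from_tuples(
    [
        ("A", "1"), ("A", "2"), ("B", "1"),
        ("B", "2"), ("C", "1"), ("C", "2"),
        ("A", "3"), ("B", "3"), ("C", "3"),
    ],
    names=["level_1", "level_2"],
)
series = pd.Series(
    [f"{l1} {l2}" for l1, l2 in multi_index], index=multi_index, name="Example"
)

# Sorted variant: groupby sorts its keys and unstack uses its default sorting.
sorted_result = (
    series.groupby(["level_1", "level_2"], sort=True).sum().unstack("level_2")
)

# An empty list means every value sits under its own row and column label.
print(misplaced_cells(sorted_result))
```

Passing the result of the `sort=False` call from the reproducible example to the same helper would list any misplaced cells on the installed version.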
Installed Versions
Comment From: JuanseHevia
Hi, is there a fix for this? I'm encountering the same issue
Comment From: rhshadrach
Thanks for the report. I am seeing the following output on the main branch:
level_2 1 2 3
level_1
A A 1 A 2 A 3
B B 1 B 2 B 3
C C 1 C 2 C 3
You've checked the box that you still encounter the bug on the main branch. Can you confirm that you have not done this, but only checked the latest released version of pandas?
Closing for now until there is a reproducer.
Comment From: nocoding03
> Thanks for the report. I am seeing the following output on the main branch:
> level_2 1 2 3
> level_1
> A A 1 A 2 A 3
> B B 1 B 2 B 3
> C C 1 C 2 C 3
> You've checked the box that you still encounter the bug on the main branch. Can you confirm that you have not done this, but only checked the latest released version of pandas?
> Closing for now until there is a reproducer.
@rhshadrach I have verified that this bug exists on versions 2.3.3 and 2.3.2, but on the main branch pandas works correctly.