import pandas as pd

dti = pd.date_range('2016-01-01', periods=5)
dti.value_counts().index.freq # <-- None
dti.factorize()[1].freq # <-- None
mi = pd.MultiIndex.from_arrays([dti, dti])
mi.levels[0].freq # <-- None
There is a comment in test_factorize (tests.indexes.datetimes.test_datetime) suggesting that freq should be preserved by factorize, but the assertions never check freq, and the test would fail if they did:
# freq must be preserved
idx3 = date_range("2000-01", periods=4, freq="M", tz="Asia/Tokyo")
exp_arr = np.array([0, 1, 2, 3], dtype=np.intp)
arr, idx = idx3.factorize()
tm.assert_numpy_array_equal(arr, exp_arr)
tm.assert_index_equal(idx, idx3)
So the question: do we want to try to preserve freq in factorize?
xref #33677 for the MultiIndex case
Update: one more case, Categorical:
dti = pd.date_range('2016-01-01', periods=5)
cat = pd.Categorical(dti)
cat.categories.freq # <-- None
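To make the four cases above easy to compare on a given pandas version, here is a minimal sketch. The helper name `derived_freqs` is hypothetical (it is not part of pandas); it only collects the `.freq` of each derived index, and the values it reports depend on the installed version.

```python
import pandas as pd


def derived_freqs(dti):
    """Return the freq of each index derived from ``dti`` (None where it is lost).

    Hypothetical helper for illustration only; not a pandas API.
    """
    return {
        "value_counts": dti.value_counts().index.freq,
        "factorize": dti.factorize()[1].freq,
        "multiindex_level": pd.MultiIndex.from_arrays([dti, dti]).levels[0].freq,
        "categorical": pd.Categorical(dti).categories.freq,
    }


dti = pd.date_range("2016-01-01", periods=5)
for name, freq in derived_freqs(dti).items():
    # Each entry is either None (freq lost) or the original daily offset.
    print(f"{name}: {freq!r}")
```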
Comment From: mroeschke
Looks like all but the value_counts example are preserving freq now. Could use tests for those.
In [8]: dti = pd.date_range('2016-01-01', periods=5)
...:
...: dti.value_counts().index.freq
In [9]: dti.factorize()[1].freq
Out[9]: <Day>
In [10]: mi = pd.MultiIndex.from_arrays([dti, dti])
In [11]: mi.levels[0].freq
Out[11]: <Day>
In [12]: dti = pd.date_range('2016-01-01', periods=5)
...: cat = pd.Categorical(dti)
...: cat.categories.freq
Out[12]: <Day>
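As a rough sketch of what such tests might look like for two of the cases (factorize and Categorical), assuming a pandas version that behaves like the session above. The function names are illustrative, and plain `assert` is used rather than the pandas test helpers.

```python
import pandas as pd


def test_factorize_preserves_freq():
    dti = pd.date_range("2016-01-01", periods=5)
    codes, uniques = dti.factorize()
    # The uniques index should carry the same daily freq as the input.
    assert uniques.freq == dti.freq


def test_categorical_preserves_freq():
    dti = pd.date_range("2016-01-01", periods=5)
    cat = pd.Categorical(dti)
    # The categories index should carry the same daily freq as the input.
    assert cat.categories.freq == dti.freq
```

Both functions follow the pytest naming convention, so they would be collected automatically if dropped into a test module.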
Comment From: das-anubhav
Hi!
I would like to solve this issue.
Comment From: bgollop
take
Comment From: bgollop
@mroeschke, I noticed that tests were added for freq retention in DatetimeIndex factorize and MultiIndex (PR #38120). I was able to make a test case for Categorical; however, I wasn't sure if a test case for value_counts was needed yet as it looks like it is still not preserving freq. Should I include it?
Comment From: mroeschke
> I wasn't sure if a test case for value_counts was needed yet as it looks like it is still not preserving freq
Feel free to skip a test for value_counts unless you'd like to also fix the bug
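Until value_counts preserves freq, one possible user-side workaround is to re-infer it from the resulting index. This is only a sketch, not a fix for the bug: it assumes the values are regularly spaced, and it sorts the index first because frequency inference needs a monotonic index.

```python
import pandas as pd

dti = pd.date_range("2016-01-01", periods=5)

# Sort so that the index is monotonic regardless of how value_counts orders ties.
counts_index = dti.value_counts().index.sort_values()

# Ask pandas to infer the freq from the (regularly spaced) values.
restored = pd.DatetimeIndex(counts_index, freq="infer")
print(restored.freq)
```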