Research
- [x] I have searched the [pandas] tag on StackOverflow for similar questions.
- [x] I have asked my usage related question on StackOverflow.
Link to question on StackOverflow
https://stackoverflow.com/search?page=3&tab=Relevance&pagesize=30&q=pandas%20AND%20tzdata%20&searchOn=3
Question about pandas
We are on pandas 1.5.3 and are investigating some performance bottlenecks. At this point we are not sure where the problem lies, but we do see a consistent pattern.
We noticed that when tzdata==2025.2 was uninstalled, performance degraded severely (more than 10x).
After further investigation and elimination, we arrived at the following matrix:
Good perf-1: pandas==2.2.3, tzdata==2025.2
Good perf-2: pandas==1.5.3, tzdata==2025.2
Bad perf: pandas==1.5.3, no tzdata
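To make a matrix like the one above reproducible, it can help to record the installed versions in each environment before timing anything. The following is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of pandas.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    # Report the two packages that vary across the environments in the matrix.
    for name in ("pandas", "tzdata"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this in each environment and attaching the output to the timing numbers removes any doubt about which combination was actually measured.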
Any suggestions?
Is there any logic in any part of pandas that relies on tzdata?
Thanks, Sau
Comment From: sdg002
Sorry, StackOverflow will not allow me to ask any more questions.
Comment From: mroeschke
tzdata is used as an alternative timezone library compared to pytz. It's hard to answer performance implications without knowing what they are.
Note that pandas 1.5 is no longer supported and tzdata is a required dependency as of pandas 2.0.
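One way to show what the performance implications are is to profile the slow code path in both environments and compare the top entries. A minimal sketch with the standard library profiler follows; `profile_top` and the placeholder workload are hypothetical, and the real pandas workload would be passed in as the callable.

```python
import cProfile
import io
import pstats


def profile_top(fn, limit=10):
    """Run fn() under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        fn()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the expensive call chains appear first.
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder workload; replace with the code that slows down without tzdata.
    print(profile_top(lambda: sorted(range(100_000), reverse=True)))
```

Comparing the two reports side by side (with and without tzdata installed) should show whether the extra time is spent in timezone handling or somewhere unrelated.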
Comment From: sdg002
Hello @mroeschke,
Thanks for the quick reply. Based on your response, we have narrowed down our scenarios to the following:
Good - pandas 2.2.3 and tzdata
When pandas 2.2.3 is installed, tzdata gets installed too (tallies with what you have commented). This is our best case and everything works fine.
Bad - pandas 2.2.3 only
However, what surprised us is that when tzdata is uninstalled using pip uninstall, the code continues to run without any errors, but performance is 20x slower.
We were expecting an error to be thrown.
It would be very helpful if you could explain what pandas 2.2.3 does internally when the tzdata package is missing.
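The following does not answer what pandas does internally, but it may help narrow things down. Python's standard `zoneinfo` module looks for a system time zone database on `zoneinfo.TZPATH` and falls back to the `tzdata` package when no system data is available. A rough sketch that reports which of those sources exist in an environment (the function name `describe_tz_sources` is made up for illustration):

```python
import importlib.util
import os
import zoneinfo


def describe_tz_sources(key="UTC"):
    """Report which time zone data sources are visible to the standard library."""
    # Directories on the zoneinfo search path that actually exist on this machine.
    system_dirs = [path for path in zoneinfo.TZPATH if os.path.isdir(path)]
    # Whether the tzdata package from PyPI is importable in this environment.
    has_tzdata_package = importlib.util.find_spec("tzdata") is not None
    try:
        zoneinfo.ZoneInfo(key)
        resolvable = True
    except zoneinfo.ZoneInfoNotFoundError:
        # Neither the system database nor the tzdata package provided the key.
        resolvable = False
    return {
        "system_dirs": system_dirs,
        "has_tzdata_package": has_tzdata_package,
        "resolvable": resolvable,
    }


if __name__ == "__main__":
    for field, value in describe_tz_sources().items():
        print(f"{field}: {value}")
```

If `resolvable` stays true after uninstalling tzdata, the system database is still supplying zone data, which would be consistent with code continuing to run without an error.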
Thanks, Sau
Comment From: jbrockmendel
explain what pandas 2.2.3 does internally when the tzdata package is missing
Well, it should raise at import-time since there's a check for it in pandas/__init__.py
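For readers unfamiliar with the pattern, an import-time dependency check generally looks something like the sketch below. This is a generic illustration, not a copy of the pandas source; `check_required` is a hypothetical name, and the list of modules a real package checks is its own decision.

```python
import importlib


def check_required(module_names):
    """Raise ImportError naming every required module that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # Collect all failures so the user sees the full list at once.
            missing.append(f"{name}: {exc}")
    if missing:
        raise ImportError(
            "Unable to import required dependencies:\n" + "\n".join(missing)
        )


if __name__ == "__main__":
    # Standard library modules are used here so the example runs anywhere.
    check_required(["json", "os"])
    print("all required modules are importable")
```

A package that wants a missing dependency to fail loudly would call a check like this at the top of its `__init__.py`, so the error surfaces at import rather than as degraded behavior later.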
@mroeschke I see it is also listed in compat/_optional. Maybe should be removed there?
Comment From: mroeschke
Maybe should be removed there?
Yes definitely