Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from IPython.display import display, HTML
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((2,3)))
display(df) # this will send print(df) to "text/plain": [...]
df_style = df.style.set_caption("data")
display(df_style) # this will send print(df_style) to "text/plain": [...]
Issue Description
Refer to my issue in nbconvert. Basically, nbconvert uses what is in "text/plain": [...]. Displaying something with display seems to send object.__repr__ or object.__str__ there. Sadly, pd.style.Styler returns <pandas.io.formats.style.Styler at 0x...> on print. Please at least change pd.style.Styler.__repr__ and pd.style.Styler.__str__ to return human-readable, meaningful text, for example whatever pd.style.Styler.data.__repr__ and pd.style.Styler.data.__str__ return.
Expected Behavior
pd.style.Styler.__repr__ and pd.style.Styler.__str__ return human-readable, meaningful text, that is, at least something like pd.style.Styler.data.__repr__ and pd.style.Styler.data.__str__.
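To make the request concrete, here is a minimal sketch that uses toy classes rather than pandas itself. ToyStyler and ReadableStyler are hypothetical names, and a plain list stands in for the DataFrame held in .data; the only point is how delegating __repr__ and __str__ to the wrapped data changes what print and repr produce.

```python
class ToyStyler:
    """Hypothetical stand-in for a Styler: wraps data, defines no __repr__/__str__."""

    def __init__(self, data):
        self.data = data


class ReadableStyler(ToyStyler):
    """Same wrapper, but the text representations are delegated to the data."""

    def __repr__(self) -> str:
        return repr(self.data)

    def __str__(self) -> str:
        return str(self.data)


table = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # a plain list stands in for a DataFrame

default_text = repr(ToyStyler(table))        # falls back to object.__repr__
readable_text = repr(ReadableStyler(table))  # delegates to the wrapped data

print(default_text)
print(readable_text)
```

With the delegation in place, anything that records repr(obj) as plain text would get the data's own representation instead of the object's memory address.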
Installed Versions
I use Python 3.8; pd.show_versions throws an error.
Here is the pandas version: '2.0.3'
Comment From: Dhayanidhi-M
Hi @yasirroni ,
I am a new contributor, trying to understand the issue here. Could you explain exactly what text you are looking for?
Comment From: yasirroni
@Dhayanidhi-M I'm not looking for any particular text. I'm reporting a bug, or requesting a feature: pd.style.Styler should have a human-readable string representation from .__str__ or .__repr__ to be compatible with nbconvert. As far as I know, that is because the "text/plain": [...] entry in the cell metadata stores whatever .__str__ or .__repr__ returns.
Comment From: rhshadrach
Simpler reproducer:
df = pd.DataFrame(np.random.random((2,3)))
print(df)
# 0 1 2
# 0 0.107940 0.910253 0.657357
# 1 0.939727 0.032391 0.775682
df_style = df.style.set_caption("data")
print(df_style)
# <pandas.io.formats.style.Styler object at 0x7faf0fd21b10>
I think it makes sense to have the 2nd output match the first.
cc @attack68
Comment From: attack68
Thanks for cc
Just to keep this in perspective, this is not a bug, it is a feature request for something that has not been in scope previously.
Jupyter notebooks analyse the repr methods of the object on the last line of a cell in a specific order. A Styler can be represented in HTML format in a notebook because its _repr_html_ method returns a string of HTML by default.
If the pandas option styler.render.repr is "latex", then the _repr_html_ method returns None and the _repr_latex_ method returns a string of LaTeX. This is how nbconvert is configured to work for those formats, in general.
In your case, by using the print function you are forcing Python to call the __str__ or __repr__ method, which do not exist for a Styler.
The fastest way of getting results in this case would be to note that Styler.to_string() exists in a basic form, so that __str__ could be mapped to call self.to_string()
print(df.style.to_string())
# 0 1 2
# 0 0.554557 0.202806 0.479053
# 1 0.745892 0.921658 0.324225
I would ask, ultimately, what do you want to achieve by using a Styler in an nbconvert without adding any styles? Why not just call a DataFrame, since the to_string method of a Styler serves no purpose beyond the normal string representation for a DataFrame?
Comment From: yasirroni
Thanks, @attack68.
The most basic point is that we commonly do not print in a notebook, right? We often just write this in a cell,
df = pd.DataFrame(np.random.random((2,3)))
df.style.set_caption("data")
Or sometimes, store the style,
df = pd.DataFrame(np.random.random((2,3)))
styled = df.style.set_caption("data")
styled
That works just fine: a styled DataFrame appears in the output, with no need for any fancy method.
But that output vanishes in the PDF because nbconvert looks for "text/plain": [...].
So, whatever pandas chooses to use, the rationale is that running jupyter nbconvert --to pdf PATH should print at least a non-styled table. Thus passing anything useful, at least in "text/plain": [...], is expected.
If the pandas community later decides to support HTML, LaTeX, or images, that can be in future scope, I think.
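The failure mode described above can be sketched with a toy output bundle. The dictionary mimics the "data" field of one notebook cell output, and pick_output is a hypothetical helper, not nbconvert's code. The priority list is an assumption made purely for illustration (a PDF/LaTeX target that cannot embed HTML), not nbconvert's actual configuration.

```python
# Toy model of one cell output as stored in an .ipynb file: MIME type -> content.
styler_output = {
    "text/html": "<table><caption>data</caption>...</table>",
    "text/plain": "<pandas.io.formats.style.Styler at 0x...>",
}

# ASSUMPTION: an illustrative priority list for a PDF/LaTeX target that has no
# way to embed HTML. This is not copied from nbconvert's configuration.
PDF_PRIORITY = ["text/latex", "image/png", "text/plain"]


def pick_output(bundle, priority):
    """Return (mime, content) for the first MIME type in `priority` found in the bundle."""
    for mime in priority:
        if mime in bundle:
            return mime, bundle[mime]
    return None, None


mime, content = pick_output(styler_output, PDF_PRIORITY)
print(mime, content)
```

Under these assumptions the HTML table is never considered, so the only thing left to print is the uninformative plain-text repr.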
Comment From: mdengler
> pandas option styler.render.repr is "latex"
Setting styler.render.repr to html or latex causes jupyter nbconvert ... --to pdf to produce invalid LaTeX in a quick test of mine (Fedora 42, jupyterlab 4.4.3; other details can be provided):
[...]
Underfull \hbox (badness 10000) in paragraph at lines 587--588
! Undefined control sequence.
<recently read> \tableauto
[...]
Comment From: attack68
Using jupyter nbconvert for latex produced by Styler in a notebook with a pdf output was never in scope. If you would like to spec out the requirements and propose a pr, that would be great!
Comment From: mdengler
The title of this issue includes "nbconvert failure to display data". DataFrames are rendered fine by nbconvert. Stylers are not, including when styler.render.repr is set to latex. Where is the "scope" definition you are referring to?
Comment From: attack68
The "scope" is what I implemented and tested re LaTeX support for Styler. This particular conversion of LaTeX displayed in a PDF output using nbconvert was not included in that. So it is missing, as are its tests. Some comments regarding how this might be implemented are included above, as well as some helpful insights from @yasirroni. I'm afraid I do not have the time at present to work on enhancements for Styler, but I am always willing to provide review for submitted PRs.
Comment From: mdengler
> The "scope" is what I implemented and tested re LaTeX support for Styler.
If the requirements in "scope" are exactly what the implementation does, then any behavior different than the implementation, even if the implementation generates incorrect LaTeX, is "out of scope" by definition. That seems a useless definition of "scope".
To be more constructive, what implementation comments would you take a PR for?
Currently, setting styler.render.repr to latex for this minimum example (GitHub won't let me attach .ipynb files):
#!/usr/bin/env python
# coding: utf-8
# In[1]:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((2,3)))
# In[2]:
df_style = df.style.set_caption("data")
df_style
# In[3]:
pd.set_option("styler.render.repr", "latex")
# In[4]:
df_style
...results in a) rendering LaTeX when the notebook is run in the Jupyter Lab/Notebook web ui, which is undesirable; and b) extremely poor .tex / PDF output (attached) when run with jupyter nbconvert pandas_issue_55908_reproducer.ipynb --execute --no-input --to pdf ...
So the first PR I would ask whether you would take is one that changes the Styler implementation so that running nbconvert --to latex ... for this simple notebook:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((2,3)))
df
...would output the same .tex file as for this notebook:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((2,3)))
df.style  # .style is an accessor property that returns the Styler; it is not called
This is the original reporter's issue:
> So, whatever that pandas choose to use, the rationale is that running jupyter nbconvert --to pdf PATH should print at least non-styled table. Thus passing anything useful, at least "text/plain": [...], is expected.
Would you review a PR for this type of change? Is there a better PR you would suggest, knowing what you do about the current implementation?
pandas_issue_55908_reproducer.pdf
Comment From: attack68
> If the requirements in "scope" are exactly what the implementation does, then any behavior different than the implementation, even if the implementation generates incorrect LaTeX, is "out of scope" by definition. That seems a useless definition of "scope".
The implementation does not generate incorrect LaTeX.
- Styler.to_latex generates a string with correct LaTeX output. This was the primary scope of my implementation.
- Getting LaTeX output rendered in a jupyter notebook on cell execution directly works. This was a secondary scope.
- Getting LaTeX output upon using nbconvert to display correctly was never in scope for testing.
You have hijacked a single word: "scope" taken from the context of a two sentence post above and begun a pedantic argument with a volunteer, unpaid developer of free software. Fine - whatever floats your boat, man. Go for it.
This was my advice above:
> In your case, by using the print function you are forcing Python to call the __str__ or __repr__ method, which do not exist for a Styler.
>
> The fastest way of getting results in this case would be to note that Styler.to_string() exists in a basic form, so that __str__ could be mapped to call self.to_string()
If this is a viable solution, which might directly alter the above nbconverted files, then it should be a simple, and very likely uncontroversial, addition, but you may have a better solution.
I disagree with this:
> results in a) rendering LaTeX when the notebook is run in the Jupyter Lab/Notebook web ui, which is undesirable
In fact that was directly requested and is a used feature.
There may be additional options that can be explored, however.
> ...would output the same .tex file as for this notebook:
This seems sensible, perhaps that will fall under those additional options.
Comment From: mdengler
> Thanks for cc
> Just to keep this in perspective, this is not a bug, it is a feature request for something that has not been in scope previously.
This is the scope you cited, and it was used to justify not doing anything on this issue in 2023 and 2025. I'm happy to help, rather than argue about the scope: that was something you brought up.
Comment From: mdengler
> This seems sensible, perhaps that will fall under those additional options.
So you would review a PR that caused nbconvert --to latex ... of the provided example notebook to produce the same output from the cell ending in df and the cell ending in df.style? Great. Any pointers to the implementation (I read all your comments above) would help.
Comment From: mdengler
> - Getting LaTeX output rendered in a jupyter notebook on cell execution directly works. This was a secondary scope.
Please can you point me to any tests for this? I would like to test any PR changes before proposing a PR, and I'm not clear what you mean by "LaTeX output rendered in a jupyter notebook [...] directly".
Comment From: mdengler
> I'm not clear what you mean by "LaTeX output rendered in a jupyter notebook [...] directly".
I think I understand now: did you mean that pandas.set_option('styler.render.repr', 'latex') should cause the LaTeX returned from Styler.to_latex() to be displayed as the output in an evaluated notebook cell?
Comment From: mdengler
> note that Styler.to_string() exists [...], so that __str__ could be mapped to [call] self.to_string()
>
> If this is a viable solution, which might directly alter the above nbconverted files, then it should be a simple, and very likely uncontroversial, addition, but you may have a better solution.
I don't pretend to have any better solution at all -- happy to follow that path. The holdup is that it's probably obvious to you how to implement "[Styler.__str__()] could be mapped to [call] self.to_string()", but that's exactly what's not obvious to me :). I'd of course be happy to be pointed to an implementation suggestion there: the behaviors and interactions of pandas, jupyter, and nbconvert are exactly the expertise that you have that would be most valuable to an implementor.
Comment From: mdengler
> "[Styler.__str__()] could be mapped to [call] self.to_string()"
Unfortunately, jupyter nbconvert pandas_issue_55908_reproducer.ipynb --execute --no-input --to pdf does not result in any calls to Styler.__str__() (only calls to __repr__()), so the change will be a little more involved. I'll keep looking.
Attached notebook: pandas_issue_55908_reproducer.ipynb (notebook JSON not reproduced here; the only recoverable content is the "text/html" table below, which appears twice in the outputs).

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.049479 | 0.484060 | 0.955961 |
| 1 | 0.101300 | 0.290183 | 0.014655 |

Comment From: attack68
> > I'm not clear what you mean by "LaTeX output rendered in a jupyter notebook [...] directly".
>
> I think I understand now: did you mean that `pandas.set_option('styler.render.repr', 'latex')` should cause the LaTeX returned from `Styler.to_latex()` to be displayed as the output in an evaluated notebook cell?
Correct. See https://github.com/pandas-dev/pandas/blob/main/pandas/tests/io/formats/style/test_to_latex.py `def test_repr_option(styler):`
The key implementation is likely these lines in `style.py`:
```python
def _repr_html_(self) -> str | None:
    """
    Hooks into Jupyter notebook rich display system, which calls _repr_html_ by
    default if an object is returned at the end of a cell.
    """
    if get_option("styler.render.repr") == "html":
        return self.to_html()
    return None

def _repr_latex_(self) -> str | None:
    if get_option("styler.render.repr") == "latex":
        return self.to_latex()
    return None
```
I don't remember the order in which the Notebook calls the repr methods. These return None because, if None is fetched, the Notebook passes and moves to the next method in order of precedence. This defines what renders in the Notebook. I also don't know which repr methods are required by nbconvert.
One suggestion, to return all of the required repr methods in a repr dict, was looked at and dismissed in the past due to the potential for performance issues with large displays. That is why the option is used to return a result in the specific case that is desired.
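As a rough illustration of the option-gated pattern described above, the sketch below models it with toy code. OPTIONS, ToyStyler, RICH_METHODS and build_bundle are all hypothetical, and the collection loop is a deliberate simplification rather than the Notebook's actual formatter: each rich repr method is asked in turn, a None result is skipped, and the plain repr is assumed to always be recorded as "text/plain".

```python
OPTIONS = {"styler.render.repr": "html"}  # hypothetical stand-in for the pandas option


class ToyStyler:
    """Toy object whose rich reprs are switched on and off by an option."""

    def _repr_html_(self):
        if OPTIONS["styler.render.repr"] == "html":
            return "<table>...</table>"
        return None  # tells the caller to try another format

    def _repr_latex_(self):
        if OPTIONS["styler.render.repr"] == "latex":
            return "\\begin{tabular}...\\end{tabular}"
        return None


# ASSUMPTION: a simplified mapping of MIME types to the methods that produce them.
RICH_METHODS = {"text/html": "_repr_html_", "text/latex": "_repr_latex_"}


def build_bundle(obj):
    """Collect every non-None rich repr, plus the plain repr as text/plain."""
    bundle = {"text/plain": repr(obj)}
    for mime, name in RICH_METHODS.items():
        method = getattr(obj, name, None)
        result = method() if method is not None else None
        if result is not None:
            bundle[mime] = result
    return bundle


html_bundle = build_bundle(ToyStyler())
OPTIONS["styler.render.repr"] = "latex"
latex_bundle = build_bundle(ToyStyler())
print(sorted(html_bundle), sorted(latex_bundle))
```

In this model only one rich format is ever rendered per display, which is the performance point made above, while the "text/plain" entry stays at the object default because no __repr__ is defined.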
Comment From: attack68
To code __str__ for Styler is probably just the following:
class Styler:
    def __str__(self) -> str:
        return self.to_string()
__repr__ is probably more complicated because of how it fits into the Jupyter notebook render chain, and what it might break (although maybe nothing).
Comment From: mdengler
> To code __str__ for Styler is probably just the following:
>
>     class Styler:
>         def __str__(self) -> str:
>             return self.to_string()
>
> __repr__ is probably more complicated because of how it fits into the Jupyter notebook render chain, and what it might break (although maybe nothing).
Yes, I tried that __str__ implementation already, but neither jupyter nbconvert nor, of course, the web notebook UI calls __str__; both call just __repr__.
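That observation is consistent with how Python resolves the two methods. A minimal sketch with hypothetical toy classes (OnlyStr and OnlyRepr are not pandas code): defining only __str__ changes what print shows but leaves repr() on the object default, whereas defining only __repr__ covers both, because object.__str__ falls back to __repr__.

```python
class OnlyStr:
    """Defines __str__ only, like the one-method patch tried above."""

    def __str__(self) -> str:
        return "table text"


class OnlyRepr:
    """Defines __repr__ only."""

    def __repr__(self) -> str:
        return "table text"


only_str, only_repr = OnlyStr(), OnlyRepr()

print(str(only_str))    # uses __str__
print(repr(only_str))   # still the default "<... object at 0x...>" form
print(str(only_repr))   # object.__str__ falls back to __repr__
print(repr(only_repr))  # uses __repr__
```

So a caller that only ever asks for repr(obj) will not see a __str__-only change.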
> See https://github.com/pandas-dev/pandas/blob/main/pandas/tests/io/formats/style/test_to_latex.py def test_repr_option(styler):
Thanks -- useful.
> I don't remember the order of what repr methods the Notebook calls. The reason these return None is if None is fetched it passes and moves to the next in order of precedence. This defines what renders in the Notebook.
This is very useful, thanks. If I should have been able to find this in some doc somewhere and you happen to come across it, please let me know so I can avoid future redundant questions.
> I also don't know which repr methods are required by nbconvert.
I will look into that. That indeed is the key gap in my knowledge, too.
> One suggestion to return all of the required repr methods to a repr dict was looked at and dismissed in the past due to the potential of performance issues for large displays. That is why the option is used to return a result in the specific case that is desired.
Thanks for the background.