Feature Type
-
[x] Adding new functionality to pandas
-
[ ] Changing existing functionality in pandas
-
[ ] Removing existing functionality in pandas
Problem Description
When reading Excel files, pandas ignores Excel's "Text" cell formatting and converts text-formatted numbers (e.g., IDs, codes) to numeric types (int/float). This requires manual conversion back to strings, which can be inefficient , for a huge dataset and prone to errors.
Feature Description
Add an option in pd.read_excel() to respect Excel's cell formatting (e.g., dtype_from_format=True), or set it to true by default , preserving text-formatted columns as strings.
Alternative Solutions
OpenPyXL/Xlrd Engine + Format Detection Read cell formats directly (requires manual parsing):
from openpyxl import load_workbook
wb = load_workbook("data.xlsx", data_only=False) sheet = wb.active text_columns = [col for col in sheet.columns if sheet.cell(row=1, column=col[0].column).number_format == "@"]
Additional Context
No response