Feature Type

  • [x] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

Hi! I ran into an issue with images in WPS spreadsheets. WPS allows images to be embedded directly into cells, using a formula such as =DISPIMG("ID5BA4F81A0D674C7AA8849A79AC5645C8", 1).

Therefore, these images cannot be accessed through worksheets._images.
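
Since the cell holds only a formula string, the image ID has to be parsed out of it before it can be matched to a file. A minimal sketch, assuming the formula always has the shape shown above (a quoted ID followed by an integer); the helper name `dispimg_id` is hypothetical and not part of pandas or WPS:

```python
import re

# Assumed formula shape: =DISPIMG("<image id>", <integer>), optional whitespace allowed.
_DISPIMG_RE = re.compile(
    r'^=\s*DISPIMG\s*\(\s*"([^"]+)"\s*,\s*\d+\s*\)\s*$',
    re.IGNORECASE,
)


def dispimg_id(formula):
    """Return the image ID from a DISPIMG formula, or None if it does not match."""
    if not isinstance(formula, str):
        return None
    match = _DISPIMG_RE.match(formula.strip())
    return match.group(1) if match else None
```

Any cell value that is not a DISPIMG formula (numbers, other formulas, plain text) yields None, so the helper can be applied to a whole column.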

If we unzip the Excel file, we can find all the images under xl/media, and the image indexes are in xl/_rels/cellimages.xml.rels and xl/cellimages.xml.
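
To check whether a given workbook contains these parts, the archive can be inspected without extracting it. A small sketch using only the standard library; the function name `list_wps_image_parts` is hypothetical:

```python
import zipfile


def list_wps_image_parts(xlsx_path):
    """Return the archive members related to WPS cell images, sorted by name."""
    with zipfile.ZipFile(xlsx_path) as zf:
        names = zf.namelist()
    # Prefixes taken from the layout described in this issue.
    wanted = ("xl/media/", "xl/cellimages.xml", "xl/_rels/cellimages.xml.rels")
    return sorted(name for name in names if name.startswith(wanted))
```

An empty result suggests the workbook has no WPS in-cell images.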

This appears to be unique to WPS; at least I haven't found it in Office.

I found a similar implementation

Feature Description

This is my code. It unzips the Excel file, reads the relationship and cell-image XML files, and returns a mapping from image ID to extracted file path.

import os
import xml.etree.ElementTree as ET
import zipfile


def wps_embed_images(file_path, save_path) -> dict:
    """Extract the workbook to save_path and map each image ID to its file path."""
    img_map = {}

    # An .xlsx file is a zip archive; extract it so the media files are on disk.
    with zipfile.ZipFile(file_path, "r") as zip_ref:
        zip_ref.extractall(save_path)

    # Map relationship IDs (e.g. "rId1") to the extracted image paths.
    id2target = {}
    rels = os.path.join(save_path, "xl", "_rels", "cellimages.xml.rels")
    tree = ET.parse(rels)
    root = tree.getroot()
    for child in root:
        id2target[child.attrib.get("Id")] = os.path.join(save_path, "xl", child.attrib.get("Target"))

    namespaces = {
        'etc': 'http://www.wps.cn/officeDocument/2017/etCustomData',
        'xdr': 'http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing',
        'a': 'http://schemas.openxmlformats.org/drawingml/2006/main',
        'r': 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
    }

    # Each etc:cellImage entry carries the image ID (cNvPr name) and a relationship ID (blip r:embed).
    cellimages = os.path.join(save_path, "xl", "cellimages.xml")
    tree = ET.parse(cellimages)
    root = tree.getroot()
    for cell_image in root.findall('etc:cellImage', namespaces):
        c_nv_pr = cell_image.find('.//xdr:cNvPr', namespaces)
        image_name = c_nv_pr.get('name') if c_nv_pr is not None else None

        blip = cell_image.find('.//a:blip', namespaces)
        embed_id = blip.get(f'{{{namespaces["r"]}}}embed') if blip is not None else None

        # Skip entries whose relationship ID has no matching target.
        if image_name and embed_id in id2target:
            img_map[image_name] = id2target[embed_id]

    return img_map
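
A possible variation on the function above, for cases where extracting the whole archive to disk is undesirable: read the two XML parts and the image bytes directly from the zip. This is only a sketch under the same assumptions about the WPS file layout; the name `wps_embed_images_in_memory` is hypothetical, and it returns image bytes rather than file paths.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

NS = {
    "etc": "http://www.wps.cn/officeDocument/2017/etCustomData",
    "xdr": "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
}
RELS_PART = "xl/_rels/cellimages.xml.rels"
IMAGES_PART = "xl/cellimages.xml"


def wps_embed_images_in_memory(file_path):
    """Return {image ID: image bytes}; empty if the workbook has no WPS cell images."""
    img_map = {}
    with zipfile.ZipFile(file_path) as zf:
        names = set(zf.namelist())
        if RELS_PART not in names or IMAGES_PART not in names:
            return img_map

        # Relationship ID -> archive member name (zip members always use "/").
        rels_root = ET.fromstring(zf.read(RELS_PART))
        id2target = {
            rel.get("Id"): posixpath.normpath(posixpath.join("xl", rel.get("Target")))
            for rel in rels_root
            if rel.get("Target")
        }

        root = ET.fromstring(zf.read(IMAGES_PART))
        for cell_image in root.findall("etc:cellImage", NS):
            c_nv_pr = cell_image.find(".//xdr:cNvPr", NS)
            blip = cell_image.find(".//a:blip", NS)
            if c_nv_pr is None or blip is None:
                continue
            image_name = c_nv_pr.get("name")
            target = id2target.get(blip.get(f"{{{NS['r']}}}embed"))
            # Only keep entries whose target file is actually in the archive.
            if image_name and target in names:
                img_map[image_name] = zf.read(target)
    return img_map
```

Returning bytes keeps the helper free of temporary directories; a caller that needs files on disk can still write the values out.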

Alternative Solutions

Pandas stays as it is and I continue using the workaround shown above.

Additional Context

No response

Comment From: DURUII

I’m all for adding WPS image‑in‑cell support to pandas.

Comment From: jbrockmendel

I’m not clear on what you’re asking for in pandas. A new method?