A follow up on the issue of Figure caption #359

Open
DBGVA opened this issue Feb 2, 2017 · 24 comments
Comments

@DBGVA

DBGVA commented Feb 2, 2017

Thanks to the latest developments presented in #137 it is now possible to insert an automatic figure or table caption using the following code.

def Figure(paragraph):
    run = paragraph.add_run()
    r = run._r
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'begin')
    r.append(fldChar)
    instrText = OxmlElement('w:instrText')
    instrText.text = ' SEQ Figure * ARABIC'
    r.append(instrText)
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'end')
    r.append(fldChar)

def Table(paragraph):
    run = paragraph.add_run()
    r = run._r
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'begin')
    r.append(fldChar)
    instrText = OxmlElement('w:instrText')
    instrText.text = ' SEQ Table * ARABIC'
    r.append(instrText)
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'end')
    r.append(fldChar)

-----------in the main document-------

paragraph = document.add_paragraph('Figure ', style='Caption')
Figure(paragraph)

The figure or table number might not appear immediately; it shows up after the fields are updated.
The field code itself can be inspected using "toggle field view".

Thanks to everybody

@Anne1029384756

Hello,

I am relatively new to Python and python-docx.
I copied the exact code above to my python script in order to add captions to the figures in my document. However, I received the following error:

in Figure
fldChar = OxmlElement('w:fldChar')
NameError: name 'OxmlElement' is not defined

Can anyone clarify this issue for me? Thanks so much!

@scanny
Contributor

scanny commented Feb 27, 2018

@Anne1029384756 I believe if you add from pptx.oxml import OxmlElement above your code that will do the trick.

@Anne1029384756

Thank you for your reply! I installed python-pptx using pip install and imported the element as described, but I still get an error. Any thoughts on this one?

from pptx.oxml import OxmlElement
ImportError: cannot import name 'OxmlElement'

@DBGVA
Author

DBGVA commented Feb 27, 2018 via email

@scanny
Contributor

scanny commented Feb 27, 2018

Sorry, no need to install python-pptx. Import should be from docx.oxml import OxmlElement

@Anne1029384756

@DBGVA: I am unable to find/see your example code. Where did you put it?

@scanny: after successfully importing OxmlElement from docx.oxml, I now get a different error. I tried importing qn in a similar manner, but that does not work.

fldChar.set(qn('w:fldCharType'), 'begin')
NameError: name 'qn' is not defined

@DBGVA
Author

DBGVA commented Feb 28, 2018 via email

@Anne1029384756

Anne1029384756 commented Feb 28, 2018 via email

@DBGVA
Author

DBGVA commented Feb 28, 2018 via email

@Anne1029384756

@DBGVA Thanks! I have both code versions working at the moment!

@scanny Importing both of the following fixed the initial code:
from docx.oxml import OxmlElement
from docx.oxml.ns import qn

The next step for me is fixing references to specific figures/tables (I am used to working with LaTeX). I will post my question in another thread, but if you also have a solution for that, please let me know!

@atulapra

With this code, the caption appears above the image.
Is there any way to make the caption appear below the image?

@Anne1029384756

Change the order and add the picture before adding the caption part:

document.add_picture()

paragraph = document.add_paragraph('Figure ', style='Caption')
Figure(paragraph)
paragraph.add_run(' Figure Caption ')

@wheeled

wheeled commented Jul 30, 2018

Just a note for those who fell into the same trap:
there should be a backslash ahead of the asterisk in the line
instrText.text = ' SEQ Table * ARABIC'

While trying to figure out why I was getting "bookmark not found" errors, I also discovered a kluge that works if you open the document in MS Word:

def Figure(p):
    fldChar = OxmlElement('w:fldSimple')
    fldChar.set(qn('w:instr'), ' SEQ Figure \* ARABIC ')
    p._p.append(fldChar)

That produces incomplete xml which is then fixed up by Word.

<w:p>
  <w:pPr>
    <w:pStyle w:val="Caption"/>
  </w:pPr>
  <w:r>
    <w:t xml:space="preserve">Figure </w:t>
  </w:r>
  <w:fldSimple w:instr=" SEQ Figure \* ARABIC "/>
  <w:r>
    <w:t>: Correlation Group response</w:t>
  </w:r>
</w:p>

I'm sure this is a fully unsupported approach, though.

@rvcristiand

Replying to @wheeled's note above about the backslash: in a normal Python string literal, write two backslashes rather than one, so that a single literal backslash ends up in the field instruction:

instrText.text = ' SEQ Table \\* ARABIC'
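
To make the escaping point concrete, here is a small sketch that only compares Python string literals (it does not touch python-docx). The variable names are made up for the illustration. It shows that the doubled backslash and a raw string produce the same text, while the spelling from the first post carries no backslash at all.

```python
# Three spellings of the same field instruction in Python source.
escaped = ' SEQ Table \\* ARABIC'   # doubled backslash in a normal string literal
raw = r' SEQ Table \* ARABIC'       # raw string literal, backslash kept as typed
missing = ' SEQ Table * ARABIC'     # the form posted at the top of the thread

print(escaped == raw)       # both hold exactly one backslash character
print(escaped == missing)   # the third spelling has no backslash
```

Either of the first two spellings can be assigned to instrText.text; which one to use is a matter of taste.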

@DinoMetu

Dear all,

1) Is it possible to restart figure numbering whenever a new heading is added?

document.add_heading('A heading', 0)
'1' - Figure name 1
'2' - Figure name 2
document.add_heading('B heading', 0)
'1' - Figure name 1
'2' - Figure name 2

2) Or is it possible to display figure numbering as below (even better)?

document.add_heading('A heading', 0)
'A-1' - Figure name 1
'A-2' - Figure name 2
document.add_heading('B heading', 0)
'B-1' - Figure name 1
'B-2' - Figure name 2

I am using the following Figure definition (as per above in the thread):

def Figure(paragraph):
    run = paragraph.add_run()
    r = run._r
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'begin')
    r.append(fldChar)
    instrText = OxmlElement('w:instrText')
    instrText.text = ' SEQ Figure \* ARABIC'
    r.append(instrText)
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'end')
    r.append(fldChar)

Thanks a lot
Dino
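
For the first variant (restarting the count at each heading), Word's SEQ field has a \s switch that resets the sequence at a given heading level. The sketch below only builds the instruction string; seq_instruction and its parameter name are hypothetical, not part of python-docx. The result would be assigned to instrText.text in the Figure function above. It assumes the headings use the built-in Heading 1 style, i.e. add_heading(..., 1); level 0, as in the example, produces the Title style instead.

```python
def seq_instruction(label, restart_at_heading_level=None):
    """Build the instruction text for a Word SEQ field (hypothetical helper)."""
    instr = f" SEQ {label} \\* ARABIC"
    if restart_at_heading_level is not None:
        # \s N asks Word to reset the sequence at each heading of level N
        instr += f" \\s {restart_at_heading_level}"
    return instr + " "


print(seq_instruction("Figure"))
print(seq_instruction("Figure", restart_at_heading_level=1))
```

As with the other fields in this thread, the numbers only change after the fields are updated in Word.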

@hammoudma

Is there a way to completely disable numbering?

@buhtz

buhtz commented Dec 2, 2022

I did some refactoring and fixing on the code you all posted here. I'm not sure exactly what happens under the hood, but it seems to work.
The code below produces a Word document in which the caption number is missing at first.

To make the number appear, select the whole document (Cmd+A, or Ctrl+A on Windows) and then press F9 to update all fields.

I'm currently looking for a way to make the numbering appear right away. I'm not sure how Word does that.

The code

That code should work for tables. It is also prepared to work for figures, too. But I never used figures with docx so I don't know how to create them. There is also the MarkIndexEntry() function. I don't know what it does or for what I need it.

import pathlib
import subprocess
import docx
from docx import Document
from docx.shared import Cm
from docx.oxml import OxmlElement
from docx.oxml.ns import qn


def MarkIndexEntry(entry,paragraph):
    run = paragraph.add_run()
    r = run._r
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'begin')
    r.append(fldChar)

    run = paragraph.add_run()
    r = run._r
    instrText = OxmlElement('w:instrText')
    instrText.set(qn('xml:space'), 'preserve')
    instrText.text = ' XE "%s" '%(entry)
    r.append(instrText)

    run = paragraph.add_run()
    r = run._r
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'end')
    r.append(fldChar)


def add_caption(document, tab_or_figure, caption):
    target = {
        docx.table.Table: 'Table',
        # docx.figure.Figure: 'Figure'
    }[type(tab_or_figure)]

    # caption type
    paragraph = document.add_paragraph(f'{target} ', style='Caption')

    # numbering field
    run = paragraph.add_run()

    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'begin')
    run._r.append(fldChar)

    instrText = OxmlElement('w:instrText')
    instrText.text = f' SEQ {target} \\* ARABIC'
    run._r.append(instrText)

    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'end')
    run._r.append(fldChar)

    # caption text
    paragraph.add_run(f' {caption}')


if __name__ == '__main__':
    document = Document()

    tab = document.add_table(3, 3)
    add_caption(document, tab, 'Description of the sample')

    fp = pathlib.Path('w.docx')
    document.save(str(fp))  # python3-docx can't handle Path
    subprocess.run(['CMD.EXE', '/C', 'start', fp])
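
On the open question of making the number appear without a manual update: one possibility, offered here as an assumption rather than something confirmed in this thread, is to store a cached result inside the field. A complex field can carry a "separate" fldChar followed by a result run before the "end" fldChar, and Word displays that stored result until the field is next updated. The sketch below uses only the standard library to show the XML shape. CachedSeqFields is a hypothetical helper; with python-docx the same elements would be created with OxmlElement and qn as in the code above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the namespaced (Clark notation) name for a w: tag."""
    return f"{{{W_NS}}}{tag}"


class CachedSeqFields:
    """Builds SEQ fields whose result is pre-computed in Python (hypothetical helper)."""

    def __init__(self):
        self._count = defaultdict(int)  # one counter per label, e.g. "Table"

    def _run(self, child):
        r = ET.Element(w("r"))
        r.append(child)
        return r

    def _fld_char(self, kind):
        return self._run(ET.Element(w("fldChar"), {w("fldCharType"): kind}))

    def build(self, label):
        """Return the five w:r elements for the next field of this label."""
        self._count[label] += 1
        instr = ET.Element(w("instrText"), {f"{{{XML_NS}}}space": "preserve"})
        instr.text = f" SEQ {label} \\* ARABIC "
        result = ET.Element(w("t"))
        result.text = str(self._count[label])
        return [
            self._fld_char("begin"),
            self._run(instr),
            self._fld_char("separate"),
            self._run(result),  # cached number, replaced when fields are updated
            self._fld_char("end"),
        ]


fields = CachedSeqFields()
for run in fields.build("Table"):
    print(ET.tostring(run, encoding="unicode"))
```

The pre-computed numbers are only right if captions are created in document order and no restart switches are used; after an update in Word the field recalculates on its own.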

@buhtz

buhtz commented Mar 1, 2023

Please let me ask about an additional option, including a possible solution. Let me know whether my approach makes sense to you, given your knowledge of the docx internals.

I would like to change the position of a table/figure caption from below the object to above it.

Is it as easy as creating the caption paragraph before the table object? Or do we need some more "magic"?

@keckler

keckler commented Apr 9, 2024

This is a feature request that has received a lot of interest, and a mostly-working example of how to accomplish this has been posted in the comments above. Is there a reason why this cannot be incorporated into the codebase?

@buhtz

buhtz commented Apr 10, 2024

The project is in maintenance mode. This means no new features. The maintainer only fixes bugs and answers questions from time to time. I am sure he will appreciate any help he can get.

@keckler

keckler commented Apr 10, 2024

Huh.

Well, if anybody wants to use caption numbering like Figure X-Y where X are the section numbers, you can do it with something like this (adapted from @buhtz 's code above):

def addCaption(doc, captionType, captionText):
    """
    Use this to insert a caption with dynamic numbering into the document.

    When you open the word doc, you must do ctrl+A -> F9 in order for the dynamic
    numbers to actually appear. Numbering of tables and figures will be independent.

    Parameters
    ----------
    doc : docx.Document
    captionType : str
        The type of caption. Either "Figure" or "Table".
    captionText : str
        The description that is the bulk of the caption.
    """
    paragraph = doc.add_paragraph(f"{captionType} ", style="Caption")
    paragraph.alignment = docx.enum.text.WD_ALIGN_PARAGRAPH.CENTER

    # add in the part of the numbering corresponding to the section
    # i.e. X in Figure X-Y
    run = paragraph.add_run()
    fldChar = docx.oxml.OxmlElement("w:fldChar")
    fldChar.set(docx.oxml.ns.qn("w:fldCharType"), "begin")
    run._r.append(fldChar)
    instrText = docx.oxml.OxmlElement("w:instrText")
    instrText.text = fr" STYLEREF 1 \s "
    run._r.append(instrText)
    fldChar = docx.oxml.OxmlElement("w:fldChar")
    fldChar.set(docx.oxml.ns.qn("w:fldCharType"), "end")
    run._r.append(fldChar)

    paragraph.add_run("-")

    # add the part of the numbering corresponding to the index within the section
    # i.e. Y in Figure X-Y
    run = paragraph.add_run()
    fldChar = docx.oxml.OxmlElement("w:fldChar")
    fldChar.set(docx.oxml.ns.qn("w:fldCharType"), "begin")
    run._r.append(fldChar)
    instrText = docx.oxml.OxmlElement("w:instrText")
    instrText.text = fr" SEQ {captionType} \* ARABIC \s 1 "
    run._r.append(instrText)
    fldChar = docx.oxml.OxmlElement("w:fldChar")
    fldChar.set(docx.oxml.ns.qn("w:fldCharType"), "end")
    run._r.append(fldChar)

    paragraph.add_run(f": {captionText}")

@sailist

sailist commented Nov 19, 2024

import pathlib
import subprocess
import docx
from docx import Document
from docx.shared import Cm
from docx.oxml import OxmlElement
from docx.oxml.ns import qn


def pre_ref(run, cross_refs: list):
    for cref in cross_refs:
        bookmarkStart = OxmlElement("w:bookmarkStart")
        bookmarkStart.set(qn("w:id"), f"{cref.attrs['w:id']}")
        bookmarkStart.set(qn("w:name"), cref.attrs["w:name"])
        run._r.append(bookmarkStart)


def post_ref(run, cross_refs: list):
    for cref in cross_refs:
        bookmarkEnd = OxmlElement("w:bookmarkEnd")
        bookmarkEnd.set(qn("w:id"), f"{cref.attrs['w:id']}")
        run._r.append(bookmarkEnd)


def add_label(paragraph, label_type, refname: str, cross_refs: list, prefix=""):
    paragraph.add_run("", style="Caption")

    # numbering field
    run = paragraph.add_run(" ")
    pre_ref(run, cross_refs)

    if len(prefix) > 0:
        run.add_text(f"{prefix} ")
    fldChar = OxmlElement("w:fldChar")
    fldChar.set(qn("w:fldCharType"), "begin")
    run._r.append(fldChar)

    instrText = OxmlElement("w:instrText")
    instrText.text = f" SEQ {label_type} \\* ARABIC"
    run._r.append(instrText)

    fldChar = OxmlElement("w:fldChar")
    fldChar.set(qn("w:fldCharType"), "end")

    run._r.append(fldChar)
    run.add_text(". ")
    post_ref(run, cross_refs)


@keckler

keckler commented Nov 19, 2024

Hi @sailist , do you care to add some description of what your code does?

@sailist

sailist commented Dec 13, 2024

Here is an example you can refer to:

from docx import Document
from docx.oxml import OxmlElement
from docx.oxml.ns import qn


def pre_ref(run, cross_refs: list):
    for cref in cross_refs:
        bookmarkStart = OxmlElement("w:bookmarkStart")
        bookmarkStart.set(qn("w:id"), f"{cref['attrs']['w:id']}")
        bookmarkStart.set(qn("w:name"), cref["attrs"]["w:name"])
        run._r.append(bookmarkStart)


def post_ref(run, cross_refs: list):
    for cref in cross_refs:
        bookmarkEnd = OxmlElement("w:bookmarkEnd")
        bookmarkEnd.set(qn("w:id"), f"{cref['attrs']['w:id']}")
        run._r.append(bookmarkEnd)


def add_label(paragraph, label_type, refname: str, cross_refs: list, prefix=""):
    paragraph.add_run("")

    # numbering field
    run = paragraph.add_run(" ")
    pre_ref(run, cross_refs)

    if len(prefix) > 0:
        run.add_text(f"{prefix} ")
    fldChar = OxmlElement("w:fldChar")
    fldChar.set(qn("w:fldCharType"), "begin")
    run._r.append(fldChar)

    instrText = OxmlElement("w:instrText")
    instrText.text = f" SEQ {label_type} \\* ARABIC"
    run._r.append(instrText)

    fldChar = OxmlElement("w:fldChar")
    fldChar.set(qn("w:fldCharType"), "end")

    run._r.append(fldChar)
    run.add_text(". ")
    post_ref(run, cross_refs)


def add_ref_place(paragraph, token):
    # caption type
    run = paragraph.add_run("")

    fldChar = OxmlElement("w:fldChar")
    fldChar.set(qn("w:fldCharType"), "begin")
    run._r.append(fldChar)

    ref_name = token["attrs"]["w:name"]
    instrText = OxmlElement("w:instrText")
    instrText.text = f" REF {ref_name} \\h"
    run._r.append(instrText)

    fldChar = OxmlElement("w:fldChar")
    fldChar.set(qn("w:fldCharType"), "end")

    run._r.append(fldChar)


doc = Document()

# pre-analysis before generating the docx, to collect all cross_refs
cross_refs = [{"attrs": {"w:id": "1", "w:name": "bookmark1"}}]

# this makes it possible to add a reference before the position of its label
paragraph = doc.add_paragraph()
add_ref_place(paragraph, cross_refs[0])


paragraph = doc.add_paragraph()
add_label(
    paragraph, label_type="Figure", refname="fig1", cross_refs=cross_refs, prefix="Fig"
)


for i in range(10):
    paragraph = doc.add_paragraph(f"i: {i}")

# add reference after label position
paragraph = doc.add_paragraph()
add_ref_place(paragraph, cross_refs[0])

doc.save("output.docx")
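
The example above hard-codes a single cross_refs entry. As a sketch of what the "pre-analysis" step might look like for several labels, the hypothetical helper below assigns sequential ids and derives bookmark names from label names. Reducing names to letters, digits and underscores, and prefixing names that do not start with a letter, is an assumption about what Word accepts as a bookmark name, not something stated in this thread.

```python
import re


def make_cross_refs(refnames):
    """Map label names to bookmark descriptors shaped like the entries above.

    Hypothetical helper: ids are assigned in order, starting at 1.
    """
    cross_refs = {}
    for index, refname in enumerate(refnames, start=1):
        # assumption: keep only letters, digits and underscores in the name
        safe = re.sub(r"\W", "_", refname)
        if not safe[:1].isalpha():
            safe = f"bm_{safe}"
        cross_refs[refname] = {"attrs": {"w:id": str(index), "w:name": safe}}
    return cross_refs


refs = make_cross_refs(["fig1", "tab:results"])
print(refs["tab:results"])
```

With this in place, a label would be written with add_label(paragraph, "Figure", "fig1", [refs["fig1"]]) and referenced with add_ref_place(paragraph, refs["fig1"]).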
