Internal links not working after processing weasyprint generated html to pdf file #2369

Open
prabhakaran8737 opened this issue Feb 3, 2025 · 11 comments

Comments

@prabhakaran8737

prabhakaran8737 commented Feb 3, 2025

I have generated a PDF file with a table of contents and other internal links in this format:

<div class="page headerconf" id="toc" style="page-break-before:avoid;page-break-after:avoid;margin-top:-30px;">
    <div>
        <h1>Table of Contents</h1>
        <ol>
            <li>
                <a href="#project">
                    <span>Project Details</span>
                </a>
            </li>
            <li>
                <a href="#others">
                    <span>Others </span>
                </a>
            </li>
...
...
...
        </ol>
    </div>
</div>

And this is how my content is structured:

<div class="main" id="contents">
    <!DOCTYPE html>
    <html>
        <head lang="en">
            <meta charset="UTF-8">
            <title></title>
        </head>
        <body>
            <a name="project" />
            <div id="project" style="width:100%;page-break-before : always;page-break-after : always;font-size:13px;margin-top:-30px;">
                <h3 class="heading">Project Details</h3>
                <br>
                <div>
                </div>
            </div>
        </body>
    </html>
</div>

The PDF is generated correctly and the links work without problems up to this point. After the PDF is generated, we send the file to Java iText, which downloads all attachments (images, PDFs) and merges everything into a single PDF. After this step all the internal links stop working, although the links to the downloaded and attached files still work.

Before WeasyPrint, we used Docraptor to convert the same HTML content to PDF, and its output works fine with Java iText.

I strongly believe Docraptor adds some HTML or CSS that makes its output compatible. Can anyone share their thoughts on this? Thanks.
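
For anyone reproducing this, it can help to first confirm that the source markup is self-consistent, so that the remaining difference can only come from the PDF processing. The sketch below uses only the Python standard library and checks that every fragment link (`href="#..."`) has a matching `id` or `<a name>` target. The names `AnchorAudit` and `missing_targets` are made up for this illustration, and the sample string is a shortened version of the markup above, not the full document.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collect fragment links (href="#x") and possible targets (id / a name)."""

    def __init__(self):
        super().__init__()
        self.fragments = set()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        # A fragment link looks like href="#project".
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.fragments.add(href[1:])
        # Any element with an id can be a link target.
        if attributes.get("id"):
            self.targets.add(attributes["id"])
        # Legacy <a name="..."> anchors are targets as well.
        if tag == "a" and attributes.get("name"):
            self.targets.add(attributes["name"])


def missing_targets(html_text):
    """Return the fragment names that no element in html_text defines."""
    audit = AnchorAudit()
    audit.feed(html_text)
    audit.close()
    return sorted(audit.fragments - audit.targets)


# Shortened sample based on the markup in this issue (assumption: the
# "others" section exists in the real document but is elided here).
sample = """
<ol>
  <li><a href="#project"><span>Project Details</span></a></li>
  <li><a href="#others"><span>Others </span></a></li>
</ol>
<a name="project" />
<div id="project"><h3 class="heading">Project Details</h3></div>
"""
print(missing_targets(sample))
```

On this shortened sample the check reports `others`, only because that section is elided in the excerpt. An empty list on the full document would suggest the HTML is fine and the links are lost later in the pipeline.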

@prabhakaran8737 prabhakaran8737 changed the title from "Internal links not after processing weasyprint generated html to pdf file" to "Internal links not working after processing weasyprint generated html to pdf file" Feb 3, 2025
@liZe
Member

liZe commented Feb 3, 2025

Hi!

If the PDF generated by WeasyPrint includes links that work, and if the links disappear when the PDFs are merged with iText, then the bug is in iText, isn’t it?

There are different valid ways of handling links in PDF, and I suppose that iText doesn’t support all of them.

@prabhakaran8737
Author

@liZe thank you for the response. I understand your perspective, but when I use Docraptor to generate the PDF from the same HTML, iText handles it correctly. This is what confuses me.

@liZe
Member

liZe commented Feb 3, 2025

I understand your perspective, but when I use Docraptor to generate the PDF from the same HTML, iText handles it correctly. This is what confuses me.

Yes, it probably means that Docraptor uses another way to create links, and that this way works with iText. Or maybe there’s an option you can give to iText to keep these links. Or maybe there’s a bug in iText with your documents. Or maybe something else.

As long as we don’t know why iText doesn’t keep these links, there’s no way for us to do anything.

@prabhakaran8737
Author

Okay, thanks. I have come across three ways to generate links during the conversion:

  1. Annotation
  2. Bookmark
  3. Named destination

May I know which of the above is used by Weasyprint?

@liZe
Member

liZe commented Feb 3, 2025

May I know which of the above is used by Weasyprint?

Annotations
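
In the PDF format these three items are not strict alternatives: a link annotation carries its target either in a `/Dest` entry or in a `/A` GoTo action, that target can be an explicit destination (an array starting with a page object) or a named destination looked up in a separate table, and bookmarks (the document outline) are a different structure that can use the same destinations. The sketch below models this with plain Python dictionaries. It is not a PDF parser, `describe_link` is a name made up for this illustration, and nothing in this thread confirms which form WeasyPrint writes or why iText drops the links. The "unresolved" case only illustrates one way a link could stop working if a tool copied the annotations but not the table of names.

```python
def describe_link(annotation, named_destinations):
    """Classify how a link annotation (modelled as a dict) reaches its target."""
    dest = annotation.get("/Dest")
    action = annotation.get("/A")
    if dest is None and action is not None and action.get("/S") == "/GoTo":
        dest = action.get("/D")  # GoTo actions carry their destination in /D
    if dest is None:
        return "no internal destination"
    if isinstance(dest, list):
        # Explicit destination: [page, /XYZ, left, top, zoom] and similar forms.
        return "explicit destination on page object %r" % dest[0]
    if dest in named_destinations:
        return "named destination %r, resolved" % dest
    return "named destination %r, unresolved" % dest


# Hypothetical data: "page-2" stands in for a reference to a page object.
names = {"project": ["page-2", "/XYZ", 0, 792, 0]}
by_name = {"/Subtype": "/Link", "/Dest": "project"}
explicit = {"/Subtype": "/Link", "/Dest": ["page-2", "/XYZ", 0, 792, 0]}
via_action = {"/Subtype": "/Link", "/A": {"/S": "/GoTo", "/D": "project"}}

print(describe_link(by_name, names))   # name found in the table
print(describe_link(explicit, names))  # no table lookup needed
print(describe_link(via_action, {}))   # same name, but the table is missing
```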

@prabhakaran8737
Author

Okay thank you.

@prabhakaran8737
Author

Is there any way in WeasyPrint to switch from annotations to named destinations or bookmarks? Are there any attributes we can change to test it?

@liZe
Member

liZe commented Feb 3, 2025

Is there any way in WeasyPrint to switch from annotations to named destinations or bookmarks? Are there any attributes we can change to test it?

I’m not sure it’s a good idea. What would happen if someone else reports that another third-party tool supports annotations, but not named destinations or bookmarks?

Why not add a feature to iText instead of fixing a non-bug in WeasyPrint? Or maybe use a tool other than iText that merges PDFs and keeps links (maybe Ghostscript or Poppler does that).
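
As a starting point for trying those tools, here is a minimal sketch that builds the merge command for Poppler's `pdfunite` or for Ghostscript's `pdfwrite` device and runs it if the executable is installed. The helper names `merge_command` and `merge` and the file names are made up for this illustration. Whether internal links survive either merge has not been tested here and would need to be checked on the real documents.

```python
import shutil
import subprocess


def merge_command(tool, inputs, output):
    """Build the command line that merges the input PDFs into output."""
    if tool == "pdfunite":
        # Poppler: pdfunite source1.pdf ... sourceN.pdf destination.pdf
        return ["pdfunite", *inputs, output]
    if tool == "gs":
        # Ghostscript: rewrite all inputs through the pdfwrite device.
        return ["gs", "-dBATCH", "-dNOPAUSE", "-q",
                "-sDEVICE=pdfwrite", "-sOutputFile=" + output, *inputs]
    raise ValueError("unknown tool: %s" % tool)


def merge(tool, inputs, output):
    """Run the merge, failing early if the executable is not on PATH."""
    command = merge_command(tool, inputs, output)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(command[0] + " is not installed")
    subprocess.run(command, check=True)


# Hypothetical file names, printed rather than executed.
print(merge_command("pdfunite", ["report.pdf", "attachment.pdf"], "merged.pdf"))
print(merge_command("gs", ["report.pdf", "attachment.pdf"], "merged.pdf"))
```

Opening the merged file and clicking a table of contents entry is then the quickest way to see whether the links were kept.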

@prabhakaran8737
Author

Okay thank you.

@liZe
Member

liZe commented Feb 3, 2025

Okay thank you.

Before closing, do you know if there’s a bug tracker for iText, so that we can add a link for other users who may have the same problem?

@prabhakaran8737
Author

I'll check today. If I find one, I will definitely post it here.


2 participants