Render PDF as an image and extract hyperlinks

I use ImageMagick to render a PDF (generated by pdfLaTeX) as an image:

convert -density 120 test.pdf -trim test.png

Then I use this image in an HTML file (to include LaTeX code in my own wiki engine).

But, of course, the PNG file does not have the hyperlinks contained in the PDF file.

Is it possible to retrieve the coordinates and destination URLs of hyperlinks, so I can create an HTML map?

In case it matters: I only need external (http://) hyperlinks, as well as hyperlinks to PDF-internal content. A text-based solution, such as pdftohtml, would be unacceptable, since the PDF files contain graphics and formatting.

2 answers

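ImageMagick uses Ghostscript to render the PDF. Ghostscript does process Link annotations. When the output device is pdfwrite, it converts them to pdfmarks so that the links are preserved in the output PDF; for other devices, they are ignored.

It would be possible to change the PostScript code so that it writes this information to a file, or to stdout, instead.

In gs/Resource/Init, the file pdf_main.ps contains much of the code for processing PDF files. In it you will find: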

  /Link {
    mark exch
    dup /BS knownoget { << exch { oforce } forall >> /BS exch 3 -1 roll } if
    dup /F knownoget { /F exch 3 -1 roll } if
    dup /C knownoget { /Color exch 3 -1 roll } if
    dup /Rect knownoget { /Rect exch 3 -1 roll } if
    dup /Border knownoget {
....
    } if
    { linkdest } stopped 

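(This handles the Link annotations in the PDF.) "linkdest" is a PostScript routine that handles the link destination; you could modify it to write the data to a file or to stdout. Note that you will need to set -dDOPDFMARKS on the command line; otherwise, for devices that cannot use pdfmarks, this processing is skipped.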
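Once the link rectangles and URLs have been written out, they still have to be converted from PDF coordinates to pixels before they can go into an HTML map. The following Python is only a rough sketch of that step, not part of Ghostscript or ImageMagick. It assumes each link is available as a /Rect in PDF points (1/72 inch, origin at the bottom-left of the page) together with its URL, that the page height in points is known, and that the image was rendered at -density 120. The function names, the map name "pdflinks" and the `offset` parameter are hypothetical. Because -trim crops the image, `offset` stands for the number of pixels removed from the left and top edges, which would have to be determined separately.

```python
from html import escape


def pdf_rect_to_pixels(rect, page_height_pt, density=120, offset=(0, 0)):
    """Convert a PDF /Rect (llx, lly, urx, ury) in points to pixel
    coordinates (left, top, right, bottom) in the rendered image.

    PDF space has its origin at the bottom-left; image space has it at
    the top-left, so the y axis is flipped using the page height.
    offset is the (x, y) pixel amount cropped from the left/top edges.
    """
    llx, lly, urx, ury = rect
    dx, dy = offset
    left = round(min(llx, urx) * density / 72) - dx
    right = round(max(llx, urx) * density / 72) - dx
    top = round((page_height_pt - max(lly, ury)) * density / 72) - dy
    bottom = round((page_height_pt - min(lly, ury)) * density / 72) - dy
    return left, top, right, bottom


def build_image_map(links, page_height_pt, density=120, offset=(0, 0),
                    name="pdflinks"):
    """Build an HTML <map> element from (rect, url) pairs."""
    areas = []
    for rect, url in links:
        box = pdf_rect_to_pixels(rect, page_height_pt, density, offset)
        coords = ",".join(str(v) for v in box)
        href = escape(url, quote=True)  # escape &, quotes, etc. for HTML
        areas.append(f'  <area shape="rect" coords="{coords}" href="{href}">')
    return "\n".join([f'<map name="{name}">', *areas, "</map>"])


if __name__ == "__main__":
    # Hypothetical input: one link on a 792 pt high (US Letter) page.
    links = [((72, 720, 144, 734.4), "http://example.com/")]
    print(build_image_map(links, page_height_pt=792))
```

The resulting map would then be referenced from the image with usemap="#pdflinks" in the HTML file.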
