Replace an image in a PDF using the command line

I need to process some PDF files. The task consists of replacing a given image with another one. My first problem is how to replace an image in a PDF from the command line, in a batch process. Later I'll address other problems, such as identifying which image I need to replace (the PDF files may contain more than one image). But first I want to solve the first problem: how to replace one image in a PDF with another.

I've read about poppler-utils and pdftk but, as far as I know, neither of these tools can replace images in a PDF.

asked Jun 8 '15 at 11:31

Ivan

If you find an answer, it would be really interesting to know. After isolating the "problem page", you could use ImageMagick to insert one image into another and then convert the result back to PDF: imagemagick.org/Usage/layers Also: superuser.com/questions/614784/…

– Konstantinos
Jun 10 '15 at 0:00

Thanks @pidosaurus. I was considering this option, but it has a big problem: it means converting the PDF (or the signature page) to images. The resulting PDF would be much bigger, and the user could no longer select a piece of text to copy and paste, for instance.

– Ivan
Jun 10 '15 at 13:33

Look up convert from ImageMagick and, more so, the tools that ooconv from OpenOffice (now LibreOffice, actually) provides. I once hired someone to write a PPT-to-PDF converter and these were the tools used.

– math
Sep 14 '17 at 14:06

command-line pdf images

1 Answer

OK ... I think pdflatex is the missing piece here.

The OP said he has looked into poppler-utils and pdftk. Let me add pdfimages (itself part of poppler-utils) to that. These, together with pdflatex, are the pieces of a solution.

pdfimages -f 4 -l 20 -j -png target.pdf imageroot

In the example above, pdfimages looks through pages 4 to 20 of target.pdf and extracts all images into files whose names begin with imageroot. With -j, JPEG images are written out as JPEG files; with -png, the remaining images are written as PNG.
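To work out which image is the one to replace, it may help to list the images before extracting anything. A minimal sketch, assuming a poppler version whose pdfimages supports the -list option; the function name list_pdf_images is made up for illustration:

```shell
#!/bin/sh
# list_pdf_images FILE [FIRST_PAGE [LAST_PAGE]]
# Hypothetical helper: prints one row per embedded image (page, size,
# encoding, object ID, ...) without writing any image files.
list_pdf_images() {
    file=$1
    first=${2:-1}
    if [ -n "${3:-}" ]; then
        pdfimages -list -f "$first" -l "$3" "$file"
    else
        # No last page given: list from FIRST_PAGE to the end.
        pdfimages -list -f "$first" "$file"
    fi
}

# Usage (not run here):
#   list_pdf_images target.pdf 4 20
```

The page and size columns of that listing are usually enough to pick out the file that the extraction step will produce for the image in question.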

poppler-utils also provides pdftotext. I recommend the -layout option, which does a great job of keeping the document human-readable. In a script that takes the file name without its extension as the first argument:

pdftotext -layout "$1.pdf" "$1.txt"
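Since the question is about batch processing, the same call can be wrapped in a loop. This is only a sketch: the function name extract_text_batch and the one-directory layout are assumptions, not something from the question.

```shell
#!/bin/sh
# extract_text_batch [DIR]
# Hypothetical helper: runs pdftotext -layout on every *.pdf in DIR
# (default: current directory), writing NAME.txt next to NAME.pdf.
extract_text_batch() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when DIR contains no PDF files.
        [ -e "$pdf" ] || continue
        pdftotext -layout "$pdf" "${pdf%.pdf}.txt"
    done
}

# Usage (not run here):
#   extract_text_batch ./incoming
```

To pull the text of a single page instead, pdftotext accepts -f and -l (first and last page) in the same way pdfimages does.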

The OP's objection to the ImageMagick solution offered by pidosaurus is that an image does not have extractable text. With the utilities I outlined, the OP will have all the images as well as all the extracted text, with page numbers and contents retained by the -layout option. The OP could identify the correct page of text and put it into a .tex file that ends with an \includegraphics command referring to the replacement picture by filename. Running pdflatex on this gives a new single-page PDF that can be inserted into the rest of the document with pdftk. If you know where in the text of the original page the image resided, you can put the \includegraphics inside a figure environment with the [h] placement option and get the image in roughly the right place.
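The last two steps could be sketched roughly as follows. Everything named here (make_page_tex, splice_page, the file names) is hypothetical, and the sketch makes simplifying assumptions: the page text is plain enough to sit in a verbatim block, and the page being replaced is neither the first nor the last page of the document.

```shell
#!/bin/sh
# make_page_tex TEXT_FILE IMAGE_FILE
# Prints a minimal LaTeX document: the extracted page text, kept as laid
# out by pdftotext, followed by the replacement image.
make_page_tex() {
    text_file=$1
    image_file=$2
    printf '%s\n' '\documentclass{article}' \
                  '\usepackage{graphicx}' \
                  '\begin{document}' \
                  '\begin{verbatim}'
    cat "$text_file"
    printf '%s\n' '\end{verbatim}' \
                  '\begin{figure}[h]' \
                  "\\includegraphics[width=\\linewidth]{$image_file}" \
                  '\end{figure}' \
                  '\end{document}'
}

# splice_page ORIGINAL NEW_PAGE_PDF PAGE_NO OUTPUT
# Replaces page PAGE_NO of ORIGINAL with the first page of NEW_PAGE_PDF.
# Assumption: PAGE_NO is neither the first nor the last page.
splice_page() {
    prev=$(( $3 - 1 ))
    next=$(( $3 + 1 ))
    pdftk A="$1" B="$2" cat "A1-$prev" B1 "A$next-end" output "$4"
}

# Usage (not run here):
#   make_page_tex page5.txt replacement.png > page5.tex
#   pdflatex page5.tex
#   splice_page target.pdf page5.pdf 5 result.pdf
```

Note that the rebuilt page is typeset by LaTeX, so its fonts and layout will differ from the original page; only the text and the new image carry over.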

answered Nov 12 '17 at 18:36

Richard Sonnenfeld
