Replace an image in a PDF using the command line
I need to process some PDF files. The task consists of replacing a given image with another. My first problem is how to replace an image in a PDF from the command line in a batch process. Later I'll try to address other problems, such as how to identify which image I need to replace (because the PDF files may contain more than one image). But first I want to solve the first problem: how to replace an image in a PDF with another.
I've read about poppler-utils and pdftk but, as far as I know, neither of these tools can replace images in a PDF.
command-line pdf images
If you find an answer, it would be really interesting to know. After isolating the "problem page", you could use ImageMagick to insert one image into another and then convert it back to PDF: imagemagick.org/Usage/layers Also: superuser.com/questions/614784/…
– Konstantinos
Jun 10 '15 at 0:00
Thanks @pidosaurus. I was considering this option, but it has a big problem: it means converting the PDF (or the signature page) to images. That's a problem because the resulting PDF will be much bigger, and the user couldn't select a piece of text to copy and paste, for instance.
– Ivan
Jun 10 '15 at 13:33
Look up convert from ImageMagick and, more so, the tools that ooconv from OpenOffice (now LibreOffice, actually) provides. I once hired someone to write a PPT-to-PDF converter and these were the tools used.
– math
Sep 14 '17 at 14:06
asked Jun 8 '15 at 11:31
Ivan
1 Answer
OK ... I think pdflatex is the missing piece here.
The OP said he has looked into poppler-utils and pdftk. Let me add to that pdfimages, which is itself part of poppler-utils. These, together with pdflatex, are the pieces of a solution.
pdfimages -f 4 -l 20 -j -png target.pdf imageroot
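In the example above, pdfimages looks through pages 4 through 20 of target.pdf and extracts all images into files with names beginning with imageroot. With -j, images stored as JPEG are written out as JPEG files; -png writes the remaining images as PNG.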
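poppler-utils also provides pdftotext. I recommend the -layout option, which does a great job of keeping the document human-readable. In the command below, $1 is the name of the PDF without its .pdf extension, passed as the first argument to a script.
pdftotext -layout "$1.pdf" "$1.txt"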
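If only one page needs to be rebuilt, the text of that page alone can be pulled out. This is a small sketch rather than part of the original answer: it relies on the -f and -l (first page, last page) options of pdftotext, and the function name extract_page_text and the output file naming are assumptions chosen for illustration.

```shell
# Extract the text of one page of a PDF, keeping the physical layout.
extract_page_text() {
    base=$1   # PDF name without the .pdf extension
    page=$2   # page number to extract
    # -f and -l select the first and last page to convert (here the same page).
    pdftotext -layout -f "$page" -l "$page" "$base.pdf" "$base-page$page.txt"
}

# Example call (not executed here):
#   extract_page_text target 4    # writes target-page4.txt
```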
The OP's objection to the ImageMagick solution offered by pidosaurus is that an image does not have extractable text. With the utilities outlined above, the OP will have all the images as well as all the extracted text, and the -layout option keeps the contents of each page recognizable, with page breaks preserved. The OP could identify the correct page of text and chuck it into a .tex file which ends with an \includegraphics directive that refers to the replacement picture by filename. You then run pdflatex on this and end up with a new single-page .pdf to insert into the rest of your document with pdftk. If you know where in the text of the original page the image resided, you can put the \includegraphics at that point instead, or wrap it in a figure environment with the [h] placement option, and get the image in roughly the right place.
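The last two steps can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: the names write_page_tex and splice_ranges are hypothetical, the page text is assumed to be plain enough to sit in a verbatim environment, and the replacement image is assumed to be in a format pdflatex accepts (PNG, JPEG, or PDF).

```shell
# Print a one-page LaTeX document: the extracted text, then the new image.
write_page_tex() {
    text_file=$1
    image_file=$2
    # tr removes the form feeds that pdftotext puts at page breaks.
    cat <<EOF
\documentclass{article}
\usepackage{graphicx}
\pagestyle{empty}
\begin{document}
\begin{verbatim}
$(tr -d '\f' < "$text_file")
\end{verbatim}
\begin{figure}[h]
\centering
\includegraphics{$image_file}
\end{figure}
\end{document}
EOF
}

# Print the pdftk page ranges that replace page N of input A (the original,
# with TOTAL pages) by page 1 of input B (the rebuilt page).
splice_ranges() {
    page=$1
    total=$2
    ranges=""
    if [ "$page" -gt 1 ]; then
        ranges="A1-$((page - 1)) "
    fi
    ranges="${ranges}B1"
    if [ "$page" -lt "$total" ]; then
        ranges="$ranges A$((page + 1))-end"
    fi
    echo "$ranges"
}

# Example (not executed here), replacing page 4 of a 20-page target.pdf:
#   write_page_tex target-page4.txt new-image.png > page4.tex
#   pdflatex page4.tex
#   pdftk A=target.pdf B=page4.pdf cat $(splice_ranges 4 20) output result.pdf
```

The splice_ranges helper exists only to handle the first and last page, where one of the two A ranges would otherwise be empty.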
answered Nov 12 '17 at 18:36
Richard Sonnenfeld