How to merge these commands into one?













This is what I want to do:

Convert a folder of HTML files into Markdown, carrying over the XML metadata of each HTML file by converting it into YAML.

I have done some research and come across the following commands:

  1. find . -name \*.md -type f -exec pandoc -o {}.txt {} \;

    • This was found here. It is a command that works and uses pandoc, however the resulting file extensions are ".html.md", not ".md".

  2. find / -name "*.md" -type f -exec sh -c 'markdown "$0" > "${0%.md}.html"' {} \;

    • This was found here. This apparently takes away the ".html.md" and turns it into ".md", but it does not use pandoc.

  3. pandoc -f html -t markdown -s input.html -o output.md

    • This was found here. This is the pandoc command that apparently copies over the metadata and turns it into YAML, however it does not work on a folder of files, only on one.

What I need is a single command that uses pandoc, gives the converted files the ".md" extension rather than ".html.md", and converts the XML metadata into YAML. All of this can be achieved with these three commands; they just need to be merged into one.

































  • Break this down for us a little, please. (1) What are your input filenames like: a.html, b.html.md, c.md, or a mixture? (2) For each individual input file, what command(s) do you want/need to run, and what do you want the output files to be called? (If you don't know the answer to (2), focus on researching that before you muddy the issue by trying to determine how to process multiple files.)
    – Scott
    Mar 14 '15 at 4:08










  • (1) They are all a.html. (2) Convert a.html into a.md, which includes converting the XML metadata in the header of a.html into YAML to be used as a.md's front matter.
    – st john smith
    Mar 14 '15 at 6:37










  • (1) I trust you can see that «the file extensions are ".html.md" not ".md"» is confusing, if, in fact, the file extensions are all ".html". (2) I said, "what command(s) do you want/need to run". Upon rereading your question, I guess you're implying that you want to use pandoc. I've never heard of pandoc, so I didn't know that it does both of the functions that you want (convert HTML and copy/convert XML metadata), and your references to "three commands" confused me. (3) Comments are a bad place for clarifications. Improve your question by editing it.
    – Scott
    Mar 14 '15 at 6:53














bash find markdown pandoc yaml














edited Nov 18 at 6:52 by Rui F Ribeiro

asked Mar 14 '15 at 3:35 by st john smith





















1 Answer (accepted)










What you need is xargs. I am not familiar with pandoc, but something like this should work:



$ find . -name '*.html' -type f | sed 's/\.html$//' | xargs -I {} pandoc -f html -t markdown -s -o "{}.md" "{}.html"


This uses find to list all the .html files in your chosen directory (and any subdirectories). The names are piped to sed, which strips the '.html' extension, and then to xargs, which feeds them one by one to pandoc. pandoc (if I have used the syntax correctly) then takes each name (substituted for {}), uses the .html file as its source, and writes a new file with the .md extension in the same directory as the source file.
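To check what the sed stage hands on before any file is written, the name mapping can be previewed on its own. This is only a minimal sketch: preview_names is a hypothetical helper name, and it assumes file names without embedded newlines.

```shell
# preview_names is a hypothetical helper, not part of find, sed, xargs or pandoc.
# It reads file names on stdin, strips a trailing ".html" with sed, and prints
# the source/destination pair that a conversion would use.
preview_names() {
    sed 's/\.html$//' | while IFS= read -r base; do
        printf '%s.html -> %s.md\n' "$base" "$base"
    done
}
```

It could be used as: find . -name '*.html' -type f | preview_names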



You should end up with your original html files and an equal number of matching md files in the same directory.
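An alternative that avoids the sed and xargs stages is to let find start a small shell and strip the suffix with parameter expansion. The following is a sketch under assumptions rather than a tested recipe: convert_html_tree and the PANDOC override variable are hypothetical names introduced here, and the pandoc options are simply the ones from the question.

```shell
# convert_html_tree is a hypothetical helper name. It converts every .html
# file under the given directory (default: current directory) to a sibling
# .md file with the same base name.
convert_html_tree() {
    dir=${1:-.}
    # PANDOC is an assumed override hook: set it to "echo pandoc" for a dry run.
    find "$dir" -type f -name '*.html' -exec sh -c '
        pandoc_cmd=$1; shift
        for src do
            dest=${src%.html}.md    # strip the .html suffix, append .md
            # $pandoc_cmd is left unquoted on purpose so "echo pandoc" splits
            $pandoc_cmd -f html -t markdown -s -o "$dest" "$src"
        done
    ' sh "${PANDOC:-pandoc}" {} +
}
```

Running PANDOC='echo pandoc' convert_html_tree . prints the commands instead of executing them; convert_html_tree . performs the conversion. Because find passes the names directly as arguments, file names containing spaces, quotes, or newlines should not need special handling.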


























  • This seems to have worked! Thank you so, so much! Honestly, you don't know how much this has helped me; I cannot thank you enough. (Sorry for the late reply.)
    – st john smith
    Mar 20 '15 at 18:23










answered Mar 14 '15 at 13:33 by gogoud










