How to merge these commands into one?

This is what I am wanting to do:
Convert a folder of HTML files into Markdown, also copying over the XML metadata of each HTML file by converting it into YAML.
I have done research and come across the following commands:
find . -name \*.md -type f -exec pandoc -o {}.txt {} \;
This was found here, and it is a command that works and uses pandoc; however, the file extensions are ".html.md" not ".md".
find / -name "*.md" -type f -exec sh -c 'markdown "${0}" > "${0%.md}.html"' {} \;
This was found here. It apparently takes away the ".html.md" and turns it into ".md", but it does not use pandoc.
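Suffix stripping of this kind is usually done with the shell's `${var%suffix}` parameter expansion. A minimal sketch of what it does to a file name follows; `strip_md_suffix` is a made-up helper name used only for illustration.

```shell
# strip_md_suffix is a hypothetical helper, used only to illustrate the expansion.
strip_md_suffix() {
    # ${1%.md} removes a trailing ".md" from the first argument, if present
    printf '%s\n' "${1%.md}"
}

strip_md_suffix "notes.html.md"   # prints: notes.html
strip_md_suffix "notes.html"      # no ".md" suffix, so the name is printed unchanged
```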
pandoc -f html -t markdown -s input.html -o output.md
This was found here. This is the pandoc command that apparently copies over the metadata and turns it into YAML; however, it does not work on a folder of files, only on one.
What I need is a single command that uses pandoc, gives the converted files the ".md" extension rather than ".html.md", and converts the XML metadata into YAML. All of this can be achieved with these three commands; they just need to be merged into one.
bash find markdown pandoc yaml
Break this down for us a little, please. (1) What are your input filenames like: a.html, b.html.md, c.md, or a mixture? (2) For each individual input file, what command(s) do you want/need to run, and what do you want the output files to be called? (If you don't know the answer to (2), focus on researching that before you muddy the issue by trying to determine how to process multiple files.)
– Scott
Mar 14 '15 at 4:08
(1) They are all a.html. (2) Convert a.html into a.md, which includes converting the XML metadata in the header of a.html into YAML to be used as a.md's front matter.
– st john smith
Mar 14 '15 at 6:37
(1) I trust you can see that «the file extensions are ".html.md" not ".md"» is confusing, if, in fact, the file extensions are all ".html". (2) I said, "what command(s) do you want/need to run". Upon rereading your question, I guess you're implying that you want to use pandoc. I've never heard of pandoc, so I didn't know that it does both of the functions that you want (convert HTML and copy/convert XML metadata), and your references to "three commands" confused me. (3) Comments are a bad place for clarifications. Improve your question by editing it.
– Scott
Mar 14 '15 at 6:53
edited Nov 18 at 6:52
Rui F Ribeiro
asked Mar 14 '15 at 3:35
st john smith
1 Answer
accepted
What you need is xargs. I am not familiar with pandoc, but something like this should work:
$ find . -name '*.html' -type f | sed 's/\.html$//' | xargs -I {} pandoc -f html -t markdown -s -o "{}.md" "{}.html"
This uses find to list all the .html files in your chosen directory (and any subdirectories). The names are piped to sed, which strips off the '.html' extension, and then to xargs, which feeds them one by one into pandoc. pandoc (if I have used the syntax correctly) then takes each name (substituted for {}), uses the matching .html file as the source, and writes a new file with the .md extension in the same directory as the source file.
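To check what such a pipeline would run before touching any files, one option is to put `echo` in front of pandoc so that xargs only prints each command line. This is a sketch rather than a tested recipe for your folder; `preview_commands` is a made-up helper name, and the two file names fed to it are examples.

```shell
# Dry run: print the pandoc command lines instead of executing them.
# preview_commands is a hypothetical helper; it reads one file name per line on stdin.
preview_commands() {
    # sed drops the trailing ".html"; xargs -I {} substitutes each name for {}
    sed 's/\.html$//' |
        xargs -I {} echo pandoc -f html -t markdown -s -o "{}.md" "{}.html"
}

# Example input; in practice the names would come from find.
printf '%s\n' ./a.html ./docs/b.html | preview_commands
# prints:
#   pandoc -f html -t markdown -s -o ./a.md ./a.html
#   pandoc -f html -t markdown -s -o ./docs/b.md ./docs/b.html
```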
You should end up with your original html files and an equal number of matching md files in the same directory.
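If some file names contain spaces or quotes, a variant that avoids passing names through sed and xargs may be more robust. The sketch below is an alternative, not what was tested above: it lets find hand the names directly to a small sh loop and strips the extension with `${src%.html}`. `convert_html_tree` is a made-up function name, and the pandoc options are the ones from the question.

```shell
# Sketch: convert every .html file under a directory to a sibling .md file.
# convert_html_tree is a hypothetical helper name, not part of pandoc or find.
convert_html_tree() {
    # $1: directory to search (defaults to the current directory)
    find "${1:-.}" -type f -name '*.html' -exec sh -c '
        for src do
            # ${src%.html} removes the trailing ".html" from the path
            pandoc -f html -t markdown -s -o "${src%.html}.md" "$src"
        done
    ' sh {} +
}
```

It would be called as `convert_html_tree path/to/folder`, leaving the original .html files in place.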
This seems to have worked! Thank you so so much! Honestly, you don't know how much this has helped me; I cannot thank you enough. (Sorry for the late reply.)
– st john smith
Mar 20 '15 at 18:23
answered Mar 14 '15 at 13:33
gogoud