Find only GUIDs in file - Bash

I have a file that might contain GUIDs (in their canonical textual representation).

I want to perform an action for each GUID in the file. It might contain any number of GUIDs.

I already have a file ready for reading. How do I spot the GUIDs?

I know I need to use while read FILENAME

An example of my file:



GUIDs
--------------------------------------
cf6e328c-c918-4d2f-80d3-71ecaf09bf7b
91d523b0-4926-456e-a9d2-ade713f5b07f
(2 rows)
// THERE IS AN EMPTY LINE HERE AFTER NUMBER OF ROWS









  • Post your sample file.

    – Tuyen Pham
    Jan 15 at 7:44











  • You're looking for any digit(s) from 0 to 10k, in any format? Or what exactly?

    – Xen2050
    Jan 15 at 7:46











  • I wrote a file as example

    – MathEnthusiast
    Jan 15 at 7:47











  • What's the action you want to perform? It alters the possible solution

    – roaima
    Jan 15 at 7:49











  • I need to run a command and then wait 5 seconds

    – MathEnthusiast
    Jan 15 at 7:50















Tags: bash, shell-script, scripting, wildcards






asked Jan 15 at 7:41 by MathEnthusiast; edited Jan 15 at 8:04 by Stéphane Chazelas












2 Answers
With the GNU implementation of grep (or compatible):

<your-file grep -Ewo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}' |
  while IFS= read -r guid; do
    your-action "$guid"
    sleep 5
  done

That finds GUIDs wherever they are in the input, provided they are neither preceded nor followed by word characters.

GNU grep has a -o option that prints only the non-empty matches of the regular expression, each on its own line.

-w is another non-standard extension, coming I believe from SysV, to match whole words only. A match counts only if it starts at a transition from a non-word to a word character and ends at a transition from a word to a non-word character (where word characters are alphanumerics or underscore). That guards against matching inside things like:

aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa

The rest is standard POSIX syntax. Note that [[:xdigit:]] matches ABCDEF as well. You can replace it with [0123456789abcdef] if you want to match only lowercase GUIDs.
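As a rough illustration of what -w changes, the sketch below runs the pattern with and without it on a copy of the sample file. The extra line starting with deadbeef0 is made up for the demonstration (its first and last groups are too long to be a GUID), and the temporary file and variable names are assumptions, not part of the question.

```shell
#!/usr/bin/env bash
# Sketch: extract GUIDs with the 8-4-4-4-12 pattern and compare -w against no -w.
# Assumes a grep that supports -o and -w (GNU grep does; both are extensions to POSIX).

sample=$(mktemp)
cat > "$sample" <<'EOF'
GUIDs
--------------------------------------
cf6e328c-c918-4d2f-80d3-71ecaf09bf7b
91d523b0-4926-456e-a9d2-ade713f5b07f
deadbeef0-aaaa-bbbb-cccc-0123456789abcdef
(2 rows)

EOF

# 8 hex digits, three groups of 4, then 12, separated by hyphens.
guid_re='[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}'

# With -w the made-up line is rejected: the candidate match is surrounded by hex digits.
with_w=$(grep -Ewo "$guid_re" "$sample")

# Without -w a GUID-shaped substring is cut out of the middle of the made-up line.
without_w=$(grep -Eo "$guid_re" "$sample")

printf 'with -w:\n%s\n' "$with_w"
printf 'without -w:\n%s\n' "$without_w"

rm -f "$sample"
```

With -w only the two real GUIDs should come out; without it, the substring eadbeef0-aaaa-bbbb-cccc-0123456789ab is reported as well.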






  • Can you please explain? What is that "<" at the beginning? Also, what are GNU tools? Can we assume my file name is GUIDS.TXT?

    – MathEnthusiast
    Jan 15 at 7:51












  • Also, what are GNU tools?

    – MathEnthusiast
    Jan 15 at 7:53











  • @MathEnthusiast, see edit. The GNU project is an effort by the Free Software Foundation to provide a FLOSS reimplementation of Unix. Some people confuse it with Linux, as GNU systems generally use Linux as their kernel. They have written extended versions of the Unix utilities (like grep here) which support extensions like that -o and < (< was in SysV grep before GNU's). GNU utilities are now more common than the original versions, and many other non-GNU implementations have copied some of the GNU extensions. In particular, -o is found in many other implementations.

    – Stéphane Chazelas
    Jan 15 at 8:01











  • @StéphaneChazelas, how do you guard against matching cf6e328c-c918-4d2f-80d3-71ecaf09bf7b-91d523b0-4926-456e-a9d2-ade713f5b07f? (i.e. some non-guid thing that looks like two guids joined by a hyphen)

    – Noach
    Jan 15 at 9:58











  • @StéphaneChazelas: What edge-case are you guarding for with the IFS= read -r vs. a simple read?

    – Noach
    Jan 15 at 10:01
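On the IFS= read -r question, a small sketch of the general difference may help. The input line here is invented for illustration (two leading spaces, a backslash, two trailing spaces); real grep -o output for GUIDs contains only hex digits and hyphens, so for this particular pipeline both forms would likely behave the same.

```shell
#!/usr/bin/env bash
# Sketch: how plain `read` and `IFS= read -r` treat the same input line.
# The line is made up: two leading spaces, "a", a backslash, "b", two trailing spaces.

line='  a\b  '

# Plain read: strips leading/trailing IFS whitespace and treats backslash as an escape.
plain=$(printf '%s\n' "$line" | { read var; printf '%s' "[$var]"; })

# IFS= keeps the surrounding whitespace; -r keeps the backslash literal.
exact=$(printf '%s\n' "$line" | { IFS= read -r var; printf '%s' "[$var]"; })

printf 'read:         %s\n' "$plain"
printf 'IFS= read -r: %s\n' "$exact"
```

The brackets are only there to make the preserved or stripped whitespace visible.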


















While I love regular expressions, I prefer to avoid over-specifying. For this particular data set (known data format, one GUID per line, plus header and footer), I'd just strip out the header and footer lines:

cat guids.txt | grep -Ev 'GUIDs|--|rows|^$' |
while read guid; do
    some_command "$guid"
    sleep 5
done

Alternatively, I'd grep for the lines I want, but still keep the regexp as simple as possible for the current data set:

grep -E '^[0-9a-f-]{36}$'
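To make the trade-off of the simpler pattern concrete, here is a small sketch that applies ^[0-9a-f-]{36}$ to a rebuilt copy of the sample file, then to a hypothetical line of exactly 36 hyphens. The 38-character separator length and all variable names are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch: the "exactly 36 characters from [0-9a-f-]" line filter,
# plus one non-GUID input it cannot tell apart from a GUID.

sample=$(mktemp)
{
  printf 'GUIDs\n'
  printf '%38s\n' '' | tr ' ' '-'   # separator line; 38 hyphens is an assumption
  printf 'cf6e328c-c918-4d2f-80d3-71ecaf09bf7b\n'
  printf '91d523b0-4926-456e-a9d2-ade713f5b07f\n'
  printf '(2 rows)\n\n'
} > "$sample"

simple_re='^[0-9a-f-]{36}$'

# Only the two GUID lines are 36 characters long and drawn from [0-9a-f-].
matched=$(grep -E "$simple_re" "$sample")
printf '%s\n' "$matched"

# Hypothetical input: a line of exactly 36 hyphens also satisfies the pattern.
fake=$(printf '%36s' '' | tr ' ' '-')
if printf '%s\n' "$fake" | grep -Eq "$simple_re"; then
  fake_passes=yes
else
  fake_passes=no
fi
printf 'line of 36 hyphens passes: %s\n' "$fake_passes"

rm -f "$sample"
```

For a file whose format is known and stable this looseness is probably acceptable, which is the point the answer makes; for arbitrary input the stricter 8-4-4-4-12 pattern is the safer choice.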






First answer: answered Jan 15 at 7:49 by Stéphane Chazelas, edited Jan 15 at 10:45












Second answer: answered Jan 15 at 9:56 by Noach, edited Jan 23 at 8:45


























