Find only GUIDs in file - Bash
I have a file that might contain GUIDs (their canonical textual representation).
I want to do an action for each GUID in the file. It might contain any number of GUIDs.
I already have the file ready for reading. How do I spot the GUIDs?
I know I need to use while read FILENAME
An example of my file:
GUIDs
--------------------------------------
cf6e328c-c918-4d2f-80d3-71ecaf09bf7b
91d523b0-4926-456e-a9d2-ade713f5b07f
(2 rows)
// THERE IS AN EMPTY LINE HERE AFTER NUMBER OF ROWS
bash shell-script scripting wildcards
asked Jan 15 at 7:41 by MathEnthusiast
edited Jan 15 at 8:04 by Stéphane Chazelas
Post your sample file.
– Tuyen Pham
Jan 15 at 7:44
You're looking for any digit(s) from 0 to 10k, in any format? Or what exactly
– Xen2050
Jan 15 at 7:46
I wrote a file as example
– MathEnthusiast
Jan 15 at 7:47
What's the action you want to perform? It alters the possible solution
– roaima
Jan 15 at 7:49
I need to run a command and then wait 5 seconds
– MathEnthusiast
Jan 15 at 7:50
2 Answers
With the GNU implementation of grep (or compatible):
    <your-file grep -Ewo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}' |
      while IFS= read -r guid; do
        your-action "$guid"
        sleep 5
      done
That finds those GUIDs wherever they are in the input (provided they are neither preceded nor followed by word characters).
GNU grep has a -o option that prints the non-empty matches of the regular expression.
-w is another non-standard extension, coming I believe from SysV, to match on whole words only. It matches only if the matched text starts at a transition from a non-word to a word character and ends at a transition from a word to a non-word character (where word characters are alphanumerics or underscore). That's to guard against matching on things like:
    aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa
The rest is standard POSIX syntax. Note that [[:xdigit:]] matches on ABCDEF as well. You can replace it with [0123456789abcdef] if you want to match only lower case GUIDs.
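To see what -w changes, here is a minimal sketch that runs the pattern above with and without it. It assumes a grep that supports -o and -w (GNU grep or a compatible one); the temporary file and the variable names are made up for illustration, and the sample content copies the file from the question plus the long "aaaa" line.

```shell
#!/usr/bin/env bash
# Throwaway sample: the question's file plus one GUID-lookalike line.
sample=$(mktemp)
printf '%s\n' \
  'GUIDs' \
  '--------------------------------------' \
  'cf6e328c-c918-4d2f-80d3-71ecaf09bf7b' \
  '91d523b0-4926-456e-a9d2-ade713f5b07f' \
  '(2 rows)' \
  '' \
  'aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa' > "$sample"

guid_re='[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}'

# -o only: a GUID-shaped substring is also cut out of the long "aaaa" line.
without_w=$(grep -Eo "$guid_re" "$sample" | wc -l | tr -d ' ')

# -w added: that substring is rejected because word characters surround it.
with_w=$(grep -Ewo "$guid_re" "$sample" | wc -l | tr -d ' ')

echo "matches without -w: $without_w"
echo "matches with -w:    $with_w"
rm -f "$sample"
```

With this sample, the run without -w should report one match more than the run with -w, the extra one being the fragment taken from the middle of the long line.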
answered Jan 15 at 7:49 by Stéphane Chazelas
edited Jan 15 at 10:45
Can you please explain? What is that "<" in the beginning ? Also - what is GNU tools ? Can we assume my file name is GUIDS.TXT ?
– MathEnthusiast
Jan 15 at 7:51
Also - what is GNU tools ?
– MathEnthusiast
Jan 15 at 7:53
@MathEnthusiast, see edit. The GNU project is an effort by the Free Software Foundation to provide a FLOSS reimplementation of Unix. Some people confuse it with Linux, as GNU systems generally use Linux as their kernel. They have written extended versions of the Unix utilities (like grep here) which support extensions like that -o and \< (\< was in SysV grep before GNU's). GNU utilities are now more common than the original versions, and many other non-GNU implementations have copied some of the GNU extensions. In particular, -o is found in many other implementations.
– Stéphane Chazelas
Jan 15 at 8:01
@StéphaneChazelas, how do you guard against matching cf6e328c-c918-4d2f-80d3-71ecaf09bf7b-91d523b0-4926-456e-a9d2-ade713f5b07f? (i.e. some non-GUID thing that looks like two GUIDs joined by a hyphen)
– Noach
Jan 15 at 9:58
@StéphaneChazelas: What edge case are you guarding against with IFS= read -r vs. a simple read?
– Noach
Jan 15 at 10:01
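On the IFS= read -r question just above, the thread itself gives no reply. As a hedged sketch of the general shell behaviour (not the answer author's own explanation): a plain read trims leading and trailing IFS whitespace and treats backslashes as escape characters, while IFS= read -r stores the line as written. The input line below is invented for illustration; GUIDs produced by grep -o would not normally contain such characters, so for this data the difference is mostly defensive habit.

```shell
#!/usr/bin/env bash
# Made-up input line with surrounding blanks and a backslash.
line='  a\tb  '

# Plain read: surrounding blanks are trimmed, the backslash is consumed.
plain=$(printf '%s\n' "$line" | { read var; printf '%s' "$var"; })

# IFS= read -r: the line is kept exactly as it was written.
exact=$(printf '%s\n' "$line" | { IFS= read -r var; printf '%s' "$var"; })

printf 'plain: [%s]\n' "$plain"
printf 'exact: [%s]\n' "$exact"
```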
While I love regular expressions, I prefer to avoid over-specifying.
For this particular data set (known data format, one GUID per line, plus header and footer), I'd just strip out the header and footer lines:
    $ cat guids.txt | egrep -v 'GUIDs|--|rows|^$' |
        while read guid ; do
            some_command "$guid"
            sleep 5
        done
Alternatively, I'd grep out the lines I want, but also keep the regexp as simple as possible for the current data set:
    egrep '^[0-9a-f-]{36}$'
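A small sketch of what the simpler pattern accepts, under stated assumptions: the sample file is rebuilt here with a 38-character rule line (an assumed length), the variable names are hypothetical, and grep -E stands in for egrep, which recent GNU grep releases flag as obsolescent. The pattern only checks for 36 characters drawn from lowercase hex digits and hyphens, so it keeps the two GUIDs but would also accept a line of exactly 36 hyphens.

```shell
#!/usr/bin/env bash
# Rebuild the question's file; the rule line length (38) is an assumption.
sample=$(mktemp)
rule=$(printf '%38s' '' | tr ' ' '-')
printf '%s\n' 'GUIDs' "$rule" \
  'cf6e328c-c918-4d2f-80d3-71ecaf09bf7b' \
  '91d523b0-4926-456e-a9d2-ade713f5b07f' \
  '(2 rows)' '' > "$sample"

simple_re='^[0-9a-f-]{36}$'

# Number of lines the simple pattern keeps from the sample file.
kept=$(grep -Ec "$simple_re" "$sample")

# The trade-off: 36 hyphens in a row also satisfy the pattern.
dashes=$(printf '%36s' '' | tr ' ' '-')
loose=$(printf '%s\n' "$dashes" | grep -Ec "$simple_re")

echo "lines kept from sample: $kept"
echo "36-hyphen line matched: $loose"
rm -f "$sample"
```

That looseness is the price of the shorter pattern; for a file whose format is known in advance it may well be acceptable.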
answered Jan 15 at 9:56 by Noach
edited Jan 23 at 8:45