Extract string followed by specific word/symbol
I have the two lines shown below in my input file input.txt, and I need to extract claimStartDate from the first line and claimEndDate from the second line.
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">
rm input.txt
awk '/<ProfessionalClaim/' test.xml | head -1 > input.txt
awk '/<ProfessionalClaim/' test.xml | tail -1 >> input.txt
awk '{match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]
      match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}' input.txt
shell-script awk ksh xml
Question needs to be completed.
– cagdas
Jan 24 at 7:22
F_LINE=<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
L_LINE=<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">
– Velava Shanmugam
Jan 24 at 7:23
Are these lines in a text file that you want to use as the input? Are there multiple F_LINE and L_LINE entries? What should your output look like? Please edit your question to add this information, and use the code button to present file contents and commands more clearly. Thanks!
– finswimmer
Jan 24 at 7:35
I have pulled these two lines from an XML file and use them as input to get the claimStartDate from F_LINE and the claimEndDate from L_LINE. I have updated the question now. Please let me know if you need any more details. Thanks!
– Velava Shanmugam
Jan 24 at 7:38
It would be appropriate and more efficient to use an XML parser (like XMLStarlet or a Perl/Python XML parser module) on the original XML document. You have not shown how these lines are part of the original document or how you parse them out.
– Kusalananda
Jan 24 at 7:41
edited Jan 24 at 16:51
asked Jan 24 at 7:17
Velava Shanmugam
1 Answer
$ awk '/F_LINE/ {match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]}
       /L_LINE/ {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}' input.txt
2018-04-02
2018-04-17
EDIT due to your new information:
$ awk 'NR==1 {match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]}
       NR==2 {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}' input.txt
2018-04-02
2018-04-17
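A note on portability: the three-argument form of match() used above is a GNU awk (gawk) extension, so it may not be available in the awk that ships with a ksh environment. The following is only a sketch of a portable variant, assuming the two sample lines from the question; the helper name attr is made up for illustration and uses only POSIX awk features (match, RSTART, RLENGTH, substr).

```shell
#!/bin/sh
# The two sample lines from the question, kept in a variable for the demo.
sample='<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">'

# attr LINE_NO NAME  (hypothetical helper, not part of the answer above)
# Reads stdin and prints the value of attribute NAME found on line LINE_NO.
attr() {
    awk -v n="$1" -v name="$2" '
        NR == n {
            pat = name "=\"[^\"]*\""    # e.g. claimStartDate="..."
            if (match($0, pat)) {
                # Skip NAME=" at the front and the closing quote at the end.
                print substr($0, RSTART + length(name) + 2, RLENGTH - length(name) - 3)
            }
        }'
}

printf '%s\n' "$sample" | attr 1 claimStartDate
printf '%s\n' "$sample" | attr 2 claimEndDate
```

With a real file, the same helper would be called as attr 1 claimStartDate < input.txt. The pattern is a plain substring match, so it assumes no other attribute name ends in the requested name.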
You can also do this all in one run:
$ grep "<ProfessionalClaim" test.xml \
  | sed -n '1p;$p' \
  | awk 'NR==1 {match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]}
         NR==2 {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}'
- grep finds all lines with <ProfessionalClaim in test.xml
- sed reduces those lines to the first and the last only
- awk prints claimStartDate for the first line and claimEndDate for the second line
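The grep, sed and awk steps can also be folded into a single awk process that remembers the first and the last matching line and extracts the values at the end. This is a rough sketch, not the answer's method: it assumes each ProfessionalClaim element sits on one line, and the function name first_last_dates, the Claims wrapper element and the middle sample line are invented for the demo.

```shell
#!/bin/sh
# Demo input standing in for test.xml. The <Claims> wrapper and the middle
# line are made up; the first and last lines are the ones from the question.
xml='<Claims>
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180415120000000" claimEndDate="2018-04-10" claimStartDate="2018-04-10" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">
</Claims>'

# first_last_dates [FILE...]  (hypothetical helper)
# Prints claimStartDate of the first and claimEndDate of the last
# <ProfessionalClaim line. Reads stdin when no file is given.
first_last_dates() {
    awk '
        # Return the value of attribute "name" in "line", or "" if absent.
        function value(line, name,    pat) {
            pat = name "=\"[^\"]*\""
            if (!match(line, pat)) return ""
            return substr(line, RSTART + length(name) + 2, RLENGTH - length(name) - 3)
        }
        /<ProfessionalClaim/ {
            if (first == "") first = $0   # remember the first matching line
            last = $0                     # overwritten until the last match
        }
        END {
            if (first != "") {
                print value(first, "claimStartDate")
                print value(last, "claimEndDate")
            }
        }' "$@"
}

printf '%s\n' "$xml" | first_last_dates
```

Against the real document the call would be first_last_dates test.xml. This reads the file once and avoids the intermediate input.txt.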
My inputs are in two string variables, F_LINE and L_LINE. What is this input.txt here?
– Velava Shanmugam
Jan 24 at 8:34
As you haven't specified how you pulled the two lines, I assumed in my example that they are in a new file called input.txt. If this is not the case, please add more information to your original post: how you extracted them and where you are starting from now (show some code, say what language you are using, ...).
– finswimmer
Jan 24 at 8:45
Earlier I was writing those two lines into two separate variables, called F_LINE and L_LINE:
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">
– Velava Shanmugam
Jan 24 at 16:25
I need only the claimStartDate from first line and claimEndDate from second line.
– Velava Shanmugam
Jan 24 at 16:34
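If the two lines are already held in shell variables, no temporary file or awk call is strictly needed. The sketch below uses only POSIX parameter expansion, which ksh supports; the helper name attr_of and the result variables claim_start and claim_end are assumptions made for this illustration.

```shell
#!/bin/sh
# The two variables as described in the comments above.
F_LINE='<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">'
L_LINE='<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">'

# attr_of LINE NAME  (hypothetical helper)
# Prints the value of attribute NAME in LINE; returns 1 if it is absent.
attr_of() {
    q='"'
    case $1 in
        *" $2=$q"*) ;;          # attribute present, carry on
        *) return 1 ;;
    esac
    rest=${1#*" $2=$q"}         # drop everything up to and including NAME="
    value=${rest%%"$q"*}        # drop everything from the next double quote on
    printf '%s\n' "$value"
}

claim_start=$(attr_of "$F_LINE" claimStartDate)
claim_end=$(attr_of "$L_LINE" claimEndDate)
printf '%s\n%s\n' "$claim_start" "$claim_end"
```

The leading space in the pattern keeps the match from starting in the middle of a longer attribute name; it assumes attributes are separated by single spaces, as in the sample lines.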
Thanks a lot, it's working fine! I also need to take one other field from the first and last line (ClaimProcessedDateTime). I am using the command below for that, but for some reason paid_stop is not getting populated.
grep "<ProfessionalClaim" test.xml | sed -n '1p;$p' | awk 'NR==1 {match($0, "claimProcessedDateTime=\"([^\"]+)\"", start); print "paid_start " start[1]} NR==2 {match($0, "ClaimProcessedDateTime=\"([^\"]+)\"", end); print "paid_stop " end[1]}'
– Velava Shanmugam
Jan 24 at 19:05
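A likely cause, judging from the sample lines: the attribute is spelled claimProcessedDateTime with a lowercase c, while the second pattern in the command above looks for ClaimProcessedDateTime. Regular expression matching in awk is case-sensitive by default, so the second match fails and end[1] stays empty. As a sketch under that assumption, here is the same extraction with the lowercase name in both places, written with sed so that it does not depend on gawk; the function name paid_range is made up.

```shell
#!/bin/sh
# First and last <ProfessionalClaim lines from the question, as demo input.
lines='<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">'

# paid_range  (hypothetical helper)
# Reads the matching lines on stdin and prints the claimProcessedDateTime
# of the first line as paid_start and of the last line as paid_stop.
paid_range() {
    sed -n \
        -e '1s/.* claimProcessedDateTime="\([^"]*\)".*/paid_start \1/p' \
        -e '$s/.* claimProcessedDateTime="\([^"]*\)".*/paid_stop \1/p'
}

printf '%s\n' "$lines" | paid_range
```

On the real file this would be fed from grep, as in grep "<ProfessionalClaim" test.xml | paid_range. If only one line matches, only paid_start is printed.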
edited Jan 24 at 17:59
answered Jan 24 at 7:45
finswimmer