How to extract data between two different xml tags

I have looked but haven't been able to find anyone else with the same sort of problem I have.

I have an xml file like this:

<ID>1</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed><ID>2</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed><ID>3</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed><ID>4</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>

Basically a whole bunch of data all on one line, no line breaks.
I need to extract the info (preferably as-is, with tags intact) between a specific <ID> tag (e.g. <ID>2</ID>) and the very next </dateAccessed> tag. I have about 50 files to check for a particular ID and the related data that follows it. I get that this is not standard; there is no nesting.

I originally tried to do this using grep and sed, but I just get the whole file returned, which seems odd to me. Can't I just treat this like a text file?

EDIT:

I didn't realise the formatter removed text enclosed in < and >, so after re-reading my question this morning, I realised it was asking something completely different.
TL;DR
I need what lies between a specific value in ID tags and the next closing dateAccessed tag, not what lies between a matching pair of opening and closing tags (i.e. between ID and /ID).

So I can get something like this result:

<ID>2</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>

edited Feb 27 '17 at 23:36

asked Feb 27 '17 at 1:37

averagescripter

1124

I can't help but feel this is the "wrong question". If you're working with XML files you should really be using an XML parser (such as xmlstarlet). I appreciate this won't give you an unbalanced segment, and so is not a suitable answer to your question as asked. But trying to treat XML as text will almost certainly lead to unintended consequences down the road. It's not a good place to be. Really.
– roaima
Oct 18 '17 at 7:14

The data is not well-formed XML. It's lacking a root node.
– Kusalananda
Jul 11 at 20:47

text-processing xml

4 Answers

Grep

grep -oE '<data>[^<]*</data>' yourxmlfile

Bash

tag='data'
tL="<$tag>" tR="</$tag>"
xml=$(< yourxmlfile)
while case $xml in *"$tL"* ) :;; * ) break;; esac; do
 t1=${xml#*"$tL"} t2=${t1%%"$tR"*} xml=${t1#*"$tR"}
 echo "$tL$t2$tR"
done

Perl

perl -lne "print for /<$tag>.*?<\/$tag>/g" yourxmlfile

Sed

sed -e "
 s|<$tag>|\n&|
 s/.*\n//
 s|</$tag>|&\n|
 /\n/P;D
" yourxmlfile

Output

 <data>asdf</data>
 <data>asdf</data>
 <data>asdf</data>
 <data>asdf</data>
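The commands above pull out every <data> element, which is not quite the span the question asks for. As a rough sketch only, the same non-greedy matching idea used in the Perl one-liner can be pointed at the span from a given <ID> value through the next </dateAccessed>. It is written in Python here; the function name `extract_record` and its parameters are made up for illustration, and the sample text is rebuilt from the question. This also hints at why a plain grep returned the whole file: grep prints whole matching lines unless -o is given, and the file is a single line.

```python
import re

# Sample input rebuilt from the question: one long line, no root element.
record = ("<ID>{n}</ID><data>asdf</data><data2>asdf</data2>"
          "<dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>")
xml_text = "".join(record.format(n=n) for n in range(1, 5))


def extract_record(text, wanted_id, open_tag="ID", close_tag="dateAccessed"):
    """Hypothetical helper: return the text from <open_tag>wanted_id</open_tag>
    through the next </close_tag>, or None if no such span exists."""
    pattern = re.compile(
        "<{0}>{1}</{0}>.*?</{2}>".format(
            re.escape(open_tag), re.escape(wanted_id), re.escape(close_tag)
        ),
        re.DOTALL,  # let '.' cross newlines in case a file does have line breaks
    )
    found = pattern.search(text)
    return found.group(0) if found else None


print(extract_record(xml_text, "2"))
```

Because the pattern spells out the complete <ID>2</ID> element, an ID such as 12 or 20 should not match by accident. For the 50 files mentioned in the question, the same function could be called once per file after reading each file into a string. It remains a text-level shortcut and shares the limitations of treating XML as text that the comments point out.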

edited Feb 27 '17 at 4:33

answered Feb 27 '17 at 3:48

Rakesh Sharma

62213

As noted in the comments, your data isn't well-formed XML and it isn't completely clear what the structure of your document is, e.g. judging by your example data, it looks like you have no nested elements - is that really the case?

With that caveat in mind, here's a Python script that uses the BeautifulSoup4 parsing library to do what you want (i.e. it produces the desired output data for the given example input data):

#!/usr/bin/env python
# coding: ascii
"""extract.py

Extract everything between two XML tags
in a (possibly poorly formed) XML document."""

from bs4 import BeautifulSoup
import sys

# Set the opening tag name and value
opening_name = "ID"
opening_text = "2"

# Set the closing tag name
closing_name = "dateAccessed"

# Get the XML data from a file and instantiate a BeautifulSoup parser
# We add a root node because the input data is missing a root
# (the 'xml' parser needs the lxml package to be installed)
with open(sys.argv[1], 'r') as xmlfile:
    xmldoc = "<root>" + xmlfile.read() + "</root>"
    soup = BeautifulSoup(xmldoc, 'xml')

# Iterate through the elements of the XML data and collect
# all of the elements in between the opening and closing tags
elements = []
match = False
for e in soup.find_all():
    if match is True:
        elements.append(str(e))
        if e.name == closing_name:
            break
    else:
        try:
            if e.name == opening_name and e.text == opening_text:
                match = True
                elements.append(str(e))
        except AttributeError:
            pass

# Output the results on a single line
print("".join(elements))

You would run it something like this:

python extract.py data.xml

For your given example data:

<ID>1</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed><ID>2</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed><ID>3</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed><ID>4</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>

It produces the following output:

<ID>2</ID><data>asdf</data><data2>asdf</data2><dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>
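If installing BeautifulSoup is not an option, the same wrap-in-a-root idea can probably be tried with the standard library's xml.etree.ElementTree instead. The following is a minimal sketch under that assumption, not a replacement for the script above: the function name `extract_with_parser` is invented for illustration, and it only looks at direct children of the added root, which fits the flat sample data but not nested documents.

```python
import xml.etree.ElementTree as ET

# Sample input rebuilt from the question: one long line, no root element.
record = ("<ID>{n}</ID><data>asdf</data><data2>asdf</data2>"
          "<dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>")
xml_text = "".join(record.format(n=n) for n in range(1, 5))


def extract_with_parser(text, wanted_id, open_tag="ID", close_tag="dateAccessed"):
    """Hypothetical helper: collect the elements from the matching opening
    element through the next closing element, serialized on one line."""
    # Wrap the fragment in a made-up <root> so the parser accepts it.
    root = ET.fromstring("<root>" + text + "</root>")
    collected = []
    collecting = False
    for child in root:  # direct children only; the sample has no nesting
        if (not collecting and child.tag == open_tag
                and (child.text or "").strip() == wanted_id):
            collecting = True
        if collecting:
            collected.append(ET.tostring(child, encoding="unicode"))
            if child.tag == close_tag:
                break
    return "".join(collected)


print(extract_with_parser(xml_text, "2"))
```

An empty string comes back when the requested ID is not present, so a caller looping over many files can skip those files.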

answered Jul 11 at 22:23

igal

4,992930

If you want to extract the ID value, and assuming ID always comes as the first tag, you can use this:

awk -F"[<>]" '{print $3}' input.txt

If you want to search for a specific tag, try this awk command, changing the value of input=ID as needed:

awk -F"[<>]" '{for(i=1;i<=NF;i++)if($i~input)print $(i+1);next}' input=ID input.txt
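To see what the [<>] field separator does to the sample data, here is a small illustrative sketch in Python that performs the same split. The helper `values_after` is hypothetical, and it deliberately compares field names exactly rather than with awk's regular-expression match, so closing tags such as /ID are not picked up.

```python
import re

# One record in the shape shown in the question.
line = "<ID>1</ID><data>asdf</data><dateAccessed>somedate</dateAccessed>"

# Equivalent of awk's -F"[<>]": split on every '<' or '>'.
fields = re.split(r"[<>]", line)
print(fields[:5])   # awk's $1..$5; awk counts from 1, Python from 0
print(fields[2])    # the field awk calls $3


def values_after(text, tag):
    """Hypothetical helper: values of the fields that directly follow a
    field exactly equal to `tag`."""
    parts = re.split(r"[<>]", text)
    return [parts[i + 1] for i in range(len(parts) - 1) if parts[i] == tag]


print(values_after(line, "data"))
```

The split shows why $3 holds the first ID value: the line starts with '<', so the first field is empty, the second is the tag name, and the third is the tag's content.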

answered Feb 27 '17 at 3:32

Kamaraj

2,9081513

The provided XML has no line breaks.
Why not insert \n between each >< pair, which puts every element on its own line?

Example:
I created a file called stack containing the given XML.

Below is the sed operation that introduces the line breaks (\n in the replacement is understood by GNU sed):

sed -e 's/></>\n</g' stack

Excerpt of the output, for ID 2:

<ID>2</ID>
<data>asdf</data>
<data2>asdf</data2>
<dataX>asdf</dataX>
<dateAccessed>somedate</dateAccessed>

Now you can access the tags you want.
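Once every element sits on its own line, picking out one record becomes a matter of selecting a run of lines. A rough sketch of that follow-up step, written in Python for illustration: `select_block` is a made-up helper, and the sample text is rebuilt from the question.

```python
# Sample input rebuilt from the question: one long line, no root element.
record = ("<ID>{n}</ID><data>asdf</data><data2>asdf</data2>"
          "<dataX>asdf</dataX><dateAccessed>somedate</dateAccessed>")
xml_text = "".join(record.format(n=n) for n in range(1, 5))

# Same effect as the sed substitution: break the text between '>' and '<'.
lines = xml_text.replace("><", ">\n<").split("\n")


def select_block(all_lines, start_line, end_prefix):
    """Hypothetical helper: lines from the first line equal to start_line
    through the next line that begins with end_prefix."""
    start = all_lines.index(start_line)  # raises ValueError if absent
    for end in range(start, len(all_lines)):
        if all_lines[end].startswith(end_prefix):
            return all_lines[start:end + 1]
    return all_lines[start:]


block = select_block(lines, "<ID>2</ID>", "<dateAccessed>")
print("\n".join(block))
```

Joining the selected lines with an empty string instead of a newline gives back the single-line form requested in the question.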

answered Oct 18 '17 at 7:08

user256118
