How to compare two XML files that have the same data on different lines?
I have two files that contain the same data, but with the records in a different order.
File 1:
<Identities>
  <Identity>
    <Id>048206031415072010Comcast.USR8JR</Id>
    <UID>ccp_test_79</UID>
    <DisplayName>JOSH CCP</DisplayName>
    <FirstName>JOSH</FirstName>
    <LastName>CCP</LastName>
    <Role>P</Role>
    <LoginStatus>C</LoginStatus>
  </Identity>
  <Identity>
    <Id>089612381523032011Comcast.USR1JR</Id>
    <UID>94701_account1</UID>
    <DisplayName>account1</DisplayName>
    <FirstName>account1</FirstName>
    <LastName>94701</LastName>
    <Role>S</Role>
    <LoginStatus>C</LoginStatus>
  </Identity>
</Identities>
File 2:
<Identities>
  <Identity>
    <Id>089612381523032011Comcast.USR1JR</Id>
    <UID>94701_account1</UID>
    <DisplayName>account1</DisplayName>
    <FirstName>account1</FirstName>
    <LastName>94701</LastName>
    <Role>S</Role>
    <LoginStatus>C</LoginStatus>
  </Identity>
  <Identity>
    <Id>048206031415072010Comcast.USR8JR</Id>
    <UID>ccp_test_79</UID>
    <DisplayName>JOSH CCP</DisplayName>
    <FirstName>JOSH</FirstName>
    <LastName>CCP</LastName>
    <Role>P</Role>
    <LoginStatus>C</LoginStatus>
  </Identity>
</Identities>
If I run diff file1 file2, I get the following output:
1,10d0
< <Identities>
< <Identity>
< <Id>048206031415072010Comcast.USR8JR</Id>
< <UID>ccp_test_79</UID>
< <DisplayName>JOSH CCP</DisplayName>
< <FirstName>JOSH</FirstName>
< <LastName>CCP</LastName>
< <Role>P</Role>
< <LoginStatus>C</LoginStatus>
< </Identity>
20a11,20
> <Identities>
> <Identity>
> <Id>048206031415072010Comcast.USR8JR</Id>
> <UID>ccp_test_79</UID>
> <DisplayName>JOSH CCP</DisplayName>
> <FirstName>JOSH</FirstName>
> <LastName>CCP</LastName>
> <Role>P</Role>
> <LoginStatus>C</LoginStatus>
> </Identity>
But I need the comparison to report no difference, because the files contain the same data, just on different lines.
bash shell xml file-comparison
edited Nov 21 at 14:23
Rui F Ribeiro
asked Feb 8 '13 at 17:05
user32026
If you sort both files line by line and compare the results, any difference proves they are not equal. Identical sorted output does not prove the files are really equal, though, because sorting destroys the XML structure.
– jofel
Feb 8 '13 at 17:18
I don't know how to solve it. The files differ only in order: file1 has a then b, and file2 has b then a. You can illustrate the problem with diff -y -B -Z -b --strip-trailing-cr file1 file2
– Yurij73
Feb 8 '13 at 18:14
You could try xmldiff, but I think that will still notice the order changing, as order is relevant in generic XML. I think your best approach is to use an XML parser and generator to put each file in a canonical order and format, then use xmldiff or diff. A job for your favorite scripting language (Perl, Ruby, Python, etc.).
– derobert
Feb 8 '13 at 19:47
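As a rough illustration of jofel's comment, here is a minimal Python sketch of the line-sorted check. The function names and the tiny sample documents are made up for this example. A False result proves the inputs differ, while True is only a hint: the third sample swaps values between records and still passes.

```python
def sorted_lines(text):
    """Return the non-blank lines of text, stripped and sorted."""
    return sorted(line.strip() for line in text.splitlines() if line.strip())


def may_be_equal(xml_a, xml_b):
    """False proves the inputs differ; True is only a hint, not proof."""
    return sorted_lines(xml_a) == sorted_lines(xml_b)


# Same records in a different order.
a = "<r>\n<i>\n<k>1</k>\n<v>x</v>\n</i>\n<i>\n<k>2</k>\n<v>y</v>\n</i>\n</r>"
b = "<r>\n<i>\n<k>2</k>\n<v>y</v>\n</i>\n<i>\n<k>1</k>\n<v>x</v>\n</i>\n</r>"
# Values swapped between the two records: the data is different,
# but the sorted lines are identical.
c = "<r>\n<i>\n<k>1</k>\n<v>y</v>\n</i>\n<i>\n<k>2</k>\n<v>x</v>\n</i>\n</r>"

print(may_be_equal(a, b))
print(may_be_equal(a, c))
```

Because of the false positive on the third sample, this is best used as a quick pre-check before a parser-based comparison.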
3 Answers
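You can achieve what you want with the help of a small Python script (you'll need Python installed, as well as the lxml toolkit).
tagsort.py:
#!/usr/bin/env python3
import sys
from lxml import etree
# Usage: tagsort.py FILENAME TAG
filename, tag = sys.argv[1:]
# remove_blank_text lets pretty_print re-indent the output consistently
doc = etree.parse(filename, etree.XMLParser(remove_blank_text=True))
root = doc.getroot()
# Reorder the root's children by the text of their TAG child element
root[:] = sorted(root, key=lambda el: el.findtext(tag) or "")
# encoding="unicode" returns str rather than bytes, so print() works on Python 3
print(etree.tostring(doc, pretty_print=True, encoding="unicode"))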
This script sorts the first-level elements under the XML document root by the content of a second-level element, sending the result to stdout. It's called like this:
$ python tagsort.py filename tag
Once you've got that, you can use process substitution to get a diff based on its output (I've added one element and changed another in your example files to show a non-empty result):
$ diff <(python tagsort.py file1 Id) <(python tagsort.py file2 Id)
4a5
> <AddedTag>Something</AddedTag>
17c18
< <Role>X</Role>
---
> <Role>S</Role>
answered Feb 14 '13 at 22:41
user27282
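If installing lxml is not an option, the same idea can probably be sketched with the standard library alone. The following is an illustration rather than a drop-in replacement for tagsort.py: the function name normalized and the shortened Id values are invented for the example, and it compares the two documents in-process instead of printing them for diff.

```python
import xml.etree.ElementTree as ET


def normalized(xml_text, tag):
    """Serialize xml_text with the root's children sorted by their <tag> text."""
    root = ET.fromstring(xml_text)
    # Drop whitespace-only text and tails so indentation cannot cause differences.
    for el in root.iter():
        if el.text is not None and not el.text.strip():
            el.text = None
        if el.tail is not None and not el.tail.strip():
            el.tail = None
    # Same sorting step as in tagsort.py, using the stdlib Element type.
    root[:] = sorted(root, key=lambda el: el.findtext(tag) or "")
    return ET.tostring(root, encoding="unicode")


# Hypothetical, shortened versions of the files from the question.
file1 = """<Identities>
  <Identity><Id>048</Id><Role>P</Role></Identity>
  <Identity><Id>089</Id><Role>S</Role></Identity>
</Identities>"""
file2 = """<Identities>
  <Identity><Id>089</Id><Role>S</Role></Identity>
  <Identity><Id>048</Id><Role>P</Role></Identity>
</Identities>"""

print(normalized(file1, "Id") == normalized(file2, "Id"))
```

Like the lxml script, this only reorders the direct children of the root; the order of fields inside each record still matters.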
I had a similar problem and I eventually found: https://superuser.com/questions/79920/how-can-i-diff-two-xml-files
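That post suggests canonicalizing the XML (C14N) and then doing a diff. Note that canonicalization normalizes details such as attribute order and empty-element syntax but keeps elements in document order, so records that appear in a different order will still show up in the diff. The following should work for you if you are on Linux or macOS, or on Windows with something like Cygwin installed: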
$ xmllint --c14n File1.xml > 1.xml
$ xmllint --c14n File2.xml > 2.xml
$ diff 1.xml 2.xml
edited Mar 20 '17 at 10:18
Community♦
answered Dec 2 '16 at 17:30
VenomFangs
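To see what canonicalization does and does not normalize, here is a small Python sketch. It uses xml.etree.ElementTree.canonicalize (Python 3.8 or later), which implements C14N 2.0, whereas xmllint --c14n applies the older C14N 1.0; the assumption here is that both behave the same way for the two properties shown. The sample documents are made up.

```python
from xml.etree.ElementTree import canonicalize  # available since Python 3.8

# Same element, different attribute order and empty-element syntax.
attrs_a = '<r><i b="2" a="1"/></r>'
attrs_b = '<r><i a="1" b="2"></i></r>'

# Same child elements, different order.
order_a = "<r><i>1</i><i>2</i></r>"
order_b = "<r><i>2</i><i>1</i></r>"

# Attribute order and empty-element syntax are normalized away...
print(canonicalize(attrs_a) == canonicalize(attrs_b))
# ...but sibling elements keep their document order.
print(canonicalize(order_a) == canonicalize(order_b))
```

So for files like the ones in the question, canonicalization is a useful cleanup step before diff, but the records still have to be sorted by some other means.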
It's tagged shell, but honestly I prefer using a scripting language with a parser. In this case perl with XML::Twig.
It goes something like this:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Report every Identity in $first that is missing from, or differs in, $second.
sub compare_by_identity {
    my ( $first, $second ) = @_;
    foreach my $identity ( $first->get_xpath('//Identity') ) {
        my $id = $identity->first_child_text('Id');
        print $id, "\n";
        # Look up the Identity with the same Id in the other document.
        my $compare_to =
          $second->get_xpath( "//Identity/Id[string()=\"$id\"]/..", 0 );
        if ($compare_to) {
            print "Matching element found for ID $id\n";
            foreach my $element ( $identity->children ) {
                my $tag  = $element->tag;
                my $text = $element->text;
                if ( $text ne $compare_to->first_child_text($tag) ) {
                    print "$id, $tag has value $text which doesn't match: ",
                      $compare_to->first_child_text($tag), "\n";
                }
            }
        }
        else {
            print "No matching element for Id $id\n";
        }
    }
}

my $first_file  = XML::Twig->new->parsefile('test1.xml');
my $second_file = XML::Twig->new->parsefile('test2.xml');
compare_by_identity( $first_file, $second_file );
compare_by_identity( $second_file, $first_file );
I'm explicitly comparing one 'Identity' element at a time, and checking that all the fields in one exist in the other with the same value.
Then I reverse the comparison, because the second file might have extra entries.
answered Dec 8 '16 at 11:55
Sobrique
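The record-by-record idea from the Perl answer above can also be sketched in Python with only the standard library. This is an illustration under stated assumptions, not a port of the Perl script: the names records_by_id and compare are invented, each Id is assumed to be unique, and each field is assumed to appear at most once per record. Indexing both files by Id covers both directions of the comparison in a single pass.

```python
import xml.etree.ElementTree as ET


def records_by_id(xml_text, record="Identity", key="Id"):
    """Map each record's key text to a {child tag: child text} dict."""
    root = ET.fromstring(xml_text)
    return {
        rec.findtext(key): {child.tag: (child.text or "") for child in rec}
        for rec in root.iter(record)
    }


def compare(xml_a, xml_b):
    """Return human-readable differences; an empty list means no difference."""
    a, b = records_by_id(xml_a), records_by_id(xml_b)
    problems = []
    for rec_id in sorted(a.keys() | b.keys(), key=str):
        if rec_id not in a or rec_id not in b:
            missing = "first" if rec_id not in a else "second"
            problems.append(f"{rec_id}: missing from {missing} input")
            continue
        # Compare every field that occurs on either side.
        for tag in sorted(a[rec_id].keys() | b[rec_id].keys()):
            left, right = a[rec_id].get(tag), b[rec_id].get(tag)
            if left != right:
                problems.append(f"{rec_id}/{tag}: {left!r} != {right!r}")
    return problems


# Hypothetical sample data: same records, reversed order, one changed Role.
one = ("<Identities><Identity><Id>1</Id><Role>P</Role></Identity>"
       "<Identity><Id>2</Id><Role>S</Role></Identity></Identities>")
two = ("<Identities><Identity><Id>2</Id><Role>X</Role></Identity>"
       "<Identity><Id>1</Id><Role>P</Role></Identity></Identities>")

for line in compare(one, two):
    print(line)
```

A script built on this could exit with a non-zero status when the list is non-empty, which makes it usable from a shell the same way diff is.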
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f64188%2fhow-to-compare-two-xml-files-having-same-data-in-different-lines%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
By sorting them linewise and comparing, you can check if they are not equal. Of course, equal after sorting does not mean that they are really equal as sorting destroys the XML syntax.
– jofel
Feb 8 '13 at 17:18
Don't know how to solve it. they differ by order in file1 a then b and in file2 b then a. you may expose question with diff -y -B -Z -b --strip-trailing-cr file1 file2
– Yurij73
Feb 8 '13 at 18:14
2
You could try
xmldiff
, but I think that will still notice the order changing, as order is relevant in generic XML. I think your best approach is to use an XML parser & generator to put each file in a canonical order and format, then usexmldiff
ordiff
. A job for your favorite scripting language (Perl, Ruby, Python, etc.).– derobert
Feb 8 '13 at 19:47