Script to translate XML to JSON









-3















I have 5000 questions in a text file, like this:



<quiz>
<que>The question her</que>
<ca>text</ca>
<ia>text</ia>
<ia>text</ia>
<ia>text</ia>
</quiz>


I want to write a script on Ubuntu that converts every question to this form:



{
"text":"The question her",
"answer1":"text",
"answer2":"text",
"answer3":"text",
"answer4":"text"
},

































  • That's better done in Python, Perl, etc. than in pure bash, so the OS is irrelevant; it's more of a Stack Overflow question.

    – dgan
    Mar 7 at 10:11











  • Is this XML to JSON?

    – Arkadiusz Drabczyk
    Mar 7 at 10:13












  • Yes, it is, Ark.

    – Silver
    Mar 7 at 10:14











  • What kind of assumptions can you make about the input? Are questions and answers always on separate lines? Are the <quiz> and </quiz> tags always alone on their lines? Could the text contain some encoding (CDATA, {, &eacute;, ...)? Could it contain double quotes or backslashes? Is the XML file encoded in UTF-8?

    – Stéphane Chazelas
    Mar 7 at 10:14











  • The file has encoding="UTF-8", and yes to all the questions, Stéphane.

    – Silver
    Mar 7 at 10:16


















ubuntu text-processing scripting xml json














edited Mar 7 at 10:28
Stéphane Chazelas

asked Mar 7 at 10:09
Silver























3 Answers


















2














I assume your Ubuntu has Python 3 installed.



#!/usr/bin/python3
import io
import json
import xml.etree.ElementTree

d = """<quiz>
<que>The question her</que>
<ca>text</ca>
<ia>text</ia>
<ia>text</ia>
<ia>text</ia>
</quiz>
"""

s = io.StringIO(d)
# To read from a file instead of the inline text:
# root = xml.etree.ElementTree.parse("filename_here").getroot()
root = xml.etree.ElementTree.parse(s).getroot()
out = {}
i = 1  # running number for the answer keys
for child in root:
    name, value = child.tag, child.text
    if name == 'que':
        name = 'question'
    else:
        name = 'answer%s' % i
        i += 1
    out[name] = value

print(json.dumps(out))


Save it and make it executable with chmod.
You can easily modify it to take a file as input instead of inline text.



EDIT
OK, this is a more complete script:



#!/usr/bin/python3
import json
import sys
import xml.etree.ElementTree


def read_file(filename):
    root = xml.etree.ElementTree.parse(filename).getroot()
    return root


# assume we have a list of <quiz>, contained in some other element
def parse_quiz(quiz_element, out):
    i = 1  # running number for the answer keys
    tmp = {}
    for child in quiz_element:

        name, value = child.tag, child.text
        if name == 'que':
            name = 'question'
        else:
            name = 'answer%s' % i
            i += 1
        tmp[name] = value
    out.append(tmp)


def parse_root(root_element, out):
    for child in root_element:
        if child.tag == 'quiz':
            parse_quiz(child, out)


def convert_xml_to_json(filename):
    root = read_file(filename)
    out = []
    parse_root(root, out)
    print(json.dumps(out))


if __name__ == '__main__':
    if len(sys.argv) > 1:
        convert_xml_to_json(sys.argv[1])
    else:
        print("Usage: script <filename_with_xml>")


I made a file with the following content, which I named xmltest:



<questions>
<quiz>
<que>The question her</que>
<ca>text</ca>
<ia>text</ia>
<ia>text</ia>
<ia>text</ia>
</quiz>
<quiz>
<que>Question number 1</que>
<ca>blabla</ca>
<ia>stuff</ia>
</quiz>
</questions>


So you have a list of <quiz> elements inside some other container element.



Now, I launch it like this:
$ chmod u+x scratch.py
$ ./scratch.py filenamewithxml



This gives me the answer:



$ ./scratch.py xmltest
[{"answer3": "text", "answer2": "text", "question": "The question her", "answer4": "text", "answer1": "text"}, {"answer2": "stuff", "question": "Question number 1", "answer1": "blabla"}]
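The question shows bare <quiz> blocks and asks for a "text" key, whereas the script above expects a wrapping element and emits "question". The following is only a minimal sketch of one way to cover that case. It assumes the input has no single root element and no XML declaration (so the blocks are wrapped in a synthetic <root> before parsing) and that the text is otherwise well-formed XML. The function name quizzes_to_records and the <root> wrapper are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def quizzes_to_records(xml_text):
    """Convert a run of <quiz> blocks into a list of dicts.

    Assumption: xml_text has no single root element, so it is wrapped
    in a synthetic <root> element to make it well-formed XML.
    """
    root = ET.fromstring("<root>" + xml_text + "</root>")
    records = []
    for quiz in root.iter("quiz"):
        record = {}
        answer_no = 1
        for child in quiz:
            # Empty elements have text None; store them as empty strings.
            value = (child.text or "").strip()
            if child.tag == "que":
                record["text"] = value
            else:
                # <ca> and <ia> are numbered in document order.
                record["answer%d" % answer_no] = value
                answer_no += 1
        records.append(record)
    return records


sample = """
<quiz>
<que>The question her</que>
<ca>text</ca>
<ia>text</ia>
</quiz>
<quiz>
<que>Question number 1</que>
<ca>blabla</ca>
<ia>stuff</ia>
</quiz>
"""

print(json.dumps(quizzes_to_records(sample), indent=1, ensure_ascii=False))
```

Because the correct answer (<ca>) comes first in each block, it ends up as "answer1" here; if the position must not reveal the correct answer, the answers would need to be shuffled or tagged separately.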






























  • I love this kind of code. Read in XML push out JSON. Wonderful.

    – roaima
    Mar 7 at 10:30











  • That only works if the input has just one <quiz>

    – Stéphane Chazelas
    Mar 7 at 10:36


















0














In fact, you could get away without any Python programming here, using just two Unix utilities:




  1. jtm - lossless XML <-> JSON conversion


  2. jtc - JSON manipulation

Thus, assuming your XML is in file.xml, jtm would convert it to the following JSON:



bash $ jtm file.xml
[
   {
      "quiz": [
         {
            "que": "The question her"
         },
         {
            "ca": "text"
         },
         {
            "ia": "text"
         },
         {
            "ia": "text"
         },
         {
            "ia": "text"
         }
      ]
   }
]
bash $


Then, by applying a series of JSON transformations, you can arrive at the desired result:



bash $ jtm file.xml | jtc -w'<quiz>l:[1:][-2]' -ei echo '"answer[-]"': ; -i'<quiz>l:[1:]' | jtc -w'<quiz>l:[-1][:][0]' -w'<quiz>l:[-1][:]' -s | jtc -w'<quiz>l:' -w'<quiz>l:[0]' -s | jtc -w'<quiz>l: <>v' -u'"text"'
[
   {
      "answer1": "text",
      "answer2": "text",
      "answer3": "text",
      "answer4": "text",
      "text": "The question her"
   }
]
bash $


However, because of the shell scripting involved (the echo command), it will be slower than the Python solution: for 5000 questions, I'd expect it to run for around a minute. (In a future version of jtc I plan to allow interpolation even in statically specified JSON, so that templating needs no external shell scripting; the operations will then be much faster.)



If you're curious about the jtc syntax, you can find a user guide here: https://github.com/ldn-softdev/jtc/blob/master/User%20Guide.md
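For comparison, the same reshaping can be sketched in Python on the intermediate JSON. This is not how jtc works internally; it is only an illustration. It assumes the converted JSON is a list of objects whose "quiz" key maps to a list of single-key objects ("que", "ca", "ia"), as in the jtm output above, and reshape is a hypothetical helper name.

```python
import json


def reshape(converted):
    """Flatten [{"quiz": [{"que": ...}, {"ca": ...}, ...]}, ...] records."""
    result = []
    for wrapper in converted:
        record = {}
        answer_no = 1
        for item in wrapper["quiz"]:
            # Each item is assumed to be a single-key object such as {"ia": "text"}.
            for tag, value in item.items():
                if tag == "que":
                    record["text"] = value
                else:
                    record["answer%d" % answer_no] = value
                    answer_no += 1
        result.append(record)
    return result


converted = json.loads("""
[{"quiz": [{"que": "The question her"},
           {"ca": "text"}, {"ia": "text"}, {"ia": "text"}, {"ia": "text"}]}]
""")

print(json.dumps(reshape(converted), indent=1, sort_keys=True))
```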






































    0














Thank you, dgan, but your code prints the output to the screen rather than to a JSON file, and it does not handle UTF-8 encoding, so I changed it:



#!/usr/bin/python3
import json
import sys
import xml.etree.ElementTree


def read_file(filename):
    root = xml.etree.ElementTree.parse(filename).getroot()
    return root


# assume we have a list of <quiz>, contained in some other element
def parse_quiz(quiz_element, out):
    i = 1  # running number for the answer keys
    tmp = {}
    for child in quiz_element:

        name, value = child.tag, child.text
        if name == 'que':
            name = 'question'
        else:
            name = 'answer%s' % i
            i += 1
        tmp[name] = value
    out.append(tmp)


def parse_root(root_element, out):
    for child in root_element:
        if child.tag == 'quiz':
            parse_quiz(child, out)


def convert_xml_to_json(filename):
    root = read_file(filename)
    out = []
    parse_root(root, out)
    # Open the file as UTF-8 text so non-ASCII characters are written as-is.
    with open('data.json', 'w', encoding='utf-8') as outfile:
        json.dump(out, outfile, sort_keys=True, ensure_ascii=False)


if __name__ == '__main__':
    if len(sys.argv) > 1:
        convert_xml_to_json(sys.argv[1])
    else:
        print("Usage: script <filename_with_xml>")
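A short sketch of why the encoding arguments matter, assuming Python 3. The sample record and the temporary file are made up for illustration. With ensure_ascii=False, non-ASCII characters are written unescaped, so the output file should be opened with an explicit UTF-8 encoding; double quotes and backslashes are escaped by the json module either way.

```python
import json
import os
import tempfile

# Hypothetical record with a non-ASCII character, quotes and a backslash.
record = {"text": 'What is a "café"?', "answer1": "back\\slash"}

escaped = json.dumps(record)                      # non-ASCII as \uXXXX escapes
literal = json.dumps(record, ensure_ascii=False)  # non-ASCII kept as-is

fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)
try:
    # Explicit encoding, so the result does not depend on the locale.
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(record, outfile, ensure_ascii=False, sort_keys=True)
    with open(path, encoding="utf-8") as infile:
        round_tripped = json.load(infile)
finally:
    os.remove(path)

print(escaped)
print(literal)
print(round_tripped == record)
```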




























































edited Mar 7 at 11:03

answered Mar 7 at 10:25
dgan

































          edited Mar 7 at 15:20

























          answered Mar 7 at 14:10









          Dmitry

          812





















              0














              Thank you, dgan, but your code prints the output to the screen rather than to a JSON file and does not support UTF-8 encoding, so I changed it:



              #!/usr/bin/python3
              import json
              import sys
              import xml.etree.ElementTree


              def read_file(filename):
                  root = xml.etree.ElementTree.parse(filename).getroot()
                  return root


              # assume we have a list of <quiz>, contained in some other element
              def parse_quiz(quiz_element, out):
                  i = 1
                  tmp = {}
                  for child in quiz_element:
                      name, value = child.tag, child.text
                      if name == 'que':
                          name = 'question'
                      else:
                          # every other child (<ca>, <ia>) becomes a numbered answer
                          name = 'answer%s' % i
                          i += 1
                      tmp[name] = value
                  out.append(tmp)


              def parse_root(root_element, out):
                  for child in root_element:
                      if child.tag == 'quiz':
                          parse_quiz(child, out)


              def convert_xml_to_json(filename):
                  root = read_file(filename)
                  out = []
                  parse_root(root, out)
                  # open in text mode as UTF-8; json.dump writes str objects in Python 3
                  with open('data.json', 'w', encoding='utf-8') as outfile:
                      json.dump(out, outfile, sort_keys=True, ensure_ascii=False)


              if __name__ == '__main__':
                  if len(sys.argv) > 1:
                      convert_xml_to_json(sys.argv[1])
                  else:
                      print("Usage: script <filename_with_xml>")
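The script above assumes the `<quiz>` elements sit inside one enclosing element. If the input file is just a run of `<quiz>` blocks with no common root, an XML parser will reject it. One possible workaround, sketched here under that assumption, is to wrap the text in a synthetic root before parsing. The helper name `parse_fragments` and the wrapper tag `quizzes` are invented for this example.

```python
import xml.etree.ElementTree as ET


def parse_fragments(text):
    """Parse a string holding many top-level <quiz> blocks and no single root."""
    # Drop an XML declaration if present; it may only appear at the very start
    # of a document, so it cannot sit inside the wrapper element.
    text = text.lstrip()
    if text.startswith("<?xml"):
        text = text[text.index("?>") + 2:]
    # "quizzes" is an arbitrary wrapper name chosen for this sketch.
    return ET.fromstring("<quizzes>" + text + "</quizzes>")


sample = """<?xml version="1.0" encoding="UTF-8"?>
<quiz><que>Q1</que><ca>a</ca><ia>b</ia></quiz>
<quiz><que>Q2</que><ca>c</ca><ia>d</ia></quiz>
"""

root = parse_fragments(sample)
print([quiz.find("que").text for quiz in root])
```

The element returned by `parse_fragments` can be passed to `parse_root` from the script in place of the result of `read_file`, after reading the file contents with `open(filename, encoding='utf-8').read()`.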






                  edited Mar 7 at 16:11

























                  answered Mar 7 at 15:08









                  silver

                  133


























