Can I write a console program that works with multiple character encodings? [duplicate]

This question already has an answer here:



  • How to determine the character encoding that a terminal uses in a C/C++ program?

    3 answers



I am writing a console program in C.



I expect the Terminal that my program is running in to have its character encoding set to UTF-8. This means that I am sending UTF-8 encoded strings to the Terminal, and expecting to receive UTF-8 encoded strings from the Terminal.



But if the Terminal is set to a character encoding other than UTF-8 while my program is running, my program will stop working as expected.



So is there a way to know what character encoding the Terminal is set to from within my program (so that I can change my program behavior accordingly)? And even if there is such a way, should I even bother making my program work with multiple character encodings, or is it enough to only make it work with UTF-8?







marked as duplicate by JdeBP, meuh, Thomas, Jesse_b, Jeff Schaller Jun 6 at 16:25


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

edited Jun 6 at 10:32 by Jeff Schaller
asked Jun 6 at 10:12 by user294241




          1 Answer

          UTF-8 has several pitfalls and for this reason is not the typical encoding in central Europe.



          Writing programs that assume UTF-8 is bad practice as you may not be able to even know where a "character" ends in the byte stream.



          A decent program calls:



          setlocale(LC_ALL, "")


          at startup and later uses functions like:



          mbtowc(&wc, input, amt)


          to convert multibyte input read from stdin or files.



          It then processes the data as wide characters and converts it back to multibyte data via:



          wctomb(output, wc)


          then the output is printed to e.g. stdout.
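For illustration, the steps above could be sketched in C roughly as follows. This is only a sketch under some assumptions: it uses the restartable variants `mbrtowc` and `wcrtomb` in place of `mbtowc` and `wctomb`, it adds POSIX `nl_langinfo(CODESET)` to report the encoding that the locale declares (which is what the environment claims, not something queried from the terminal itself), and the helper names `locale_codeset` and `upper_mb` are made up for this example. Uppercasing simply stands in for whatever processing the program does on the wide characters.

```c
#include <assert.h>
#include <langinfo.h>
#include <limits.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: adopt the user's locale (from LC_ALL, LC_CTYPE, LANG)
 * and return the codeset name it declares, e.g. "UTF-8" or "ISO-8859-1".
 * If setlocale() fails, the previous locale (initially "C") stays active.
 * The returned string may be overwritten by later locale calls. */
const char *locale_codeset(void)
{
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: decode src (multibyte, current locale) one character
 * at a time, uppercase each wide character, and re-encode it into dst.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 on invalid input or if dst is too small. */
size_t upper_mb(const char *src, char *dst, size_t dstsize)
{
    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return (size_t)-1;

    mbstate_t in, out;              /* separate shift states for each direction */
    memset(&in, 0, sizeof in);
    memset(&out, 0, sizeof out);

    size_t left = strlen(src);
    size_t used = 0;

    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, src, left, &in);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;      /* invalid or truncated sequence */
        if (n == 0)
            break;                  /* decoded a NUL character */
        src += n;
        left -= n;

        char buf[MB_LEN_MAX];
        size_t m = wcrtomb(buf, (wchar_t)towupper((wint_t)wc), &out);
        if (m == (size_t)-1 || used + m >= dstsize)
            return (size_t)-1;      /* not encodable, or no room left for NUL */
        memcpy(dst + used, buf, m);
        used += m;
    }
    dst[used] = '\0';
    return used;
}
```

Called as `upper_mb(line, out, sizeof out)` after `locale_codeset()`, the same code should handle a UTF-8 or an ISO-8859-1 locale without knowing which one is active. A program that prefers to insist on UTF-8 could instead compare the returned codeset name against "UTF-8" and warn the user otherwise.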






          • UTF-8 is the only sensible external encoding for Unicode text. Your answer does not consider how to choose between different encodings, and thus does not answer the question at all.
            – Johan Myréen
            Jun 6 at 16:58










          • You are mistaken. Unicode causes problems that people did not expect. Many people for this reason use ISO-8859-1. The question does not ask how to set up a different encoding, just how to deal with different encodings. So my answer is a good starter for further reading, e.g. by using the man program on the mentioned interfaces.
            – schily
            Jun 6 at 17:03

          answered Jun 6 at 11:04 by schily