Can I write a console program that works with multiple character encodings? [duplicate]
This question already has an answer here:
How to determine the character encoding that a terminal uses in a C/C++ program?
3 answers
I am writing a console program in C.
I expect the Terminal that my program is running in to have its character encoding set to UTF-8. This means that I am sending UTF-8 encoded strings to the Terminal, and expecting to receive UTF-8 encoded strings from the Terminal.
But if the Terminal is set to another character encoding (other than UTF-8) while my program is running, my program will stop working as expected.
So is there a way to know what character encoding the Terminal is set to from within my program (so that I can change my program behavior accordingly)? And even if there is such a way, should I even bother making my program work with multiple character encodings, or is it enough to only make it work with UTF-8?
linux c unicode character-encoding
marked as duplicate by JdeBP, meuh, Thomas, Jesse_b, Jeff Schaller Jun 6 at 16:25
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
edited Jun 6 at 10:32 by Jeff Schaller
asked Jun 6 at 10:12 by user294241
1 Answer
UTF-8 has several pitfalls, and for this reason it is not the typical encoding in central Europe.
Writing programs that assume UTF-8 is bad practice, because you may not even be able to tell where a "character" ends in the byte stream.
A decent program calls:
    setlocale(LC_ALL, "");
at startup and later uses functions like:
    mbtowc(&wc, input, amt);
to convert multibyte input read from stdin or files.
It then processes the data as wide characters and converts it back to multibyte data via:
    wctomb(output, wc);
The output is then printed to, e.g., stdout.
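As a minimal sketch of the sequence described above, the following C code sets the locale from the environment, decodes a multibyte string with mbtowc, and re-encodes each wide character with wctomb. The helper names locale_codeset and roundtrip are hypothetical and chosen only for illustration. The call to nl_langinfo(CODESET) is an addition that the answer does not mention; it is a POSIX interface that reports the encoding name of the current locale, which is what the terminal is expected to use, not something queried from the terminal itself.

```c
#include <assert.h>   /* assert */
#include <langinfo.h> /* nl_langinfo, CODESET (POSIX, not ISO C) */
#include <limits.h>   /* MB_LEN_MAX */
#include <locale.h>   /* setlocale */
#include <stdlib.h>   /* mbtowc, wctomb */
#include <string.h>   /* strlen, memcpy */
#include <wchar.h>    /* wchar_t */

/* Hypothetical helper: adopt the locale from the environment and
 * return the name of its character encoding, e.g. "UTF-8".
 * If setlocale fails, the program stays in the "C" locale. */
const char *locale_codeset(void)
{
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: decode `in` with the current locale's encoding,
 * then re-encode every wide character into `out`.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 on invalid input or if `out` is too small. */
long roundtrip(const char *in, char *out, size_t outsz)
{
    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return -1;

    size_t left = strlen(in);
    size_t used = 0;

    /* Reset any shift state kept by the conversion functions. */
    (void)mbtowc(NULL, NULL, 0);
    (void)wctomb(NULL, 0);

    while (left > 0) {
        wchar_t wc;
        int n = mbtowc(&wc, in, left);   /* bytes consumed by one character */
        if (n <= 0)
            return -1;                   /* invalid or incomplete sequence */

        /* A real program would process `wc` here. */

        char buf[MB_LEN_MAX];
        int m = wctomb(buf, wc);         /* bytes produced for this character */
        if (m < 0 || used + (size_t)m >= outsz)
            return -1;                   /* not encodable, or no room left */

        memcpy(out + used, buf, (size_t)m);
        used += (size_t)m;
        in += n;
        left -= (size_t)n;
    }
    out[used] = '\0';
    return (long)used;
}
```

A caller would typically invoke locale_codeset() once at startup and then pass each input line to roundtrip() (or to a variant that transforms wc between the two conversions) before writing the result to stdout.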
UTF-8 is the only sensible external encoding for Unicode text. Your answer does not consider how to choose between different encodings, and thus does not answer the question at all.
– Johan Myréen
Jun 6 at 16:58
You are mistaken. Unicode causes problems that people did not expect. Many people for this reason use ISO-8859-1. The question does not ask how to set up a different encoding, just how to deal with different encodings. So my answer is a good starter for further reading, e.g. by using the man program on the mentioned interfaces.
– schily
Jun 6 at 17:03
answered Jun 6 at 11:04 by schily