Can I write a console program that works with multiple character encodings? [duplicate]

This question already has an answer here:

How to determine the character encoding that a terminal uses in a C/C++ program?

3 answers

I am writing a console program in C.

I expect the Terminal that my program is running in to have its character encoding set to UTF-8. This means that I am sending UTF-8 encoded strings to the Terminal, and expecting to receive UTF-8 encoded strings from the Terminal.

But if the Terminal was set to another character encoding (other than UTF-8) while my program is running, then my program will stop working as expected.

So is there a way to know what character encoding the Terminal is set to from within my program (so that I can change my program behavior accordingly)? And even if there is such a way, should I even bother making my program work with multiple character encodings, or is it enough to only make it work with UTF-8?
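For reference, a program cannot usually query the terminal emulator itself. What it can read is the character encoding of the user's locale (taken from LC_ALL, LC_CTYPE or LANG), which by convention is set to match the terminal. The following is a minimal sketch of that approach, assuming a POSIX system; detect_codeset and codeset_is_utf8 are illustrative names, not standard functions, and the spellings accepted for UTF-8 are an assumption, since codeset names vary between systems.

```c
#include <langinfo.h>
#include <locale.h>
#include <strings.h>

/* Illustrative helper: adopt the user's locale from the environment and
 * return the name of its character encoding, e.g. "UTF-8" or "ISO-8859-1".
 * If the environment names an unavailable locale, setlocale() fails and
 * the program stays in the default "C" locale. */
const char *detect_codeset(void)
{
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}

/* Illustrative helper: nonzero if a codeset name denotes UTF-8.
 * Codeset spellings differ between systems, so the comparison ignores
 * case and also accepts "utf8" (an assumption, not an exhaustive list). */
int codeset_is_utf8(const char *codeset)
{
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "utf8") == 0;
}
```

Calling codeset_is_utf8(detect_codeset()) once at startup would let the program either adapt its behavior or print a clear message when the locale is not UTF-8. It reports the locale, not the terminal, so the two can still disagree if the user has misconfigured one of them.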

edited Jun 6 at 10:32

Jeff Schaller


asked Jun 6 at 10:12

user294241

marked as duplicate by JdeBP, meuh, Thomas, Jesse_b, Jeff Schaller Jun 6 at 16:25

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

1 Answer

UTF-8 has several pitfalls and for this reason is not the typical encoding in central Europe.

Writing programs that assume UTF-8 is bad practice as you may not be able to even know where a "character" ends in the byte stream.

A decent program calls:

setlocale(LC_ALL, "")

at startup and later uses functions like:

mbtowc(&wc, input, amt)

to convert multibyte input read from stdin or files.

It then processes the data as wide characters and converts it back to multibyte data via:

wctomb(output, wc)

then the output is printed to e.g. stdout.
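The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: mb_toupper is a hypothetical helper, upper-casing stands in for whatever processing the program does on wide characters, and the caller is assumed to have called setlocale(LC_ALL, "") already so that mbtowc and wctomb follow the user's encoding.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: upper-case the multibyte string `in` into `out`
 * (capacity `cap` bytes, including the terminating NUL), going through
 * wide characters in the current locale's encoding.
 * Returns the number of bytes written, not counting the NUL, or
 * (size_t)-1 on an invalid sequence or when `out` is too small. */
size_t mb_toupper(const char *in, char *out, size_t cap)
{
    size_t left = strlen(in);
    size_t used = 0;

    if (cap == 0)
        return (size_t)-1;

    /* Reset the internal shift state of both conversion functions. */
    mbtowc(NULL, NULL, 0);
    wctomb(NULL, 0);

    while (left > 0) {
        wchar_t wc;
        char buf[MB_LEN_MAX];
        int n, m;

        /* Decode one character; n is the number of bytes it occupied. */
        n = mbtowc(&wc, in, left);
        if (n <= 0)
            return (size_t)-1;      /* invalid or incomplete sequence */
        in += n;
        left -= (size_t)n;

        /* Process as a wide character, then encode it again. */
        m = wctomb(buf, (wchar_t)towupper((wint_t)wc));
        if (m < 0 || used + (size_t)m >= cap)
            return (size_t)-1;      /* not encodable or no room left */
        memcpy(out + used, buf, (size_t)m);
        used += (size_t)m;
    }

    out[used] = '\0';
    return used;
}
```

In a real program the restartable variants mbrtowc and wcrtomb, which take an explicit mbstate_t, are often preferred because they do not depend on hidden static state.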

answered Jun 6 at 11:04

schily


UTF-8 is the only sensible external encoding for Unicode text. Your answer does not consider how to choose between different encodings, and thus does not answer the question at all.
– Johan Myréen
Jun 6 at 16:58

You are mistaken. Unicode causes problems that people did not expect, and for this reason many people use ISO-8859-1. The question does not ask how to set up a different encoding, just how to deal with different encodings. So my answer is a good starting point for further reading, e.g. by using the man program on the mentioned interfaces.
– schily
Jun 6 at 17:03
