file command gives incorrect encoding type
I have a CSV file. When I run 'file -i filename', it reports the encoding as us-ascii. But when I run 'cat filename | csvcut -t -e us-ascii', I get an error:
"Your file is not "us-ascii" encoded. Please specify the correct encoding with the -e flag or with the PYTHONIOENCODING environment variable"
csvkit documentation can be found here.
I also found that the file contains bytes such as 0xd1, which caused some issues. So how do I find the correct encoding of this file? Ideally I would like to convert it to UTF-8. What should I do?
csv character-encoding unicode ascii
For example include a part of your txt file in your question.
– Ipor Sircer
Sep 10 at 19:04
EBCDIC? ibm.com/support/knowledgecenter/en/SSZJPZ_11.5.0/…
– steve
Sep 10 at 19:22
Hi @IporSircer, to prepare the example file, I took a small part of the file and made sure it contained the row with 0xd1 that caused trouble. Then I re-ran the 'file -i' command, and this time it says the encoding is iso-8859-1. That seems to be the correct encoding for the original file. So I guess 'file -i' only looks at a portion of a file before drawing a conclusion?
– user3768495
Sep 10 at 19:32
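That guess matches my understanding: file inspects only a limited amount of data from the start of the file, so a file whose first non-ASCII byte appears late can still be reported as us-ascii. A minimal sketch in Python (standard library only) that scans the whole file instead and reports where the non-ASCII bytes are. The function name find_non_ascii and its parameters are made up for illustration:

```python
import sys


def find_non_ascii(path, max_hits=10, chunk_size=1 << 16):
    """Return up to max_hits (offset, byte_value) pairs for bytes >= 0x80."""
    hits = []
    offset = 0  # absolute position of the current chunk within the file
    with open(path, "rb") as fh:
        while len(hits) < max_hits:
            chunk = fh.read(chunk_size)
            if not chunk:
                break  # end of file
            for i, b in enumerate(chunk):
                if b >= 0x80:  # anything outside the 7-bit ASCII range
                    hits.append((offset + i, b))
                    if len(hits) == max_hits:
                        break
            offset += len(chunk)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    for pos, value in find_non_ascii(sys.argv[1]):
        print(f"offset {pos}: 0x{value:02x}")
```

An empty result means the file really is pure ASCII. Any hit, such as 0xd1, means us-ascii cannot be right, whatever the detection tool reported.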
Bottom line? Determining the correct encoding of a file is actually very, very difficult, unfortunately. If Python is an option, you can try chardet, which gives great results in my opinion. FYI, even chardet will only give you a result with a confidence value (which is pretty smart!). pypi.org/project/chardet
– pi0tr
Sep 10 at 20:45
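If you already have a shortlist of plausible encodings and want to avoid a third-party dependency, a rough sketch is to try strict decoders in order and re-encode the text as UTF-8. The function name convert_to_utf8 and the candidate list are assumptions for illustration, not something csvkit or chardet provides. Note that iso-8859-1 maps every byte to a character, so it never fails and has to come last. Success here means "decodable", not "correct", so check the output by eye (in iso-8859-1, 0xd1 is Ñ).

```python
def convert_to_utf8(src, dst, candidates=("ascii", "utf-8", "iso-8859-1")):
    """Decode src with the first candidate that works, write dst as UTF-8.

    Returns the name of the encoding that was used.
    """
    with open(src, "rb") as fh:
        raw = fh.read()

    for enc in candidates:
        try:
            text = raw.decode(enc)  # strict: raises on any invalid byte
        except UnicodeDecodeError:
            continue  # not this one, try the next candidate
        # newline="" keeps the original line endings untouched
        with open(dst, "w", encoding="utf-8", newline="") as out:
            out.write(text)
        return enc

    raise ValueError("none of the candidate encodings could decode the file")
```

Calling convert_to_utf8("filename", "filename.utf8.csv") returns the encoding it settled on, which you can then cross-check against what file -i or chardet reports for the same data.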
edited Sep 15 at 15:59 by Rui F Ribeiro
asked Sep 10 at 19:02 by user3768495