CONVMV and Cyrillic filenames

I am trying to convert the filenames in a Russian zip file to UTF-8 using convmv.
Original filename: "æ óá¡¿½∞¡δ¼ áα«¼áΓ«¼.jpg" (æ óá¡¿½∞¡δ¼ áα«¼áΓ«¼.jpg with slashes)

This analyzer (https://2cyr.com/decode/?lang=en) detected the source encoding as CP866, displayed as CP437, and successfully decodes the name to the desired С ванильным ароматом.jpg.

My question is: how can I set up convmv to decode it properly?
With convmv -f cp866 -t utf-8 filename, I get "already UTF-8"; in --nosmart mode I get gibberish.

edited Sep 9 at 12:10 by Jeff Schaller

asked Sep 8 at 17:20 by Adam Plšek

linux character-encoding

1 Answer (accepted)

A single pass of convmv can only undo one layer of mis-encoding at a time, and your file has two. The name was originally CP866. At some point it was converted to UTF-8, but whatever did that conversion assumed the source was CP437, so it converted the name wrongly. To fix this, you need to run convmv twice:

convmv -f utf-8 -t cp437 --notest 'æ óá¡¿½∞¡δ¼ áα«¼áΓ«¼.jpg'
convmv -f cp866 -t utf-8 --notest $'\x91 \xa2\xa0\xad\xa8\xab\xec\xad\xeb\xac \xa0\xe0\xae\xac\xa0\xe2\xae\xac.jpg'

Note that I had to escape the name in the second command, because after the first pass it consists of raw CP866 bytes that are not valid UTF-8. If you're running this in bulk, or don't want to deal with the escaping, you can use a glob such as *.jpg, or put all of the affected files in their own directory and use convmv's -r option.
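If you want to check the conversion chain before letting convmv rename anything, the two passes can be reproduced on plain strings in Python. This is only a minimal sketch and not part of convmv: fix_name and plan_renames are hypothetical helper names, and it assumes every affected name went through exactly the mix-up described above (CP866 bytes read as CP437 and stored as UTF-8).

```python
def fix_name(garbled: str) -> str:
    """Undo a CP866 name that was decoded as CP437 and stored as UTF-8."""
    raw = garbled.encode("cp437")  # pass 1: recover the original bytes
    return raw.decode("cp866")     # pass 2: read those bytes as CP866


def plan_renames(names):
    """Return (old, new) pairs without touching the filesystem (a dry run)."""
    plan = []
    for old in names:
        try:
            new = fix_name(old)
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not representable in CP437, so assume the name is not affected.
            continue
        if new != old:  # pure-ASCII names come back unchanged
            plan.append((old, new))
    return plan


garbled = "æ óá¡¿½∞¡δ¼ áα«¼áΓ«¼.jpg"

# The intermediate bytes are what the second convmv command has to escape.
print(garbled.encode("cp437"))

# Compare instead of printing Cyrillic, so the output stays ASCII-only.
print(fix_name(garbled) == "С ванильным ароматом.jpg")
```

The first line printed is the byte string b'\x91 \xa2\xa0\xad\xa8\xab\xec\xad\xeb\xac \xa0\xe0\xae\xac\xa0\xe2\xae\xac.jpg', and the second is True. plan_renames skips names that cannot be encoded as CP437, so names that are already correct Cyrillic are left alone.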

edited Sep 8 at 19:11

answered Sep 8 at 19:05 by Joseph Sible

Now it's clear, thank you.
– Adam Plšek, Sep 8 at 19:58
