Recursively converting Windows files to Unix files


I have a PHP application which is located on Linux, with multiple directories (and sub-directories) and many PHP, JS, HTML, CSS, etc. files. Many of the files have Windows EOL control characters, and I am also concerned that some might not be UTF-8 encoded but maybe ISO-8859-1, Windows-1252, etc. My desire is to convert all files to UTF-8 with LF only.

Looks like I might have a couple steps.

The dos2unix man page provides this solution:

find . -name *.txt |xargs dos2unix

https://stackoverflow.com/a/11929475 provides this solution:

find . -type f -print0 | xargs -0 dos2unix

https://stackoverflow.com/a/7068241 provides this solution:

find ./ -type f -exec dos2unix {} \;

I recognize the first will only convert txt files, which isn't what I want, but I can easily change it to target all files using -type f. That being said, is one solution "better" than the others? If so, why? Is it possible to tell which files will be changed without changing them? When I finally change them, I don't want the date to change, and I intend to use dos2unix's --keepdate flag. Should any other options be used?

Next, I will need to deal with encoding. https://stackoverflow.com/a/805474/1032531 recommends enca (or its sister command enconv) and https://stackoverflow.com/a/64889/1032531 recommends iconv. It also seems like file might be applicable. Again, which one (or maybe something else altogether) should be used? I installed enca, and when executing enca --list languages it lists several languages but not English (maybe choose "none"?), so I question its applicability. iconv was already installed; however, it does not have a man page (at least man iconv doesn't result in one). How can this be used to recursively check and convert encoding?

Please confirm/correct my proposed solution or provide a complete solution.

edited Mar 16 at 0:18

asked Mar 15 at 13:04

user1032531


@K7AAY I thought it was pretty clear; however, I added "files" to the sentence "My desire is to convert all files to UTF-8 with LF only". The example in the dos2unix man page converts only txt files and not all files.

– user1032531
Mar 16 at 0:21

files unicode recursive newlines


1 Answer
There are quite a few questions here rolled into one.

Firstly, when using find I would always use -exec instead of xargs. As a general rule it's better to do things in as few commands as possible. Also, the first two methods write all the file names out to a text stream for xargs to re-interpret back into file names. It's a needless step which only adds an (admittedly small) opportunity to fail.

dos2unix will accept multiple file names so I would use:

find . -type f -exec dos2unix --keepdate {} +

This will stack up long lists of files and then kick off dos2unix on a whole bunch of them at once.
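To see the batching without modifying anything, a stand-in command can take the place of dos2unix. The sketch below is only an illustration: the function names are made up, and the small sh -c wrapper prints one line per invocation so the invocations can be counted.

```shell
# Count how many times find starts the command in each -exec form.
# "sh -c ... sh" is only a stand-in for dos2unix, so no file is changed.

count_invocations_batched() {
    # {} + : find appends as many file names as fit on one command line
    find "$1" -type f -exec sh -c 'echo "$# files"' sh {} + | wc -l
}

count_invocations_single() {
    # {} \; : find starts the command once for every file
    find "$1" -type f -exec sh -c 'echo "$# files"' sh {} \; | wc -l
}
```

For a small directory tree the batched form should report a single invocation, while the single form reports one invocation per file.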

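To find out which files will be touched, just drop the -exec clause. Note that this lists every file handed to dos2unix, not only those that actually contain Windows line endings: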

find . -type f
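If the goal is to see only the files that actually contain a carriage return, one possible approach is to let grep do the looking. This is a sketch rather than part of the original answer: the function name is hypothetical, and the -I option (skip binary files) is a GNU/BSD grep extension that may be missing elsewhere.

```shell
# List regular files under a directory that contain at least one CR byte.
# Nothing is modified; the files are only read.
list_crlf_files() {
    cr=$(printf '\r')   # a literal carriage return, portable across sh
    # LC_ALL=C stops grep from treating non-UTF-8 text as binary.
    # -l prints only the names of matching files; -I skips binary files.
    LC_ALL=C find "$1" -type f -exec grep -lI "$cr" {} +
}
```

Running list_crlf_files . before the conversion gives a list that can be compared against afterwards. Recent dos2unix releases also document an --info (-i) option that reports line-break counts without converting; whether the installed version has it is an assumption to check against the local man page.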

Encoding changes are far more problematic. Please be aware that there is no way to reliably determine the current encoding of any text file. It can sometimes be guessed but that is never 100% reliable. So you can only batch process encoding if you are sure all the files are currently the same encoding.
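What can be checked mechanically is the narrower question of whether a file is valid UTF-8 at all, because iconv exits with a non-zero status when its input is not valid in the encoding named by -f. The sketch below (the function name is hypothetical) lists files that fail that check. It says nothing about which encoding those files really use, and binary files such as images will typically show up in the list as well.

```shell
# Print the names of files under a directory that are NOT valid UTF-8.
# Pure ASCII files pass, since ASCII is a subset of UTF-8.
list_non_utf8_files() {
    find "$1" -type f -exec sh -c '
        for f do
            # Decode as UTF-8 and discard the output; only the exit status matters.
            iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}
```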

I would recommend using iconv. It really is the default tool for this job. You can find a man page for it here:

https://linux.die.net/man/1/iconv

There's a working example of how to use iconv with find here:

https://stackoverflow.com/questions/4544669/batch-convert-latin-1-files-to-utf-8-using-iconv
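As a rough sketch of what such a batch conversion could look like (not the code from the linked question), the function below converts every file matching a name pattern from one known encoding to UTF-8. The function name and its three parameters are made up for illustration. It assumes all matching files really are in the given source encoding; running it on files that are already UTF-8 would double-encode them, so it is worth trying on a copy first.

```shell
# Convert all files matching a name pattern from a known encoding to UTF-8.
# Usage: convert_tree_to_utf8 SOURCE_ENCODING DIRECTORY NAME_PATTERN
convert_tree_to_utf8() {
    from=$1 dir=$2 pattern=$3
    find "$dir" -type f -name "$pattern" -exec sh -c '
        from=$1; shift
        for f do
            tmp=$(mktemp) || exit 1
            # Write to a temporary file first; replace the original
            # only if iconv succeeded.
            if iconv -f "$from" -t UTF-8 "$f" > "$tmp"; then
                touch -r "$f" "$tmp"    # remember the original timestamps
                cat "$tmp" > "$f"       # overwrite in place, keeping owner and mode
                touch -r "$tmp" "$f"    # restore the original timestamps
            fi
            rm -f "$tmp"
        done
    ' sh "$from" {} +
}
```

A call such as convert_tree_to_utf8 ISO-8859-1 . '*.php' limits the conversion to PHP files. The name pattern matters: ISO-8859-1 accepts any byte sequence, so pointing this at images or other binary files would silently corrupt them.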

answered Mar 16 at 1:54

Philip Couling

