(Possible) inconsistent behavior of grep and less

I have a UTF-8 file containing some Turkish text. (My system is macOS.)



$ file -I foo.merge
foo.merge: text/plain; charset=utf-8


When I use grep to search for a word containing Turkish-specific characters, there is no problem:



$ grep 'Emiroğlu' foo.merge
EMİROĞLU Emiroğlu+Noun+A3sg+Pnon+Nom Emiroğlu+Noun+Prop+Noun+A3sg+P3sg+Nom Emiroğlu+Noun+Prop+Noun+A3sg+Pnon+Nom NOTFOUND


I can also view the file with the less command without any problem.

However, if I do the following, the output is not displayed properly:



$ grep 'Emir' foo.merge | less
EMİROĞLU ESC[1;35;40mESC[KEmirESC[mESC[Koğlu+Noun+A3sg+Pnon+Nom ESC[1;35;40mESC[KEmirESC[mESC[Koğlu+Noun+Prop+Noun+A3sg+P3sg+Nom ESC[1;35;40mESC[KEmirESC[mESC[Koğlu+Noun+Prop+Noun+A3sg+Pnon+Nom NOTFOUND


The following doesn't work either:

$ grep 'Emir' foo.merge > foo2.out
$ less foo2.out


What could be the problem? Here is some additional information:



$ locale
LANG="en_US.utf-8"
LC_COLLATE="en_US.utf-8"
LC_CTYPE="en_US.utf-8"
LC_MESSAGES="en_US.utf-8"
LC_MONETARY="en_US.utf-8"
LC_NUMERIC="en_US.utf-8"
LC_TIME="en_US.utf-8"
LC_ALL="en_US.utf-8"









  • I guess (s)he meant that the Turkish characters do not appear as expected in the second example, and that this is the problem, @Kusalananda.

    – zwlayer
    Feb 3 at 10:43

  • @zwlayer Oh, I saw EMİROĞLU and thought that was the expected output.

    – Kusalananda
    Feb 3 at 10:46

  • @Kusalananda Yup, EMİROĞLU is part of the expected output. But I guess the problem occurs in the second part of that output, where (s)he gets ESC[1;35;40mESC[KEmirESC[mESC[Koğlu instead of Emiroğlu.

    – zwlayer
    Feb 3 at 10:50

  • Your grep is using terminal control sequences to highlight (color) the matched string, which less does not interpret by default. Look for a grep option to turn off coloring (in GNU grep, --color=never or --color=auto), or use less -R to tell it to handle terminal coloring. If you use these a lot, you may want to make them aliases or functions in your shell profile.

    – dave_thompson_085
    Feb 3 at 11:19
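
The profile setup suggested in the last comment might look roughly like the sketch below. The profile file name and the helper name grepless are assumptions made for illustration; nothing in the thread says which shell or profile the asker uses.

```shell
# Hypothetical lines for a shell profile such as ~/.bashrc or ~/.zshrc.

# Color matches only when grep writes directly to a terminal,
# so pipes and redirected files receive plain text.
alias grep='grep --color=auto'

# Options in the LESS environment variable apply to every less invocation;
# -R lets color escape sequences through to the terminal.
export LESS='-R'

# Hypothetical helper: force colored matches and page them in one step.
grepless() {
    grep --color=always "$@" | less -R
}
```

With --color=auto, output redirected to a file such as foo2.out should contain no escape sequences, so a plain less can display it.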
















files grep unicode character-encoding less






edited Feb 3 at 13:27
Jeff Schaller

asked Feb 3 at 10:34
user334928












1 Answer
The Turkish characters look fine. However, grep has inserted colour codes into the output.

Choices:

  • add option --color=never to grep (to remove colour).

  • add option -R to less (to tell less to interpret ANSI colour escape sequences).
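
As a rough illustration of the two choices, the sketch below builds a one-line sample file in place of foo.merge (the temporary file and the variable names are assumptions) and captures grep's output with and without colour. It starts the pager only when attached to a terminal.

```shell
# Build a one-line sample file (hypothetical) to stand in for foo.merge.
sample=$(mktemp)
printf '%s\n' 'EMİROĞLU Emiroğlu+Noun+A3sg+Pnon+Nom' > "$sample"

# Choice 1: grep emits no colour codes, so any pager shows the text as-is.
plain=$(grep --color=never 'Emir' "$sample")

# Choice 2: grep keeps its colour codes; less -R passes them to the terminal.
colored=$(grep --color=always 'Emir' "$sample")

# Only start the pager when attached to a terminal.
if [ -t 1 ]; then
    printf '%s\n' "$plain" | less
    printf '%s\n' "$colored" | less -R
fi

rm -f "$sample"
```

The first choice is the safer one when the output is redirected to a file, since the file then contains only the matched text.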





answered Feb 3 at 13:37
ctrl-alt-delor


























