How do I convert a troff manpage with UTF-8 characters (Czech, to be precise) to PDF?

I have a troff document (manpage) with UTF-8 characters and I am trying to convert it to a PDF. However, when using the -Tpdf option, the PDF generated does not show the correct characters. This is the command I am using:



groff -k -Tutf-8 -pet -Tpdf -mandoc filename.1 > filename.pdf


Examples of what goes wrong:



"Používá" becomes "Pou¾ívá"

"překladač" becomes "pøekladaè"

"prováděných" becomes "provádìných"

"rozšířením" becomes "roz¹íøením"



How to do it correctly?
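One observation about the examples in the question: the substitutions match what you get when the ISO-8859-2 (Latin-2) bytes for the Czech letters are displayed as ISO-8859-1 (Latin-1). That suggests, though it does not prove, that the text reaches the PDF output as Latin-2 bytes rendered with a Latin-1 font encoding. The sketch below only reproduces the pattern with iconv; `mangle` is a hypothetical helper name used for illustration.

```shell
# Illustration only: encode a word as ISO-8859-2, then misread
# those bytes as ISO-8859-1 and convert back to UTF-8 for display.
mangle() {
    printf '%s' "$1" \
        | iconv -f UTF-8 -t ISO-8859-2 \
        | iconv -f ISO-8859-1 -t UTF-8
}

mangle 'překladač'; echo    # prints: pøekladaè
```

The same call on the other words from the question yields the other garbled forms listed there.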










pdf unicode conversion groff roff






edited Apr 25 '17 at 22:07 by Gilles
asked Apr 25 '17 at 19:37 by magnusi







  • Are you sure you want -Tutf-8? From a glance at the manual, it seems like your two -T options might be conflicting, and -D is used to select a charset.
    – Fox
    Apr 25 '17 at 20:13

  • Ah, sorry, I am not sure, I was just already very desperate. Using -D utf-8 makes the problematic characters disappear completely, resulting in 'can't find special character' warnings.
    – magnusi
    Apr 25 '17 at 21:34

  • What about -K utf-8?
    – Gilles
    Apr 25 '17 at 22:06

  • I'm testing things now. Interestingly, if I use -D utf-8 or -K utf-8 along with -T utf-8, I can see the Czech symbols, but when I change -T utf-8 to -T ps or -T pdf the problem arises. So grotty acts fine with Unicode, but grops and gropdf are having trouble.
    – Fox
    Apr 25 '17 at 22:38

  • See also The Linux Documentation Project stating that groff -Tps is only capable of outputting the Unicode characters that "PostScript supports by itself" (though it doesn't specify which version of groff). Since PDF is just compiled PostScript, this is probably the issue.
    – Fox
    Apr 25 '17 at 23:10












1 Answer
The following convoluted way works:



groff -Kutf8 -Tdvi -mec -ms test.ms > test.dvi
dvipdfm -cz 9 test.dvi
open test.pdf


Via the "[Groff] latin2 polish special characters" thread on lists.gnu.org.
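To apply the same DVI route to the man page from the question, a minimal sketch follows. It assumes groff's DVI device and dvipdfm are installed. `man2pdf` is a hypothetical helper name, and replacing `-ms` with `-man` is an adaptation for man pages that the answer itself did not test.

```shell
# Sketch: convert a UTF-8 man page to PDF via groff's DVI device.
# man2pdf is a hypothetical helper name, not part of groff.
man2pdf() {
    src=$1              # e.g. filename.1
    base=${src%.*}      # strip the section suffix: filename
    # -Kutf8 : declare the input encoding as UTF-8
    # -Tdvi  : use the DVI output device
    # -mec   : load the EC font macros, as in the answer above
    # -man   : man-page macros (assumption: replaces the answer's -ms)
    groff -Kutf8 -Tdvi -mec -man "$src" > "$base.dvi" || return 1
    # Same dvipdfm invocation as in the answer above
    dvipdfm -cz 9 "$base.dvi"
}
```

For the file in the question this would be called as `man2pdf filename.1`.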






answered 11 mins ago by Marek Kowalczyk



