Are there cases where fontenc + luatex (or xetex) cause problems?

According to the luatex docs, you shouldn't use fontenc with luatex. People swear up and down that fontenc is incompatible, but I haven't been able to find an example where loading the package causes problems.
(I'm curious about this because it can be easier to load the same base set of packages for pdftex/luatex/xetex, and add fontspec only when handling the latter two.)



I know fontenc is not the right way to deal with fonts in luatex or xelatex, but I'm specifically looking for cases where it's detrimental to load the fontenc package. Do you know of any?

fonts xetex luatex

asked Jan 20 at 3:10 by karldw
edited Jan 20 at 9:10 by Joseph Wright

  • Great first question! I learned some new things by researching the answer. Thanks.

    – Davislor
    Jan 20 at 4:21

  • examples how to use both traditional font encodings and OpenType fonts in same document are given there

    – user4686
    Jan 20 at 8:54


3 Answers

fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.



But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.



\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
\begin{document}
\encodingdefault, \makeatletter \f@encoding\makeatother

\end{document}


will load t1enc.def, lgrenc.def and t2aenc.def.



This also is not problematic with lualatex and xelatex.



But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:



\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
\usepackage{fontspec}
\setmainfont{DejaVuSans}
\begin{document}
\encodingdefault, \makeatletter \f@encoding\makeatother

Grüße, αβγ, Ҍҋ

\fontencoding{T1}\selectfont
Grüße, αβγ, Ҍҋ

\fontencoding{LGR}\selectfont
Grüße, αβγ, Ҍҋ

\fontencoding{T2A}\selectfont
Grüße, αβγ, Ҍҋ

\end{document}





So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This, for example, is wrong:



\documentclass{article}
\usepackage{fontspec}
\setmainfont{DejaVuSans}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
%
\begin{document}% wrong, encoding is T2A


Moving \setmainfont after the fontenc lines resolves the problem:



\documentclass{article}
\usepackage{fontspec}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
\setmainfont{DejaVuSans}
\begin{document} %encoding is TU now





answered Jan 20 at 8:40 by Ulrike Fischer
edited Jan 23 at 9:53 by Joseph Wright

  • The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

    – Davislor
    Jan 20 at 12:14

  • I took the liberty of replacing the fix at the end with one that really does work. Mostly.

    – Davislor
    Jan 20 at 13:38

  • @Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

    – Ulrike Fischer
    Jan 20 at 13:41

  • As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

    – Davislor
    Jan 20 at 14:01

  • @Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

    – Ulrike Fischer
    Jan 20 at 14:10


An Example that Might Bite You



It can cause problems if you load both fontspec and fontenc together. More precisely, as David Carlisle points out, the problems arise when you combine Unicode with other encodings in the same document, which can happen without your being aware that you loaded both, or even in a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.



\documentclass[varwidth, preview]{standalone}
\usepackage[spanish]{babel}

% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
\babelprovide[language=Japanese, script=Kana]{japanese}

% Implicitly causes babel to load fontspec:
\babelfont[japanese]{rm}{Noto Sans CJK JP}

% Implicitly loads fontenc with [T1]:
\usepackage[poorman]{fourier}

\begin{document}
¿Es \foreignlanguage{japanese}{日本} Utopía?
\end{document}


¿Es 日本 Utopía?



Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered the commands so that \babelfont came after fourier, you would instead set the main font to Latin Modern Roman.



The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc breaks, you can stop reading.



How to Combine Unicode and Legacy Fonts



I’m not judging. Sometimes I don’t get to set the requirements.



To fix this example, load luainputenc, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:



\documentclass[varwidth, preview]{standalone}
\usepackage[T1]{fontenc}
\usepackage{textcomp}
\usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
\usepackage[spanish]{babel}
\usepackage[no-math]{fontspec}

\defaultfontfeatures{ Scale = MatchUppercase }
\newfontfamily\japanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\usepackage[poorman]{fourier}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}


¿Es 日本 Utopía?



A Better Solution



A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:



\documentclass[varwidth, preview]{standalone}
\usepackage{polyglossia}

\setdefaultlanguage{spanish}

\defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
\setmainfont{Lingua Franca}[
Scale = 1.0 ,
Ligatures = Common ,
Numbers = OldStyle ]
\newfontfamily\japanesefont{Noto Serif CJK JP}[
Language = Japanese,
Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}


¿Es 日本 Utopía?



This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.






  • I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)

    – David Carlisle
    Jan 20 at 22:59

  • @DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.

    – Davislor
    Jan 21 at 0:01

  • @DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the command \textquestiondown, or the slot in a specific encoding. Loading fontspec and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loading luainputenc turns it back on.

    – Davislor
    Jan 21 at 0:12

  • No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified as \usepackage[latin1]{inputenc} or \usepackage[utf8]{inputenc} or whatever encoding is in use. So ¿ is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)

    – David Carlisle
    Jan 21 at 0:46

  • @DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that, if you load fontenc but not fontspec, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly load inputenc or selinput. If you load both fontenc and fontspec, these characters will not be activated and some of them will break.

    – Davislor
    Jan 21 at 1:09


Note that it is not loading fontenc that is incompatible (fontspec loads fontenc); it is using font encodings other than TU (Unicode). So \fontencoding{T1}\selectfont is the real problem, although that is most commonly activated by



\usepackage[T1]{fontenc}


so it is simplest to tell people not to use fontenc.
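
For the shared-preamble case described in the question, one way to follow that advice is to branch on the engine, so that T1 is only ever selected under pdfTeX. The following is a minimal sketch, not something taken from this thread: it assumes the iftex package is installed and uses its \ifPDFTeX test, and the choice of inputenc in the pdfTeX branch is likewise an assumption.

```latex
% Sketch only: load fontenc under pdfTeX and fontspec under the
% Unicode engines, so TU stays the default encoding there.
% Assumes the iftex package (not mentioned in this thread) is available.
\documentclass{article}
\usepackage{iftex}
\ifPDFTeX
  % 8-bit engine: legacy font encoding plus UTF-8 input handling
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\else
  % XeTeX or LuaTeX: Unicode fonts through fontspec, no T1 selected
  \usepackage{fontspec}
\fi
\begin{document}
Grüße
\end{document}
```

Any further packages that work on all three engines can then be loaded after the conditional, outside either branch.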



In addition to the incorrect characters shown in the other answers, even when you get the correct characters, hyphenation will be incorrect with the xetex and xelatex formats as distributed, because only the TU hyphenation patterns are loaded. You cannot load hyphenation patterns in a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2A or LGR...) encoded fonts is tricky and not well supported by language packages, and it produces documents that silently give the wrong results if processed at another site that does not have the custom formats set up.



The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.






share|improve this answer

























  • Do you know an example where the hyphenation goes wrong due to fontenc and T1? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730

    – moewe
    Jan 20 at 12:51












  • @moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts

    – David Carlisle
    Jan 20 at 13:25











  • I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)

    – moewe
    Jan 20 at 13:36






  • 1





    @moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.

    – David Carlisle
    yesterday






  • 1





    @moewe well naturally it gets better with grüßen - in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen) or bad hyphenation with good output (grüss en).

    – Ulrike Fischer
    yesterday










Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "85"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f470976%2fare-there-cases-where-fontenc-luatex-or-xetex-cause-problems%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes









16














fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.



But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument


will load t1enc.def, lgrenc.def and t2enc.def.



This also is not problematic with lualatex and xelatex.



But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
Grüße, αβγ, Ҍҋ

enddocument


enter image description here



So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:



documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A


Moving the setmainfont resolves the problem:



documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now





share|improve this answer

























  • The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

    – Davislor
    Jan 20 at 12:14












  • I took the liberty of replacing the fix at the end with one that really does work. Mostly.

    – Davislor
    Jan 20 at 13:38











  • @Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

    – Ulrike Fischer
    Jan 20 at 13:41






  • 1





    As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

    – Davislor
    Jan 20 at 14:01







  • 2





    @Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

    – Ulrike Fischer
    Jan 20 at 14:10















16














fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.



But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument


will load t1enc.def, lgrenc.def and t2enc.def.



This also is not problematic with lualatex and xelatex.



But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
Grüße, αβγ, Ҍҋ

enddocument


enter image description here



So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:



documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A


Moving the setmainfont resolves the problem:



documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now





share|improve this answer

























  • The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

    – Davislor
    Jan 20 at 12:14












  • I took the liberty of replacing the fix at the end with one that really does work. Mostly.

    – Davislor
    Jan 20 at 13:38











  • @Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

    – Ulrike Fischer
    Jan 20 at 13:41






  • 1





    As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

    – Davislor
    Jan 20 at 14:01







  • 2





    @Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

    – Ulrike Fischer
    Jan 20 at 14:10













16












16








16







fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.



But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument


will load t1enc.def, lgrenc.def and t2enc.def.



This also is not problematic with lualatex and xelatex.



But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
Grüße, αβγ, Ҍҋ

enddocument


enter image description here



So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:



documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A


Moving the setmainfont resolves the problem:



documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now





share|improve this answer















fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.



But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument


will load t1enc.def, lgrenc.def and t2enc.def.



This also is not problematic with lualatex and xelatex.



But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:



documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
Grüße, αβγ, Ҍҋ

enddocument


enter image description here



So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:



documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A


Moving the setmainfont resolves the problem:



documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now






share|improve this answer














share|improve this answer



share|improve this answer








edited Jan 23 at 9:53









Joseph Wright

203k22559885




203k22559885










answered Jan 20 at 8:40









Ulrike FischerUlrike Fischer

191k8298680




191k8298680












  • The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

    – Davislor
    Jan 20 at 12:14












  • I took the liberty of replacing the fix at the end with one that really does work. Mostly.

    – Davislor
    Jan 20 at 13:38











  • @Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

    – Ulrike Fischer
    Jan 20 at 13:41






  • 1





    As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

    – Davislor
    Jan 20 at 14:01







  • 2





    @Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

    – Ulrike Fischer
    Jan 20 at 14:10

















  • The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

    – Davislor
    Jan 20 at 12:14












  • I took the liberty of replacing the fix at the end with one that really does work. Mostly.

    – Davislor
    Jan 20 at 13:38











  • @Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

    – Ulrike Fischer
    Jan 20 at 13:41






  • 1





    As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

    – Davislor
    Jan 20 at 14:01







  • 2





    @Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

    – Ulrike Fischer
    Jan 20 at 14:10
















The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14






The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14














I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38





I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38













@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41





@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41




1




1





As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01






As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01





2




2





@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10





@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10











12














An Example that Might Bite You



It can cause problems if you load both fontspec and fontenc together. More precisely, as David Carlisle points out, the problems arise when you combine Unicode with other encodings in the same document, which could happen without your being aware that you loaded both, or even in a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.



\documentclass[varwidth, preview]{standalone}
\usepackage[spanish]{babel}

% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
\babelprovide[language=Japanese, script=Kana]{japanese}

% Implicitly causes babel to load fontspec:
\babelfont[japanese]{rm}{Noto Sans CJK JP}

% Implicitly loads fontenc with [T1]:
\usepackage[poorman]{fourier}

\begin{document}
¿Es \foreignlanguage{japanese}{日本} Utopía?
\end{document}


¿Es 日本 Utopía?



Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you loaded setbabelfont after fourier, you would instead set the main font to Latin Modern Roman.



The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc breaks, you can stop reading.



How to Combine Unicode and Legacy Fonts



I’m not judging. Sometimes I don’t get to set the requirements.



To fix this example, load luainputenc, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:



\documentclass[varwidth, preview]{standalone}
\usepackage[T1]{fontenc}
\usepackage{textcomp}
\usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
\usepackage[spanish]{babel}
\usepackage[no-math]{fontspec}

\defaultfontfeatures{ Scale = MatchUppercase }
\newfontfamily\japanesefont{Noto Serif CJK JP}[
  Language = Japanese,
  Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\usepackage[poorman]{fourier}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}


¿Es 日本 Utopía?



A Better Solution



A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:



\documentclass[varwidth, preview]{standalone}
\usepackage{polyglossia}

\setdefaultlanguage{spanish}

\defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
\setmainfont{Lingua Franca}[
  Scale = 1.0 ,
  Ligatures = Common ,
  Numbers = OldStyle ]
\newfontfamily\japanesefont{Noto Serif CJK JP}[
  Language = Japanese,
  Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}


¿Es 日本 Utopía?



This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.































  • I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)

    – David Carlisle
    Jan 20 at 22:59











  • @DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.

    – Davislor
    Jan 21 at 0:01












  • @DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the command \textquestiondown, or to the slot in a specific encoding. Loading fontspec and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loading luainputenc turns it back on.

    – Davislor
    Jan 21 at 0:12












  • No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex). Although the character numbers 127-256 are the same in utf-8, they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified as \usepackage[latin1]{inputenc} or \usepackage[utf8]{inputenc} or whatever encoding is in use. So ¿ is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)

    – David Carlisle
    Jan 21 at 0:46











  • @DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that, if you load fontenc but not fontspec, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly load inputenc or selinput. If you load both fontenc and fontspec, these characters will not be activated and some of them will break.

    – Davislor
    Jan 21 at 1:09























edited Jan 20 at 22:13

























answered Jan 20 at 4:11









Davislor























10














Note that it is not loading fontenc that is incompatible (fontspec itself loads fontenc); it is using font encodings other than TU (Unicode). So \fontencoding{T1}\selectfont is the real problem, although that is most commonly activated by



\usepackage[T1]{fontenc}


so it is simplest to tell people not to use fontenc.
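To make the distinction concrete, here is a minimal sketch, not taken from the original answer, that never loads fontenc with the T1 option and still runs into the problem by selecting the T1 encoding by hand. It assumes a standard TeX distribution in which the T1 version of Latin Modern is installed, and it is meant to be compiled with lualatex or xelatex. The sample word is only an illustration.

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\usepackage{fontspec} % sets up the TU (Unicode) encoding

\begin{document}
% Default TU encoding: input code points match the font slots.
Grüße

% Selecting T1 by hand, without any \usepackage[T1]{fontenc}.
% The non-ASCII characters are passed through by code point, so any
% character whose T1 slot differs from its Unicode code point is
% likely to come out wrong.
{\fontencoding{T1}\selectfont Grüße}
\end{document}
```

If the second line renders differently from the first, that supports the point that the encoding switch, not the package load, is what matters.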



In addition to the incorrect characters shown in the other answers, even when you get the correct characters, hyphenation will be incorrect with the xetex and xelatex formats as distributed, because only the TU hyphenation patterns are loaded. You cannot load hyphenation patterns in a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky and not well supported by language packages, and it yields documents that silently produce the wrong results if processed at another site that does not have the custom formats set up.



The situation is different with luatex, which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right, and in almost all cases it is simpler to use a Unicode encoded font.
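For anyone who wants one shared preamble across pdftex, xetex and luatex, one possible approach is to branch on the engine so that the T1 encoding is only ever selected under pdftex. This is a sketch under the assumption that the iftex package is installed (it provides the \ifPDFTeX test used here); it is not part of the original answer, and the sample text is only an illustration.

```latex
\documentclass{article}
\usepackage{iftex} % provides \ifPDFTeX, \ifXeTeX, \ifLuaTeX

\ifPDFTeX
  % 8-bit engine: legacy font and input encodings.
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\else
  % Unicode engines: stay in the TU encoding via fontspec.
  \usepackage{fontspec}
\fi

\begin{document}
Grüße. ¿Utopía?
\end{document}
```

Packages that are safe on all three engines can then go after the conditional, while anything that switches to a legacy encoding stays inside the pdftex branch.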































  • Do you know an example where the hyphenation goes wrong due to fontenc and T1? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730

    – moewe
    Jan 20 at 12:51












  • @moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts

    – David Carlisle
    Jan 20 at 13:25











  • I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)

    – moewe
    Jan 20 at 13:36






  • @moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.

    – David Carlisle
    yesterday






  • @moewe well naturally it gets better with grüßen - in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen) or bad hyphenation with good output (grüss en).

    – Ulrike Fischer
    yesterday















10














Note that it is not loading fontenc that is incompatible (fontspec loads fontenc) it is using font encodings other than TU (Unicode). So fontecodingT1selectfont is the real problem, although that is most commonly activated by



usepackage[T1][fontenc}


so it is simplest to tell people not to use fontenc.



In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect as only the TU hyphenation patterns are loaded. You can not load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, not well supported by language packages and will produce documents that will silently produce the wrong results if processed at another site which does not have the custom formats set up.



The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.






share|improve this answer

























  • Do you know an example where the hyphenation goes wrong due to fontenc and T1? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730

    – moewe
    Jan 20 at 12:51












  • @moewe Well, for English of course it's largely the same, but for any accented letter the hyphenation tables will be nonsense for T1-encoded fonts.

    – David Carlisle
    Jan 20 at 13:25











  • I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)

    – moewe
    Jan 20 at 13:36






  • 1





    @moewe With grüßen, the word will be looked up as intended in the Unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, and for German you can get by with SS, but for any languages in the Latin-2 range, where T1 and Unicode are very different, you will get unreadable nonsense.

    – David Carlisle
    yesterday






  • 1





    @moewe Well, naturally it gets better with grüßen: in this case you are using the char which is at the position where the patterns expect the ß. But this means that you have to choose whether you want good hyphenation with bad output (grüßen) or bad hyphenation with good output (grüss en).

    – Ulrike Fischer
    yesterday





















edited Jan 20 at 12:34

























answered Jan 20 at 11:47









David Carlisle




























