Are there cases where fontenc + luatex (or xetex) cause problems?

According to the luatex docs, you shouldn't use fontenc with luatex. People swear up and down that fontenc is incompatible, but I haven't been able to find an example where loading the package causes problems.
(I'm curious about this because it can be easier to load the same base set of packages for pdftex/luatex/xetex, and add fontspec only when handling the latter two.)

I know fontenc is not the right way to deal with fonts in luatex or xelatex, but I'm specifically looking for cases where it's detrimental to load the fontenc package. Do you know of any?

edited Jan 20 at 9:10

Joseph Wright♦

203k22559885

asked Jan 20 at 3:10

karldw

884

2

Great first question! I learned some new things by researching the answer. Thanks.

– Davislor
Jan 20 at 4:21

Examples of how to use both traditional font encodings and OpenType fonts in the same document are given there.

– user4686
Jan 20 at 8:54

fonts xetex luatex

3 Answers
fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.
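
One way to check this yourself is sketched below, assuming a LuaLaTeX or XeLaTeX run. \listfiles asks LaTeX to print the list of loaded files at the end of the log, and \typeout writes the default and current encodings to the terminal and the log. The wording of the message is only illustrative.

```latex
% Compile with lualatex or xelatex.
\listfiles                      % print the list of loaded files at the end of the log
\documentclass{article}
\usepackage{fontspec}           % per the answer, this loads fontenc itself
\begin{document}
\makeatletter                   % \f@encoding contains an @
\typeout{Default encoding: \encodingdefault, current encoding: \f@encoding}
\makeatother
Some text.
\end{document}
```

With fontspec loaded, the file list in the log should mention fontenc.sty and an encoding definition file for TU.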

But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
\begin{document}
\encodingdefault, \makeatletter \f@encoding\makeatother

\end{document}

will load t1enc.def, lgrenc.def and t2aenc.def.

This also is not problematic with lualatex and xelatex.

But fontenc will also set the last encoding option as the encoding default. And quite a number of encodings are not suitable for lualatex and xelatex. These engines need the TU encoding (fontspec sets this encoding). Other encodings can lead to quite wrong outputs:

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
\usepackage{fontspec}
\setmainfont{DejaVuSans}
\begin{document}
\encodingdefault, \makeatletter \f@encoding\makeatother

Grüße, αβγ, Ҍҋ

\fontencoding{T1}\selectfont
 Grüße, αβγ, Ҍҋ

\fontencoding{LGR}\selectfont
 Grüße, αβγ, Ҍҋ

\fontencoding{T2A}\selectfont
 Grüße, αβγ, Ҍҋ

\end{document}

So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. For example, this is wrong:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{DejaVuSans}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
%
\begin{document}% wrong, encoding is T2A

Moving \setmainfont after the fontenc lines resolves the problem:

\documentclass{article}
\usepackage{fontspec}
\usepackage[T1]{fontenc}
\usepackage[LGR]{fontenc}
\usepackage[T2A]{fontenc}
\setmainfont{DejaVuSans}
\begin{document} %encoding is TU now
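
As a rough sketch of the same idea in a complete document: load the legacy encodings first and fontspec last, then switch to a legacy encoding only inside a group. This assumes LuaLaTeX or XeLaTeX and relies on fontspec setting TU when it is loaded, as described above. The sample text is only illustrative.

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\usepackage[T1]{fontenc}   % legacy encoding definitions, loaded first
\usepackage{fontspec}      % loaded last, so TU should be the default again
\begin{document}
Default encoding: \encodingdefault

Grüße % typeset with the Unicode (TU) default font

% Use the legacy encoding only locally, and only for text that is safe in it:
{\fontencoding{T1}\selectfont Plain ASCII text in a T1 font}

Grüße % back in TU after the group ends
\end{document}
```

Keeping the \fontencoding switch inside braces means the rest of the document stays in TU, so the non-ASCII text outside the group is not affected.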

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14

I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38

@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41

1

As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01

2

@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10

An Example that Might Bite You

It can cause problems if you load both fontspec and fontenc together. More precisely, as David Carlisle points out, if you combine Unicode with other encodings in the same document—which could happen without your being aware that you loaded both, or even on a document that worked before. Here is an example that loads the legacy Utopia font, which is T1-encoded, but then also tries to load a modern Unicode font through Babel.

\documentclass[varwidth, preview]{standalone}
\usepackage[spanish]{babel}

% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
\babelprovide[language=Japanese, script=Kana]{japanese}

% Implicitly causes babel to load fontspec:
\babelfont[japanese]{rm}{Noto Sans CJK JP}

% Implicitly loads fontenc with [T1]:
\usepackage[poorman]{fourier}

\begin{document}
¿Es \foreignlanguage{japanese}{日本} Utopía?
\end{document}

¿Es 日本 Utopía?

Permuting the order in which you load packages can give you many different bugs. One of several problems in this example is that fontspec renders all non-ASCII characters inactive, which prevents them from being correctly translated into other encodings. If you re-ordered commands so that you called \babelfont after loading fourier, you would instead set the main font to Latin Modern Roman.

The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc breaks, you can stop reading.

How to Combine Unicode and Legacy Fonts

I’m not judging. Sometimes I don’t get to set the requirements.

To fix this example, load luainputenc, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:

\documentclass[varwidth, preview]{standalone}
\usepackage[T1]{fontenc}
\usepackage{textcomp}
\usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
\usepackage[spanish]{babel}
\usepackage[no-math]{fontspec}

\defaultfontfeatures{ Scale = MatchUppercase }
\newfontfamily\japanesefont{Noto Serif CJK JP}[
 Language = Japanese,
 Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\usepackage[poorman]{fourier}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}

¿Es 日本 Utopía?

A Better Solution

A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:

\documentclass[varwidth, preview]{standalone}
\usepackage{polyglossia}

\setdefaultlanguage{spanish}

\defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
\setmainfont{Lingua Franca}[
 Scale = 1.0 ,
 Ligatures = Common ,
 Numbers = OldStyle ]
\newfontfamily\japanesefont{Noto Serif CJK JP}[
 Language = Japanese,
 Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}

¿Es 日本 Utopía?

This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.

edited Jan 20 at 22:13

answered Jan 20 at 4:11

Davislor

6,1221227

I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)

– David Carlisle
Jan 20 at 22:59

@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.

– Davislor
Jan 21 at 0:01

@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and either mapped to the command \textquestiondown, or the slot in a specific encoding. Loading fontspec and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loading luainputenc turns it back on.

– Davislor
Jan 21 at 0:12

No that's misleading, fontenc never makes any characters active or inactive, that is the job of inputenc (in classic tex) and although the character numbers 127-256 are the same in utf-8 they take two bytes not one, so in pdftex (or in luatex if you load luainputenc and disable the native unicode support) the characters above 127 have to be active, specified as \usepackage[latin1]{inputenc} or \usepackage[utf8]{inputenc} or whatever encoding is in use. So ¿ is not non-active because you loaded fontenc, it is because you haven't loaded inputenc (and inputenc doesn't work in luatex)

– David Carlisle
Jan 21 at 0:46

@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that, if you load fontenc but not fontspec, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly load inputenc or selinput. If you load both fontenc and fontspec, these characters will not be activated and some of them will break.

– Davislor
Jan 21 at 1:09

Note that it is not loading fontenc that is incompatible (fontspec loads fontenc); it is using font encodings other than TU (Unicode). So \fontencoding{T1}\selectfont is the real problem, although that is most commonly activated by

\usepackage[T1]{fontenc}

so it is simplest to tell people not to use fontenc.

In addition to the incorrect characters shown in the other answers, even when you get the correct characters, with the xetex and xelatex formats as distributed, hyphenation will be incorrect, as only the TU hyphenation patterns are loaded. You cannot load hyphenation patterns into a normal document, only when making the format. So setting things up to get correct hyphenation with T1 (or T2 or LGR...) encoded fonts is tricky, is not well supported by language packages, and will produce documents that silently give the wrong results if processed at another site that does not have the custom formats set up.

The situation is different with luatex which can load new hyphenation patterns as a result of declarations in the document, but it is still tricky to get right and in almost all cases it is simpler to use a Unicode encoded font.
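
For the use case in the question (one shared preamble for pdfTeX, LuaTeX and XeTeX), a common pattern is to branch on the engine so that legacy encodings are only selected under pdfTeX. A minimal sketch, assuming the iftex package is installed; choosing T1 for the pdfTeX branch is only an example:

```latex
\documentclass{article}
\usepackage{iftex}              % provides \ifPDFTeX, \ifXeTeX, \ifLuaTeX
\ifPDFTeX
  % 8-bit engine: legacy font encoding plus UTF-8 input handling
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\else
  % XeTeX or LuaTeX: Unicode fonts, TU stays the default encoding
  \usepackage{fontspec}
\fi
% Packages shared by all engines go here.
\begin{document}
Grüße
\end{document}
```

This keeps the engine-independent packages in one place while never making T1 the default encoding under the Unicode engines.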

edited Jan 20 at 12:34

answered Jan 20 at 11:47

David Carlisle

489k4111311877

Do you know an example where the hyphenation goes wrong due to fontenc and T1? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730

– moewe
Jan 20 at 12:51

@moewe well for english of course it's largely the same but for any accented letter the hyphenation tables will be nonsense for T1 encoded fonts

– David Carlisle
Jan 20 at 13:25

I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)

– moewe
Jan 20 at 13:36

1

@moewe with grüßen it will be looked up as intended in the unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS but for any languages in the latin2 range where T1 and Unicode are very different you will get unreadable nonsense.

– David Carlisle
yesterday

1

@moewe well naturally it gets better with grüßen - in this case you are using the char which is at the position the patterns expect the ß. But this means that you have to choose if you want good hyphenation with bad output (grüßen) or bad hyphenation with good output (grüss en).

– Ulrike Fischer
yesterday

|
show 4 more comments

Your Answer

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "85"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f470976%2fare-there-cases-where-fontenc-luatex-or-xetex-cause-problems%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.

But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument

will load t1enc.def, lgrenc.def and t2enc.def.

This also is not problematic with lualatex and xelatex.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
 Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
 Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
 Grüße, αβγ, Ҍҋ

enddocument

enter image description here

So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:

documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A

Moving the setmainfont resolves the problem:

documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14

I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38

@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41

1

As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01

2

@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10

|
show 6 more comments

fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.

But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument

will load t1enc.def, lgrenc.def and t2enc.def.

This also is not problematic with lualatex and xelatex.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
 Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
 Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
 Grüße, αβγ, Ҍҋ

enddocument

enter image description here

So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:

documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A

Moving the setmainfont resolves the problem:

documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14

I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38

@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41

1

As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01

2

@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10

|
show 6 more comments

fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.

But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument

will load t1enc.def, lgrenc.def and t2enc.def.

This also is not problematic with lualatex and xelatex.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
 Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
 Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
 Grüße, αβγ, Ҍҋ

enddocument

enter image description here

So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:

documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A

Moving the setmainfont resolves the problem:

documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

fontenc is loaded by fontspec (you can check this in the log). So in itself the package is not a problem.

But fontenc is a special package: You can load it more than once with different options without getting option clash errors. It will then load font encoding definitions for all the options. E.g.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
begindocument
encodingdefault, makeatletter f@encodingmakeatother

enddocument

will load t1enc.def, lgrenc.def and t2enc.def.

This also is not problematic with lualatex and xelatex.

documentclassarticle
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
usepackagefontspec
setmainfontDejaVuSans
begindocument
encodingdefault, makeatletter f@encodingmakeatother

Grüße, αβγ, Ҍҋ

fontencodingT1selectfont
 Grüße, αβγ, Ҍҋ

fontencodingLGRselectfont
 Grüße, αβγ, Ҍҋ

fontencodingT2Aselectfont
 Grüße, αβγ, Ҍҋ

enddocument

enter image description here

So you can use fontenc in your document (I need it to use chessfonts), but you should be careful to load it so that TU remains the default encoding. This here e.g. is wrong:

documentclassarticle
usepackagefontspec
setmainfontDejaVuSans
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
%
begindocument% wrong, encoding is T2A

Moving the setmainfont resolves the problem:

documentclassarticle
usepackagefontspec
usepackage[T1]fontenc
usepackage[LGR]fontenc
usepackage[T2A]fontenc
setmainfontDejaVuSans
begindocument %encoding is TU now

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

edited Jan 23 at 9:53

Joseph Wright♦

203k22559885

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

answered Jan 20 at 8:40

Ulrike Fischer

191k8298680

The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14

I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38

@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41

1

As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01

2

@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10

|
show 6 more comments

The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14

I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38

@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41

1

As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01

2

@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10

The solution edited in at the bottom does not work correctly. The last three lines display as: Grüße, , ΓρῤΫε, , GrьЯe, ,. This is because non-ASCII characters are rendered inactive.

– Davislor
Jan 20 at 12:14

I took the liberty of replacing the fix at the end with one that really does work. Mostly.

– Davislor
Jan 20 at 13:38

@Davislor sorry no, your edit is wrong. I neither recommend luainputenc nor utf8x nor all your additions. I reject this edit.

– Ulrike Fischer
Jan 20 at 13:41

As written, it appeared to me to be saying that making that change to your first example would allow it to compile correctly. Since that was not your intent, you might want to clarify which problem it resolves.

– Davislor
Jan 20 at 14:01

@Davislor the question is about loading of fontenc, not about loading of arbitrary font packages. Please let the OP decide which answer he likes and understands.

– Ulrike Fischer
Jan 20 at 14:10

|
show 6 more comments

An Example that Might Bite You

documentclass[varwidth, preview]standalone
usepackage[spanish]babel

% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
babelprovide[language=Japanese, script=Kana]japanese

% Implicitly causes babel to load fontspec:
babelfont[japanese]rmNoto Sans CJK JP

% Implicitly loads fontenc with [T1]:
usepackage[poorman]fourier

begindocument
¿Es foreignlanguagejapanese日本 Utopía?
enddocument

¿Es 日本 Utopía?

The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc breaks, you can stop reading.

How to Combine Unicode and Legacy Fonts

I’m not judging. Sometimes I don’t get to set the requirements.

To fix this example, load luainputenc, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:

documentclass[varwidth, preview]standalone
usepackage[T1]fontenc
usepackagetextcomp
usepackage[utf8]luainputenc % Needed to mix NFSS and Unicode
usepackage[spanish]babel
usepackage[no-math]fontspec

defaultfontfeatures Scale = MatchUppercase 
newfontfamilyjapanesefontNoto Serif CJK JP[
 Language = Japanese,
 Script = Kana ]

newcommandtextjapanese[1]japanesefont #1

usepackage[poorman]fourier

begindocument
¿Es textjapanese日本 Utopía?
enddocument

¿Es 日本 Utopía?

A Better Solution

A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:

documentclass[varwidth, preview]standalone
usepackagepolyglossia

setdefaultlanguagespanish

defaultfontfeatures Scale = MatchUppercase, Ligatures = TeX 
setmainfontLingua Franca[
 Scale = 1.0 ,
 Ligatures = Common ,
 Numbers = OldStyle ]
newfontfamilyjapanesefontNoto Serif CJK JP[
 Language = Japanese,
 Script = Kana ]

newcommandtextjapanese[1]japanesefont #1

begindocument
¿Es textjapanese日本 Utopía?
enddocument

¿Es 日本 Utopía?

This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.

edited Jan 20 at 22:13

answered Jan 20 at 4:11

Davislor

6,1221227

I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)

– David Carlisle
Jan 20 at 22:59


An Example that Might Bite You

\documentclass[varwidth, preview]{standalone}
\usepackage[spanish]{babel}

% Due to a bug in Babel 3.22, we must override the OpenType
% language and script features for Japanese, and several other
% languages.
\babelprovide[language=Japanese, script=Kana]{japanese}

% Implicitly causes babel to load fontspec:
\babelfont[japanese]{rm}{Noto Sans CJK JP}

% Implicitly loads fontenc with [T1]:
\usepackage[poorman]{fourier}

\begin{document}
¿Es \foreignlanguage{japanese}{日本} Utopía?
\end{document}

¿Es 日本 Utopía?

The rest of my post is about how to get that broken example to work, so if you only cared about the example of something fontenc breaks, you can stop reading.

How to Combine Unicode and Legacy Fonts

I’m not judging. Sometimes I don’t get to set the requirements.

To fix this example, load luainputenc, which, despite the misleading name, also allows switching between Unicode and legacy encodings on output:

\documentclass[varwidth, preview]{standalone}
\usepackage[T1]{fontenc}
\usepackage{textcomp}
\usepackage[utf8]{luainputenc} % Needed to mix NFSS and Unicode
\usepackage[spanish]{babel}
\usepackage[no-math]{fontspec}

\defaultfontfeatures{ Scale = MatchUppercase }
\newfontfamily\japanesefont{Noto Serif CJK JP}[
  Language = Japanese,
  Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\usepackage[poorman]{fourier}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}

¿Es 日本 Utopía?

A Better Solution

A quick Web search revealed that there are several free OTF versions of Utopia, which is legal because Adobe released a free and modifiable version years ago. Here, I load Lingua Franca:

\documentclass[varwidth, preview]{standalone}
\usepackage{polyglossia}

\setdefaultlanguage{spanish}

\defaultfontfeatures{ Scale = MatchUppercase, Ligatures = TeX }
\setmainfont{Lingua Franca}[
  Scale = 1.0 ,
  Ligatures = Common ,
  Numbers = OldStyle ]
\newfontfamily\japanesefont{Noto Serif CJK JP}[
  Language = Japanese,
  Script = Kana ]

\newcommand\textjapanese[1]{{\japanesefont #1}}

\begin{document}
¿Es \textjapanese{日本} Utopía?
\end{document}

¿Es 日本 Utopía?

This is much less of a hack and supports several features and scripts that the legacy package does not. You should use Unicode when you can, and legacy encodings when you have to.
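
A minimal sketch of that advice, for anyone who wants one preamble shared across pdfTeX, XeTeX, and LuaTeX. It assumes the iftex package is installed and that only those three engines are in play; the choice of babel with Spanish is just carried over from the examples above. fontenc is loaded only on the 8-bit engine, and fontspec only on the Unicode ones:

```latex
\documentclass{article}
\usepackage{iftex} % assumption: provides \ifPDFTeX, \ifXeTeX, \ifLuaTeX

\ifPDFTeX
  % 8-bit engine: legacy font encoding and explicit input encoding
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\else
  % XeTeX or LuaTeX: Unicode fonts through fontspec, no fontenc
  \usepackage{fontspec}
\fi

\usepackage[spanish]{babel}

\begin{document}
¿Es Utopía?
\end{document}
```

Packages that load fontenc behind your back, as fourier does in the first example, would still need to go inside the pdfTeX branch.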

edited Jan 20 at 22:13

answered Jan 20 at 4:11

Davislor


I am not sure what you mean by "One of several problems in this example is that fontspec renders all non-ASCII characters inactive, " (active/inactive characters are a matter of input, and fontspec does not affect the input encoding at all)

– David Carlisle
Jan 20 at 22:59

@DavidCarlisle Okay, here’s my understanding. The way LaTeX handles legacy encodings is: some characters are supposed to be the same as ASCII in every text encoding, and are just passed through. LGR breaks this assumption, but is intentionally laid out so that ASCII/LGR mojibake gives you a close enough transliteration that a human can figure it out, similar to Γρεεκ. As you know, the first 127 characters of Unicode are also the same as ASCII, and the first 256 the same as ISO Latin-1, so this still works for any characters that are the same in the font encoding.

– Davislor
Jan 21 at 0:01

@DavidCarlisle Other characters, such as the ¿ in my example, do not have the same encoding as in Unicode, so they need to be set active in order to work. When the current encoding is OT1 or T1, IIRC, ¿ would be set active and mapped either to the command \textquestiondown or to the slot in a specific encoding. Loading fontspec and enabling the TU encoding turns this off, so selecting any 8-bit encoding gives you mojibake. Loading luainputenc turns it back on.

– Davislor
Jan 21 at 0:12

No, that's misleading: fontenc never makes any characters active or inactive; that is the job of inputenc (in classic TeX). Although the character numbers 127-256 are the same in UTF-8, they take two bytes, not one, so in pdftex (or in luatex if you load luainputenc and disable the native Unicode support) the characters above 127 have to be active, specified as \usepackage[latin1]{inputenc} or \usepackage[utf8]{inputenc} or whatever encoding is in use. So ¿ is not non-active because you loaded fontenc; it is because you haven't loaded inputenc (and inputenc doesn't work in luatex).

– David Carlisle
Jan 21 at 0:46

@DavidCarlisle I’m open to suggestions for how to re-word that passage. What I’m trying to convey in my answer is that, if you load fontenc but not fontspec, LaTeX3 will make some non-ASCII Unicode characters active within the body of the document, even if you don’t explicitly load inputenc or selinput. If you load both fontenc and fontspec, these characters will not be activated and some of them will break.

– Davislor
Jan 21 at 1:09

\usepackage[T1]{fontenc}

so it is simplest to tell people not to use fontenc.

edited Jan 20 at 12:34

answered Jan 20 at 11:47

David Carlisle


Do you know an example where the hyphenation goes wrong due to fontenc and T1? I tried to come up with an example myself, but surprisingly XeLaTeX and LuaLaTeX performed better(!) than pdfLaTeX in the following example gist.github.com/moewew/cfe4f8e18c659665eaaca12e7fe44730

– moewe
Jan 20 at 12:51

@moewe Well, for English of course it's largely the same, but for any accented letter the hyphenation tables will be nonsense for T1-encoded fonts.

– David Carlisle
Jan 20 at 13:25

I suspected that accented letters would be the interesting ones, so I tried German words with umlauts. But apart from the "SS"/"ß" issue I could not see a difference in most words I tried. The ones in the example above are the only differences I could find, but they make pdfLaTeX look bad... I thought that maybe your infamous foreign language skills would have something in store ;-)

– moewe
Jan 20 at 13:36


@moewe With grüßen it will be looked up as intended in the Unicode hyphenation tables, but as the font isn't in that encoding you get essentially random characters. For English it's fine, for French it's OK, for German you can get by with SS, but for any languages in the Latin-2 range, where T1 and Unicode are very different, you will get unreadable nonsense.

– David Carlisle
yesterday


@moewe Well, naturally it gets better with grüßen: in this case you are using the char which is at the position where the patterns expect the ß. But this means that you have to choose whether you want good hyphenation with bad output (grüßen) or bad hyphenation with good output (grüss en).

– Ulrike Fischer
yesterday
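
The "SS"/"ß" mismatch discussed in these comments can be reproduced with a very small file. This is only a sketch, assuming a LuaLaTeX run with the default Latin Modern T1 fonts installed and with fontspec deliberately left out. The engine reads ß as U+00DF and uses that number directly as the font slot, but T1 keeps ß at slot "FF and puts the SS glyph at slot "DF:

```latex
% Compile with lualatex; fontspec is deliberately not loaded.
\documentclass{article}
\usepackage[T1]{fontenc}

\begin{document}
% ü (U+00FC) happens to sit at the same slot in T1, so it prints as expected.
% ß (U+00DF) selects T1 slot "DF, which holds SS rather than ß.
grüßen
\end{document}
```

Letters outside the first 256 code points have no slot in an 8-bit font at all, which is why languages in the Latin-2 range fare so much worse than German.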
