Fontconfig is a library that many graphical programs use to figure out what font to use. A program can ask Fontconfig for a font matching a pattern and Fontconfig will return a font which may or may not be "anything like the requested pattern". The important thing is that we can mess with this matching process using configuration files—for example to set default fonts in a way that will work across many programs.

The fc-match(1) utility that ships with Fontconfig can be used to test what fonts are returned for a given pattern:

$ fc-match
Ubuntu-R.ttf: "Ubuntu" "Regular"
$ fc-match serif
RobotoSlab-Regular.ttf: "Roboto Slab" "Regular"

Since fc-match uses "the normal fontconfig matching rules", the above output implies that (with my configuration) a program that (reasonably) wants to use the most default font possible will use Ubuntu, and a program that wants the default serif font will use Roboto Slab.

Configuration

I use members of the Ubuntu family as the default sans-serif and monospace typefaces and Roboto Slab as the default serif one. Google's Noto font family is the fallback for missing characters and emoji.

There are some nuances that make this trickier than it sounds.

Major tangent: Han unification

If your system is configured well and has the necessary fonts installed, you will see two similar but non-identical Han characters here:

Unicode does not. It assigns the same code point (number) to both characters. The only reason they (hopefully) look distinct is that I added lang attributes to the list items. Those two characters can only coexist when additional metadata is provided—possible on a webpage, but try copying both characters into your browser's address bar, a text editor, or a terminal: I bet they'll look the same.[1]

Relying on additional metadata for correct rendering of text seems like a weird choice in hindsight, but the consequence that's relevant here is this: choosing a fallback font also determines which variant of some Han characters will appear in contexts that lack language metadata. Plainly using Noto Serif or Noto Sans apparently means that the Japanese kanji forms are used. I suppose this is because the Noto fonts for Japanese are alphabetically first among their respective groups of language-specific Noto CJK fonts:

$ fc-match -a sans-serif | grep '"Noto Sans CJK .*" "Regular"'
NotoSansCJK-Regular.ttc: "Noto Sans CJK JP" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans CJK KR" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans CJK SC" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans CJK TC" "Regular"

Curiously, each font "does support all four languages and includes the complete set of glyphs". Notice how all four of the above fonts even resolve to the same file, NotoSansCJK-Regular.ttc. A special OpenType feature allows programs that support it to "access language-specific variants other than the default language". (I guess getting a glyph from an OpenType font file is much more complicated than just asking for a code point.)

Anyway. I want traditional Chinese characters when no metadata is available, so I'm using Noto Serif CJK TC and Noto Sans CJK TC.
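To see how language metadata changes the outcome, a lang property can be appended to a Fontconfig pattern. The following is a minimal sketch, assuming fc-match is on the PATH and the Noto CJK fonts are installed; lang_pattern is a hypothetical helper name, and the four language tags are just the ones that correspond to the JP, KR, SC, and TC fonts above.

```shell
# Build a Fontconfig pattern that carries a language tag,
# for example "sans-serif:lang=ja". (Hypothetical helper.)
lang_pattern() {
    printf '%s:lang=%s\n' "$1" "$2"
}

# Print the best match per language tag; skipped if fc-match is missing.
if command -v fc-match >/dev/null 2>&1; then
    for lang in ja ko zh-cn zh-tw; do
        printf '%-6s ' "$lang"
        fc-match "$(lang_pattern sans-serif "$lang")"
    done
fi
```

What gets printed depends entirely on the installed fonts and configuration, so no output is shown here. Running plain `fc-match sans-serif` next to it shows which variant wins when no language is given.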

Back to topic (sort of)

It's good to know the order in which Fontconfig loads configuration files. There are usually lots of them in /etc/fonts/conf.d/ and they can interfere with user-specific configuration. The only explanation I've found is in the Tuning Fontconfig section of Beyond Linux From Scratch: files in /etc/fonts/conf.d/ have names starting with a two-digit number followed by a hyphen, and files with smaller numbers are loaded first.

Loading files from the configuration paths specified by fonts-conf(5) isn't intrinsic behavior of Fontconfig. Instead, the master /etc/fonts/fonts.conf file contains <include> directives. On my system,[2] it only includes files in /etc/fonts/conf.d/, but in there is 50-user.conf which includes (among other things) ~/.config/fontconfig/fonts.conf. The takeaway is that the user-specific configuration is loaded roughly in the middle: after the system-wide files with numbers below 50 and before the ones with higher numbers.
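The load order can be inspected directly. This sketch lists the files that a directory <include> would pick up, following the rule in fonts-conf(5) that only names starting with an ASCII digit and ending in .conf are processed, in sorted order. conf_order is a hypothetical helper name, and treating "sorted order" as a plain byte-wise sort is an assumption.

```shell
# List the config files of a directory in the order a directory
# <include> would read them. (Hypothetical helper.)
conf_order() {
    # $1: directory to inspect, defaults to the system-wide conf.d
    dir="${1:-/etc/fonts/conf.d}"
    # Nothing to list if the directory does not exist
    [ -d "$dir" ] || return 0
    # Keep names like "50-user.conf"; LC_ALL=C gives a byte-wise sort
    # (assumed to match Fontconfig's ordering)
    ls -1 "$dir" | grep '^[0-9].*\.conf$' | LC_ALL=C sort
}

# Number the lines to see where 50-user.conf sits among the others.
conf_order | nl
```

Everything listed above 50-user.conf is applied before the user configuration, everything below it afterwards.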

My configuration file started off based on the one in this section of the Fonts ArchWiki article. The important parts are <alias> elements such as:

<alias>
   <family>sans-serif</family>
   <prefer>
      <family>Ubuntu</family>
      <family>Noto Sans CJK TC</family>
      <family>Noto Color Emoji</family>
      <family>Noto Sans</family>
   </prefer>
</alias>

The element says: prepend those four font families to the list of best-matching fonts in that order when "sans-serif" is requested. My fonts.conf consists of such <alias> elements for "serif", "sans-serif", and "monospace".[3]

This works but I ran into one problem. Something else was also prepending "Noto Sans" with the effect that it ended up at the very top of the sans-serif font list. The same thing happened for serif and monospace fonts. I identified 30-infinality-aliases.conf, which I got from the fonts-meta-extended-lt package, as the culprit. It does this:

<alias>
   <family>sans-serif</family>
   <prefer><family>Noto Sans</family></prefer>
</alias>

But wait! How can 30-infinality-aliases.conf override an alias that is ultimately included from 50-user.conf? Well, there are two ways in which one may prepend fonts with Fontconfig, and <prefer>ing is syntactic sugar for inserting before the matching <family>, not actually at the top of the list. 30-infinality-aliases.conf is loaded before my own configuration, so its "Noto Sans" is already sitting in front of "sans-serif" when my alias is applied; my families are then inserted directly before "sans-serif", which puts them after "Noto Sans". Consequently it wins.[4] I forked 30-infinality-aliases.conf and removed the problematic lines.
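The effect of "insert before the matching family" is easy to reproduce with a toy model. The sketch below is not Fontconfig: it treats the family list as one name per line and models <prefer> as "print the preferred families right before every line equal to the alias name". The function name prefer is made up for illustration, and binding strength and the other details of real matching are ignored.

```shell
# Toy model of <alias><prefer>: insert the preferred families directly
# before each line that equals the alias name. (Not real Fontconfig.)
prefer() {
    # $1: alias name, remaining arguments: preferred families in order
    name="$1"
    shift
    while IFS= read -r family; do
        if [ "$family" = "$name" ]; then
            printf '%s\n' "$@"
        fi
        printf '%s\n' "$family"
    done
}

# The earlier file (30-...) is applied first, the user alias second.
printf 'sans-serif\n' \
    | prefer sans-serif "Noto Sans" \
    | prefer sans-serif "Ubuntu" "Noto Sans CJK TC" "Noto Color Emoji" "Noto Sans"
```

In this model the resulting list starts with "Noto Sans", followed by "Ubuntu" and the rest, with "sans-serif" last. That matches the behavior described above: whichever alias is applied first ends up furthest from the generic name and therefore at the top.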

We can test the results with the -s flag of fc-match:

$ fc-match -s serif | head -4
RobotoSlab-Regular.ttf: "Roboto Slab" "Regular"
NotoSerifCJK-Regular.ttc: "Noto Serif CJK TC" "Regular"
NotoColorEmoji.ttf: "Noto Color Emoji" "Regular"
NotoSerif-Regular.ttf: "Noto Serif" "Regular"
$ fc-match -s sans-serif | head -4
Ubuntu-R.ttf: "Ubuntu" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans CJK TC" "Regular"
NotoColorEmoji.ttf: "Noto Color Emoji" "Regular"
NotoSans-Regular.ttf: "Noto Sans" "Regular"
$ fc-match -s monospace | head -3
UbuntuMono-R.ttf: "Ubuntu Mono" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans Mono CJK TC" "Regular"
NotoSansMono-Regular.ttf: "Noto Sans Mono" "Regular"

🙂
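For a quicker check after each configuration change, the output can be reduced to just the family names. This is a small sketch that parses the default fc-match output format shown above (file name, then quoted family, then quoted style); families is a hypothetical helper name, and the loop only runs if fc-match is available.

```shell
# Reduce default fc-match output to the family name,
# which is the first double-quoted field on each line.
families() {
    sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p'
}

# Print the top family for each generic name; skipped without fc-match.
if command -v fc-match >/dev/null 2>&1; then
    for generic in serif sans-serif monospace; do
        printf '%s: ' "$generic"
        fc-match -s "$generic" | families | sed -n 1p
    done
fi
```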

Sources

Here are most of the articles and other resources that I referenced, as well as some more that are relevant and interesting:

Fontconfig

Unicode

Other

Footnotes

  1. There may be another way: "Variation Selector format characters [...] are used to specify a specific glyph variant for a Unicode character, such as the Japanese, Chinese, Korean, or Taiwanese form of a particular CJK ideograph."
  2. Did I tell you I use Arch Linux?
  3. The semantics of Fontconfig's XML schema are documented in fonts-conf(5).
  4. I think 30-infinality-aliases.conf disregards the conventional naming scheme in doing so: "generic aliases" should appear in files with numbers 60 to 69 (see the various files section).