Add expression operators for locale matching (system languages) #6197

Open
1ec5 opened this issue Feb 20, 2018 · 9 comments
Labels
cross-platform 📺 Requires coordination with Mapbox GL Native (style specification, rendering tests, etc.) feature 🍏

Comments

@1ec5
Contributor

1ec5 commented Feb 20, 2018

There should be a simple way for the style author to specify that a text-field should be set to the name_* feature property that best fits the system’s preferred languages. Secondarily, it would be great if the most appropriate locale could be used on its own in expressions.

Motivation

Localizing a style’s labels currently entails iterating over all the layers, manually replacing references to name_* feature properties within each text-field value. If these values are expressions, replacing the references can be an involved, recursive step. The iOS and macOS map SDKs have a built-in option, MGLStyle.localizesLabels, that applies these changes automatically based on the system language and region preferences. There’s a plugin for GL JS and a forthcoming plugin for the Android map SDK (mapbox/mapbox-plugins-android#74) that do likewise.
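For illustration, the manual approach described above might look roughly like the following when applied to a plain style JSON object. This is only a sketch: `localizeValue` and `localizeStyle` are hypothetical names, not functions from any Mapbox SDK or plugin, and the code assumes name fields follow the `name` / `name_<code>` pattern.

```javascript
// Hypothetical helper: rewrites every name reference inside a text-field value.
// Handles token strings, expression arrays, and nested objects (e.g. function stops).
function localizeValue(value, language) {
  if (typeof value === "string") {
    // Token syntax: "{name}" or "{name_en}" becomes "{name_<language>}".
    return value.replace(/\{name(_[A-Za-z-]+)?\}/g, `{name_${language}}`);
  }
  if (Array.isArray(value)) {
    // Expression syntax: ["get", "name_en"] becomes ["get", "name_<language>"].
    if (value[0] === "get" && typeof value[1] === "string" &&
        /^name(_[A-Za-z-]+)?$/.test(value[1])) {
      return ["get", `name_${language}`, ...value.slice(2)];
    }
    // Otherwise recurse into the sub-expressions.
    return value.map((item) => localizeValue(item, language));
  }
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, localizeValue(item, language)])
    );
  }
  return value;
}

// Hypothetical helper: returns a copy of the style with every text-field localized.
function localizeStyle(style, language) {
  return {
    ...style,
    layers: style.layers.map((layer) => {
      const textField = layer.layout && layer.layout["text-field"];
      if (textField === undefined) return layer; // not a labeled layer
      return {
        ...layer,
        layout: { ...layer.layout, "text-field": localizeValue(textField, language) },
      };
    }),
  };
}
```

Note that this sketch replaces every name reference indiscriminately, so it has no way to preserve a deliberate mix of fields in a single label.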

While this approach is effective, it operates at such a high level that the localizing code doesn’t have a good way to reason about the style author’s intentions. Should {name} ({name_en}) be replaced by {name_es} ({name_en}) or just {name}? The style’s author has no opportunity to react to changes that could radically alter the style’s appearance, for instance by increasing the font size when the system language is Chinese. Moreover, the localization feature implicitly opts the map into runtime styling–specific behaviors like disabling automatic style refreshes.

Design

The style specification would be extended with two expression operators:

  • user-locales takes no arguments and evaluates to an array of locale identifiers corresponding to the user’s preferences.
  • match-locales has the signature ["match-locales", inputLocales, availableLocales]. It walks inputLocales (e.g., the result of user-locales) in order, finds the first item that matches any entry in availableLocales (an unordered array of locale identifiers), and evaluates to that matching entry of availableLocales.

For the purposes of these operators, a locale identifier could include a language code, script code, or region code, or some combination thereof. I would be in favor of specifying BCP 47 as the locale identifier standard to follow.

In typical usage, a style author would opt into localization by setting text-field to a value such as:

[
  "let",
  "streets-languages", ["ar", "de", "en", "es", "fr", "pt", "ru", "zh", "zh-Hans"],
  [
    "coalesce",
    ["concat", "name_", ["match-locales", ["user-locales"], ["var", "streets-languages"]]],
    "name"
  ]
]

Meanwhile, ["at", 0, ["user-locales"]] could be used on its own as part of a number formatting operator (#4119) and a case- and diacritic-folding string comparison operator (#4136).

Design alternatives

It’s unfortunate that streets-languages would have to be hard-coded and duplicated on every symbol layer. However, I don’t see a good way around that unless the vector tile source formally declares its language-specific name fields (perhaps via mapbox/tilejson-spec#14) or we encapsulate that array in a third expression operator, mapbox-streets-languages.

It might be tempting to rely on match as an alternative to match-locales; however, locale identifier matching rules are rather complicated. For example, for the set of languages supported by the Streets source, en-US should resolve to en, zh-TW should resolve to zh, and zh-Hans-TW should resolve to zh-Hans.
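To make the three examples above concrete, here is a minimal sketch of one possible matching rule: try the full identifier, then drop trailing subtags until something in the available set matches. It is loosely modeled on the "Lookup" scheme in RFC 4647 but is simplified (no special handling of extension or private-use subtags, no likely-subtag inference), and `matchLocales` is a hypothetical stand-in for the proposed operator, not what locale-utils or NSBundle actually does.

```javascript
// Hypothetical stand-in for ["match-locales", inputLocales, availableLocales].
function matchLocales(inputLocales, availableLocales) {
  // BCP 47 tags are case-insensitive, so index the available tags in lowercase
  // while remembering their original spelling.
  const available = new Map(availableLocales.map((tag) => [tag.toLowerCase(), tag]));
  for (const input of inputLocales) {
    const subtags = input.toLowerCase().split("-");
    // Try the most specific form first, then progressively less specific ones.
    while (subtags.length > 0) {
      const candidate = available.get(subtags.join("-"));
      if (candidate !== undefined) return candidate;
      subtags.pop();
    }
  }
  return null; // nothing matched; the caller falls back (e.g. via "coalesce")
}

const streetsLanguages = ["ar", "de", "en", "es", "fr", "pt", "ru", "zh", "zh-Hans"];
console.log(matchLocales(["en-US"], streetsLanguages));      // "en"
console.log(matchLocales(["zh-TW"], streetsLanguages));      // "zh"
console.log(matchLocales(["zh-Hans-TW"], streetsLanguages)); // "zh-Hans"
```

A plain match on the raw identifier would miss all three of these cases, which is the reason for a dedicated operator.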

Implementation

  • In GL JS, user-locales would be implemented by returning navigator.languages. iOS/macOS would use +[NSLocale preferredLanguages].
  • match-locales could be implemented in GL JS using locale-utils. iOS/macOS would use +[NSBundle preferredLocalizationsFromArray:forPreferences:].
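As a rough sketch of the GL JS side of user-locales, the function below reads navigator.languages and falls back to navigator.language or an empty array when that is unavailable. The name `userLocales` and the fallback behavior are assumptions made for illustration; the proposal itself only says the operator would return navigator.languages.

```javascript
// Hypothetical implementation of the proposed "user-locales" operator for GL JS.
function userLocales() {
  // typeof guards against environments where `navigator` is not defined at all.
  if (typeof navigator !== "undefined") {
    if (Array.isArray(navigator.languages) && navigator.languages.length > 0) {
      // Copy so callers cannot be affected by the (frozen) platform array.
      return [...navigator.languages];
    }
    if (typeof navigator.language === "string" && navigator.language !== "") {
      return [navigator.language];
    }
  }
  return []; // assumption: no preference information means an empty list
}
```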

/ref mapbox/mapbox-gl-native#10713 (comment)
/cc @mapbox/gl-core @fabian-guerra @tobrun @langsmith @nickidlugash @bsudekum

@1ec5 1ec5 added feature 🍏 cross-platform 📺 Requires coordination with Mapbox GL Native (style specification, rendering tests, etc.) labels Feb 20, 2018
@langsmith

langsmith commented Feb 21, 2018

fyi @cammace ☝️

@ChrisLoer
Contributor

I'm toying with the idea of implementing this. I think the design makes sense, although the symbol-layer-verbosity problem is annoying.

As I mentioned in #6270 (comment), I wonder if BCP 47 gives us more information than we want/need. If we restrict locale specifications to ISO 639-1 codes, we probably don't even need locale-utils (saving code size, but more importantly semi-hidden complexity), and we have a simpler input to platform-specific APIs that may not speak BCP 47. On the other hand, we'd give up being able to choose number formatters based on country...

@1ec5
Contributor Author

1ec5 commented Mar 24, 2018

There’s already a need for more than ISO 639-1: many languages only have ISO 639-2 codes, not ISO 639-1 codes, and a few major languages like Chinese often need to be qualified by an ISO 15924 script code or ISO 3166 country code, such as for label localization. For example, the Mapbox Streets source distinguishes between zh and zh-Hans, leaving open the possibility of distinguishing zh-Hant in the future.

@ChrisLoer
Contributor

@1ec5 🤔 How about two arguments, language + (optional) region:

This would not support script customization (e.g. Hans vs Hant and wow I just realized the s in hans was for "simplified"), or the variant options in BCP 47. Again, the motivation is maximum cross-platform compatibility:

  • ICU Locale: Takes 639 language tag and 3166 region tag. Optional "variant" argument defined as "vendor and browser-specific". Supports script lookup, but not setting script for a locale.
  • iOS/macOS locale IDs: 639 language tag + 3166 region tag + 15924 script tag.
  • Android Locale: Basically just uses BCP 47.
  • Qt Locale: Basically uses BCP 47, but it looks like we'll probably have to use ICU directly on Qt so it probably doesn't matter.

@1ec5
Contributor Author

1ec5 commented Mar 26, 2018

Separating the language and region into two arguments gives us less flexibility to support more locale information (such as script codes) in the future. I think it would be more forward-compatible if each locale-aware operator accepts a single locale code argument; each operator would decide for itself how specific a code it would honor. For example, locale matching needs to respect script differences, but perhaps string comparison does not.
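One way to picture this: every operator receives the same single locale code and keeps only the parts it cares about. The sketch below uses the standard Intl.Locale class to parse the code; the three function names, and the choice of which parts each operator keeps, are hypothetical and only meant to illustrate the idea.

```javascript
// Label matching: keep language, script, and region, but drop any -u- extensions.
function localeForMatching(tag) {
  return new Intl.Locale(tag).baseName;
}

// String comparison: assume only the language matters.
function localeForComparison(tag) {
  return new Intl.Locale(tag).language;
}

// Number formatting: assume language plus region, ignoring the script.
function localeForNumberFormat(tag) {
  const { language, region } = new Intl.Locale(tag);
  return region ? `${language}-${region}` : language;
}
```

Under these assumptions, a style could pass "zh-Hans-TW" everywhere, and a future operator that needs the script code would not require a signature change.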

@1ec5
Contributor Author

1ec5 commented Aug 28, 2018

> It’s unfortunate that streets-languages would have to be hard-coded and duplicated on every symbol layer. However, I don’t see a good way around that unless the vector tile source formally declares its language-specific name fields (perhaps via mapbox/tilejson-spec#14) or we encapsulate that array in a third expression operator, mapbox-streets-languages.

As of mapbox/tilejson-spec#42, TileJSON 3.0 will formally declare a vector_layers property that enumerates the layers and their fields. While the specification doesn’t provide a way to explicitly state the language of each field, I think it would be fine to assume name_* fields are of the form name_{ISO 639}, which would be no less robust than hard-coding language fields in the style or SDK.
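A minimal sketch of that assumption: collect every field in vector_layers whose name starts with `name_` and treat the remainder as a language code. The function name `languagesFromTileJSON` is hypothetical, and the heuristic would also pick up any non-language field that happens to share the prefix, so a real implementation would likely need to validate the suffixes.

```javascript
// Hypothetical helper: derives the available label languages from a TileJSON document.
function languagesFromTileJSON(tileJSON, prefix = "name_") {
  const languages = new Set();
  for (const layer of tileJSON.vector_layers || []) {
    // `fields` maps each field name to a description string.
    for (const field of Object.keys(layer.fields || {})) {
      if (field.startsWith(prefix) && field.length > prefix.length) {
        languages.add(field.slice(prefix.length));
      }
    }
  }
  return [...languages].sort();
}

// Made-up sample data, not the real Streets source metadata.
const sampleTileJSON = {
  tilejson: "3.0.0",
  vector_layers: [
    { id: "place_label", fields: { name: "String", name_en: "String", name_de: "String" } },
    { id: "road_label", fields: { name: "String", name_en: "String", "name_zh-Hans": "String" } },
  ],
};
console.log(languagesFromTileJSON(sampleTileJSON)); // ["de", "en", "zh-Hans"]
```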

@andrewharvey
Collaborator

While I'm overall very positive about this change, it should still support user overrides to the locale. E.g., my browser might be set to English, but I want to build in a button on my site that will swap the map to German, regardless of my browser setting.

@1ec5
Contributor Author

1ec5 commented Sep 7, 2018

> My browser might be set to English, but I want to build in a button on my site that will swap the map to German, regardless of my browser setting.

That could be implemented via an API such as setLabelLanguage().
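A sketch of how such an override could interact with the system preferences follows. Everything here is hypothetical: setLabelLanguage() is only a suggested name in this thread, the class `LabelLanguagePreference` is invented for illustration, and prepending the override to the system locales (rather than replacing them) is one possible design choice.

```javascript
// Hypothetical holder for the locale list that "user-locales" would evaluate to.
class LabelLanguagePreference {
  constructor(systemLocales) {
    this.systemLocales = [...systemLocales];
    this.override = null;
  }

  // Pass a locale code to force a label language, or null to clear the override.
  setLabelLanguage(locale) {
    this.override = locale || null;
  }

  // The override wins, but system preferences remain as fallbacks.
  effectiveLocales() {
    return this.override
      ? [this.override, ...this.systemLocales]
      : [...this.systemLocales];
  }
}
```

With this shape, a "switch to German" button would call setLabelLanguage("de"), and styles built on the proposed operators would pick up the change without any per-layer edits.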

@1ec5
Contributor Author

1ec5 commented Mar 2, 2020

> The style’s author has no opportunity to react to changes that could radically alter the style’s appearance, for instance by increasing the font size when the system language is Chinese.

> It’s unfortunate that streets-languages would have to be hard-coded and duplicated on every symbol layer. However, I don’t see a good way around that unless the vector tile source formally declares its language-specific name fields (perhaps via mapbox/tilejson-spec#14) or we encapsulate that array in a third expression operator, mapbox-streets-languages.

Per mapbox/mapbox-gl-native#15659 and mapbox/mapbox-gl-native#14470 (comment), knowing the language contained in each layer of the Streets source would allow GL to choose the appropriate font for a given character without forcing the developer to specify font overrides. The locale matching proposed here would help to associate that information with the fonts specified in the stylesheet.

4 participants