This talk lasts 三十分钟

Localisation is easy

Administrative Notes

• @pilif on twitter

• pilif on github

• working at Sensational AG

• warming up to shirts

Thanks Richard for the Recording

About that 💩

Maybe ES6…?

My host name is a horrible spoiler if you're into JRPGs. Disregard

however…

close enough.

Back to the topic at hand

Let’s talk terms

• Language is a language as it is spoken or written

• Locale is the name given to a set of parameters that define how things should be done for users speaking a certain language in a certain place

• There are many more locales than countries

Locale

• Locales consist of a language…

• … and a country

• … and sometimes specific variants

Specifying locales

• IETF BCP 47 document

• See RFC 5646 and RFC 4647

• Use language-script-region, optionally followed by variant subtags

• POSIX uses language_territory.encoding@modifier

fr-Latn-CH

fr-CH

fr_CH.utf-8
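
The three spellings above can be taken apart in code. This is a minimal sketch, assuming a runtime that ships `Intl.Locale` (current browsers and Node.js do). `posixToBcp47` is a hypothetical helper written for this example, not part of any library, and it does not handle special POSIX names such as `C`.

```javascript
// Split a BCP 47 tag into its subtags.
const full = new Intl.Locale("fr-Latn-CH");
console.log(full.language, full.script, full.region); // fr Latn CH

// The script subtag is optional and usually left out.
const short = new Intl.Locale("fr-CH");
console.log(short.language, short.script, short.region); // fr undefined CH

// Hypothetical helper: turn a POSIX locale name into a BCP 47 tag by
// dropping the encoding and modifier and swapping "_" for "-".
function posixToBcp47(posixName) {
  const withoutSuffix = posixName.split(/[.@]/)[0];
  return new Intl.Locale(withoutSuffix.replace(/_/g, "-")).toString();
}

console.log(posixToBcp47("fr_CH.utf-8")); // fr-CH
```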

The Locale affects many things

Number formatting

• Probably the most obvious of the bunch.

• Decimal separator

• Thousands separator

• Sign

• Also: Currency information

Some Samples

                      de-CH   de-DE   en-US
decimal separator     .       ,       .
thousands separator   '       .       ,
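
The separators in the samples above do not have to be hard-coded. As a sketch, assuming a runtime with `Intl.NumberFormat.prototype.formatToParts` and locale data for the three locales, they can be read from the formatter itself. `separatorsFor` is a name made up for this example. Depending on the CLDR version, de-CH may report the typographic apostrophe ’ instead of the ASCII one.

```javascript
// Read the separators a locale uses by formatting a sample number
// and inspecting the parts Intl reports.
function separatorsFor(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(12345.6);
  const valueOf = (type) => {
    const part = parts.find((p) => p.type === type);
    return part ? part.value : undefined;
  };
  return { decimal: valueOf("decimal"), group: valueOf("group") };
}

for (const locale of ["de-CH", "de-DE", "en-US"]) {
  console.log(locale, separatorsFor(locale));
}
```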

12,435

en-US   twelve thousand four hundred and thirty-five
de-DE   twelve comma four three five
de-CH   error
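
The three readings of 12,435 can be reproduced with a small parser. This is only a sketch: `parseLocalizedNumber` is a hypothetical helper, it learns the separators from `Intl.NumberFormat`, it is lenient about where grouping separators appear, and it ignores locale-specific digits and minus signs. A real application is better served by a library.

```javascript
// Hypothetical helper: parse a string the way the given locale writes numbers.
function parseLocalizedNumber(input, locale) {
  // Ask Intl which separators the locale uses.
  const parts = new Intl.NumberFormat(locale).formatToParts(12345.6);
  const valueOf = (type) => {
    const part = parts.find((p) => p.type === type);
    return part ? part.value : undefined;
  };
  const group = valueOf("group");
  const decimal = valueOf("decimal");

  // Drop grouping separators, then normalise the decimal separator to ".".
  let normalized = input.trim();
  if (group) normalized = normalized.split(group).join("");
  if (decimal) normalized = normalized.split(decimal).join(".");

  // Whatever is not a plain decimal number at this point is an error.
  return /^-?\d+(\.\d+)?$/.test(normalized) ? Number(normalized) : NaN;
}

console.log(parseLocalizedNumber("12,435", "en-US")); // 12435
console.log(parseLocalizedNumber("12,435", "de-DE")); // 12.435
console.log(parseLocalizedNumber("12,435", "de-CH")); // NaN
```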

Date Formatting

• Obviously names of months and weekdays

• Order of distinct parts

• Separator character

• Commonly used formats in different contexts

Date Formatting

• Libraries usually provide a generic short/medium/long format

• Libraries also provide templates

• If your library’s template language has any characters that are not for replacement, they are doing it wrong

• Apple does it right since OS X 10.11 and iOS 9

2015-07-18 17:47

        Long                                 Medium                     Short
en-US   July 18, 2015 at 4:58:00 PM CEST     Jul 18, 2015, 4:58:00 PM   7/18/15, 4:58 PM
fr-CA   18 juillet 2015 16:58:00 UTC+2       18 juil. 2015 16:58:00     15-07-18 16:58
fr-CH   18 juillet 2015 16:58:00 UTC+2       18 juil. 2015 16:58:00     18.07.15 16:58
fr-FR   18 juillet 2015 16:58:00 UTC+2       18 juil. 2015 16:58:00     18/07/2015 16:58
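
A comparison like the one above can be generated with the generic long/medium/short styles. This is a sketch that assumes a runtime supporting the `dateStyle` and `timeStyle` options of `Intl.DateTimeFormat`. The exact strings depend on the CLDR version the runtime ships, so they may differ in detail from the samples.

```javascript
// A fixed instant (16:58 in Zurich), so the output does not depend on when this runs.
const instant = new Date(Date.UTC(2015, 6, 18, 14, 58, 0));
const locales = ["en-US", "fr-CA", "fr-CH", "fr-FR"];
const styles = ["long", "medium", "short"];

for (const locale of locales) {
  for (const style of styles) {
    const formatter = new Intl.DateTimeFormat(locale, {
      dateStyle: style,
      timeStyle: style,
      timeZone: "Europe/Zurich",
    });
    console.log(locale, style, formatter.format(instant));
  }
}
```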

Choice of calendar

• Most of the world is using the Gregorian calendar

• The Julian calendar uses the same month names but is off by 13 days (they have July 5th right now)

• Other calendars use different month names

• Might affect holiday calculations
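
The 13-day offset can be checked with plain date arithmetic. This is a sketch with a hypothetical helper, `julianDateLabel`, and it assumes dates between 1901 and 2099, where the offset is constant. The second half assumes the runtime has data for the Hebrew calendar. `Intl` has no Julian calendar, so the offset is computed by hand.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
// Assumption: the date lies between 1901 and 2099, where the Julian
// calendar runs exactly 13 days behind the Gregorian one.
const JULIAN_LAG_DAYS = 13;

// Hypothetical helper: the Julian calendar date (YYYY-MM-DD) for a Gregorian date.
function julianDateLabel(gregorianDate) {
  const shifted = new Date(gregorianDate.getTime() - JULIAN_LAG_DAYS * MS_PER_DAY);
  return shifted.toISOString().slice(0, 10);
}

console.log(julianDateLabel(new Date(Date.UTC(2015, 6, 18)))); // 2015-07-05

// Other calendars can be requested through the "ca" locale extension.
const hebrew = new Intl.DateTimeFormat("en-US-u-ca-hebrew", {
  dateStyle: "long",
  timeZone: "UTC",
});
console.log(hebrew.resolvedOptions().calendar); // hebrew
console.log(hebrew.format(new Date(Date.UTC(2015, 6, 18))));
```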

Collation order

• How to compare two strings. Which one is first?

• Where to put the characters with pesky accents?

• How to deal with case differences?

• What about non-Latin scripts?

Collation fun*

• Phonebook German vs. ordinary German vs. Austrian German (dealing with umlauts)

• Contractions (Spanish ch counts as one letter, ch in Czech sorts after h, but c after b, etc.)

• Handling of accents is language-dependent

• Case insensitive is a mess
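
The umlaut cases above can be tried out with `Intl.Collator`. This is a sketch, assuming the runtime ships collation data for German (including the `phonebk` collation type) and Swedish. The sample names are made up for illustration.

```javascript
const names = ["Muller", "Müller", "Mueller"];

// Dictionary order: ü is treated as u with an accent.
const dictionary = new Intl.Collator("de");
// Phonebook order: ü is treated like "ue".
const phonebook = new Intl.Collator("de-u-co-phonebk");

console.log([...names].sort(dictionary.compare));
console.log([...names].sort(phonebook.compare));

// Same two strings, different answers: Swedish sorts ä after z.
const swedish = new Intl.Collator("sv");
console.log(dictionary.compare("ä", "z") < 0); // true
console.log(swedish.compare("ä", "z") > 0);    // true
```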

Case folding

• Some languages don’t differentiate between upper- and lowercase

• Inconsistent mapping between upper- and lowercase (ß => SS, the reverse is not always true)

• Uppercasing accented characters is language (and sometimes locale) dependent. French characters often lose accents when uppercasing

• Inconsistent uppercasing for some languages (uppercase Turkish i is İ. Lowercase Turkish I is ı)
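
Two of these cases are easy to see in a console. This sketch uses only standard string methods and assumes a runtime with locale-aware case mapping (any browser or Node.js build with `Intl` support).

```javascript
// ß has no single-letter uppercase in the default mapping, so the round trip is lossy.
console.log("ß".toUpperCase());               // SS
console.log("ß".toUpperCase().toLowerCase()); // ss

// Turkish has dotted and dotless i as separate letters.
console.log("i".toLocaleUpperCase("tr")); // İ
console.log("I".toLocaleLowerCase("tr")); // ı

// Without the locale, the default mapping gives the wrong letters for Turkish text.
console.log("i".toUpperCase()); // I
console.log("I".toLowerCase()); // i
```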

Double the fun

• Collation and Case-Folding provide an interesting team

• Depending on locale, upper- and lowercase should be sorted together or apart

• In some locales, case doesn’t matter at all when sorting

• In some locales, case always matters when sorting

• Depends on the use-case
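
`Intl.Collator` leaves both decisions to the caller. This is a sketch, assuming a runtime that honours the `caseFirst` and `sensitivity` options. The word list is only sample data.

```javascript
const words = ["swissjs", "Swissjs", "is", "loads", "of", "fun"];

// Case matters, and the caller decides which case goes first.
const upperFirst = new Intl.Collator("en", { caseFirst: "upper" });
const lowerFirst = new Intl.Collator("en", { caseFirst: "lower" });
// Case does not matter at all: "Swissjs" and "swissjs" compare as equal.
const ignoreCase = new Intl.Collator("en", { sensitivity: "accent" });

console.log([...words].sort(upperFirst.compare)); // ... "Swissjs", "swissjs"
console.log([...words].sort(lowerFirst.compare)); // ... "swissjs", "Swissjs"
console.log(ignoreCase.compare("Swissjs", "swissjs")); // 0
```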

Collation strength

• ICU created the concept of “collation strength”

• strength 1 is the most lenient

• strength 5 is the most exact

• Example: Strength 1 ignores accents, unless the language treats the accented letter as a letter of its own, as Danish does
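
In JavaScript the closest knob is the `sensitivity` option of `Intl.Collator`. This is a sketch, assuming collation data for German and Danish is available: `"base"` is the most lenient setting (base letters only), and `"accent"` additionally tells accents apart but still ignores case.

```javascript
// Most lenient: only base letters count, accents and case are ignored.
const germanBase = new Intl.Collator("de", { sensitivity: "base" });
console.log(germanBase.compare("a", "ä")); // 0
console.log(germanBase.compare("a", "A")); // 0

// In Danish, å is a letter of its own, so it never compares equal to a.
const danishBase = new Intl.Collator("da", { sensitivity: "base" });
console.log(danishBase.compare("a", "å") !== 0); // true

// One step stricter: accents are told apart, case is still ignored.
const germanAccent = new Intl.Collator("de", { sensitivity: "accent" });
console.log(germanAccent.compare("a", "ä") !== 0); // true
console.log(germanAccent.compare("a", "A"));       // 0
```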

RTL

‘nough said

Perspectives matter

Context matters

• “This slide lasts one minute”


• “This talk lasts 30 minutes”

• “Lunch lasted 1:30 hours”

• “Tomorrow I’ll sleep in”

• “August, 1th is a national holiday”
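
Singular versus plural and ordinal suffixes are both a matter of plural categories. This is a sketch for English only, assuming a runtime with `Intl.PluralRules`. The word and suffix tables, and the helpers `lasts` and `nth`, are written for this example, and other languages need their own tables with different categories.

```javascript
// Cardinal plural categories pick the right noun form.
const cardinal = new Intl.PluralRules("en-US");
const minuteForms = { one: "minute", other: "minutes" };

function lasts(subject, minutes) {
  return `${subject} lasts ${minutes} ${minuteForms[cardinal.select(minutes)]}`;
}

// Ordinal plural categories pick the right suffix.
const ordinal = new Intl.PluralRules("en-US", { type: "ordinal" });
const suffixes = { one: "st", two: "nd", few: "rd", other: "th" };

function nth(day) {
  return `${day}${suffixes[ordinal.select(day)]}`;
}

console.log(lasts("This slide", 1)); // This slide lasts 1 minute
console.log(lasts("This talk", 30)); // This talk lasts 30 minutes
console.log(`August ${nth(1)} is a national holiday`); // August 1st is a national holiday
```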

Let’s get practical

Locale handling is like escaping

• Always store raw unformatted data

• Format near the end of the chain

• Just before you escape

• Parse user input as early as possible

• Use native data types
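
The rules above can be sketched as a tiny render function. The record, `escapeHtml`, and `renderOrderRow` are all made up for this example. The point is only the order of operations: raw values in native types are stored, formatting happens at the very end, and escaping comes right after it.

```javascript
// Stored form: raw values in native types, no formatting applied.
const order = {
  customer: "Müller & Söhne",
  amount: 1234.5,
  placedAt: new Date(Date.UTC(2015, 6, 18, 14, 58)),
};

function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Formatting happens last, right before escaping for the output medium.
function renderOrderRow(record, locale) {
  const amount = new Intl.NumberFormat(locale).format(record.amount);
  const date = new Intl.DateTimeFormat(locale, {
    dateStyle: "medium",
    timeZone: "UTC",
  }).format(record.placedAt);
  return [record.customer, amount, date]
    .map((cell) => `<td>${escapeHtml(cell)}</td>`)
    .join("");
}

console.log(renderOrderRow(order, "en-US"));
console.log(renderOrderRow(order, "de-DE"));
```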

UI Language is not locale

• Users might prefer to use the OS in a different language than what’s inferred by their locale

• Just because I’m in de_CH it doesn’t mean I want your software to speak German to me

• UI language is completely different from the user’s locale

Avoid this mess

Mixing Locales

• Forming sentences in UI language with locale formatted data is… challenging

• Be mindful that language might influence some locale formatting.

• “This talk lasts 三十分钟”

• or rather “This talk lasts 30 minutes”

• It depends. Does the locale also use hours and minutes?
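
The mismatch is easy to produce in code. This is a sketch with a hypothetical helper, `talkLength`, assuming a runtime that supports `style: "unit"` in `Intl.NumberFormat` and ships Chinese locale data. The sentence stays in the UI language while the duration follows whichever locale is passed in.

```javascript
// Hypothetical helper: the sentence is in the UI language (English here),
// the duration is formatted for the locale that is passed in.
function talkLength(minutes, formatLocale) {
  const duration = new Intl.NumberFormat(formatLocale, {
    style: "unit",
    unit: "minute",
    unitDisplay: "long",
  }).format(minutes);
  return `This talk lasts ${duration}`;
}

console.log(talkLength(30, "zh-CN")); // English sentence, Chinese duration
console.log(talkLength(30, "en-US")); // This talk lasts 30 minutes
```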

Never be helpful* and translate units

1 kg in de_CH is not 1 lb in en_US
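
If a unit has to change at all, the value has to change with it. This is a sketch with a hypothetical helper, `formatWeight`, assuming `style: "unit"` support in `Intl.NumberFormat`. Which unit a user wants is passed in explicitly here, because it cannot be safely guessed from the locale.

```javascript
// Exact by definition: one pound is 0.45359237 kilograms.
const KILOGRAMS_PER_POUND = 0.45359237;

// Hypothetical helper: convert the value first, then format it with its unit.
function formatWeight(kilograms, locale, unit) {
  const value = unit === "pound" ? kilograms / KILOGRAMS_PER_POUND : kilograms;
  return new Intl.NumberFormat(locale, {
    style: "unit",
    unit,
    maximumFractionDigits: 1,
  }).format(value);
}

console.log(formatWeight(1, "de-CH", "kilogram")); // 1 kg
console.log(formatWeight(1, "en-US", "pound"));    // 2.2 lb
```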

Btw: Apple’s APIs are really good at this

What about web sites?

• Never, ever infer UI language by IP Geolocation.

• Ever. Ever. EVER.

• Promise!

• You may infer Locale from IP Geolocation though

People from Google: This slide is for you!

Rely on HTTP

• Trust Accept-Language: by now browsers set it correctly

• Use the header to determine UI language

• Use the header to determine default locale

• But ask the user

• Same goes for time zones
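
Picking a default from the header takes only a few lines. This is a sketch, not a complete RFC 4647 matcher: `parseAcceptLanguage` and `pickLanguage` are hypothetical helpers, wildcards are simply skipped, and a preference the user has chosen explicitly should always win over the result.

```javascript
// Parse "de-CH,de;q=0.9,en;q=0.8" into tags ordered by preference.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((item) => {
      const [tag, ...params] = item.split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith("q="));
      const quality = qParam ? Number(qParam.slice(2)) : 1;
      return { tag, quality };
    })
    .filter((entry) => entry.tag !== "" && entry.quality > 0)
    .sort((a, b) => b.quality - a.quality);
}

// Return the first supported language the browser asked for, else the fallback.
function pickLanguage(header, supported, fallback) {
  for (const { tag } of parseAcceptLanguage(header || "")) {
    const wanted = tag.toLowerCase();
    const exact = supported.find((s) => s.toLowerCase() === wanted);
    if (exact) return exact;
    // No exact match: fall back from "de-CH" to plain "de".
    const language = wanted.split("-")[0];
    const sameLanguage = supported.find((s) => s.toLowerCase() === language);
    if (sameLanguage) return sameLanguage;
  }
  return fallback;
}

console.log(pickLanguage("de-CH,de;q=0.9,en;q=0.8", ["en", "de"], "en")); // de
console.log(pickLanguage("fr;q=0.5, en-US;q=0.9", ["en", "fr"], "fr"));   // en
```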

SHOW ME SOME CODE ALREADY!!!

The past

• There has always been date formatting (Date.toLocaleString). Mostly useless

• People were self-nebling (search YouTube for “ich neble selber”) for example in date pickers and libraries

• hint: applying substr() to Date.toDateString() is not a correct solution.

• same goes for using replace('.', ',') on a number
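
For illustration, here is what the two hacks actually produce next to a locale-aware formatter. The sample values are made up, and the last line assumes a runtime with German locale data.

```javascript
const amount = 1234567.891;

// The substr() hack only slices an English string; it never localises anything.
console.log(new Date(2015, 6, 18).toDateString().substr(4)); // Jul 18 2015

// The replace() hack swaps the decimal point but never adds grouping separators.
console.log(String(amount).replace(".", ",")); // 1234567,891

// What German readers actually expect.
console.log(new Intl.NumberFormat("de-DE").format(amount)); // 1.234.567,891
```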

The present

• Microsoft has donated a huge chunk of localisation code to the jQuery project.

• It’s not integrated into jQuery, but maintained by the jQuery project

• Check out https://github.com/jquery/globalize

• Doesn’t support collation

• The library is big

• But most of it is data and this problem can only be solved with a huge database of special cases

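// Assumes Globalize 1.x with the CLDR data for fr-CH already loaded via Globalize.load()
Globalize.locale("fr-CH");

// Date and time in the locale's predefined "medium" format
console.log(Globalize.formatDate(new Date(), { datetime: "medium" }));
// Skeleton: year plus full month name, arranged the way the locale expects
console.log(Globalize.formatDate(new Date(), { skeleton: "yMMMM" }));
// Number, currency amount, and relative time ("35 seconds ago") in the locale's conventions
console.log(Globalize.formatNumber(12345.6789));
console.log(Globalize.formatCurrency(1956.3334, "EUR"));
console.log(Globalize.formatRelativeTime(-35, "second"));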

The future

• ECMA-402 from 2012

• Yes. Specs from 2012 are “the future” in JS land

• Provides the global Intl object

• Date, Number formatting and Collation

• see: http://www.ecma-international.org/ecma-402/1.0/

Could be worse

Node.js is still bikeshedding because of ICU

// Long date with weekday, formatted for de-CH
var f = new Intl.DateTimeFormat('de-CH', {
  weekday: 'long',
  year: 'numeric',
  month: 'long',
  day: 'numeric'
});
console.log(f.format(new Date()));

// Plain number with at least two fraction digits
var n = new Intl.NumberFormat('de-CH', {
  style: "decimal",
  minimumFractionDigits: 2
});
console.log(n.format(1234.5));

// Currency amount in euros
var currency = new Intl.NumberFormat('de-CH', {
  style: "currency",
  currency: 'EUR'
});
console.log(currency.format(1234.5));

// Locale-aware sorting: sort() needs the collator's compare function, not the collator itself
var comp = new Intl.Collator('de-CH');
var words = [ "Swissjs", "swissjs", "is", "loads", "of", "fun" ];
console.log(words.sort(comp.compare));

Conclusion

• Proper localisation is part of our job to make the web useful for everybody

• Use the libraries provided

• Whenever you think you know better than the library: No. You don’t.

• Remember that UI language and Locale are not always connected

• Don’t do IP geolocation for language choice

• When in doubt: Ask the user. She’ll know for sure.

Before I leave

"👨‍❤️‍💋‍👨".length

[..."👨‍❤️‍💋‍👨"].length

In case you answered 11 and 8, I salute you

Thanks everyone and enjoy your evening

• U+1F468 (MAN) 👨

• U+200D (ZERO WIDTH JOINER)

• U+2764 (HEAVY BLACK HEART) ❤

• U+FE0F (VARIATION SELECTOR-16)

• U+200D (ZERO WIDTH JOINER)

• U+1F48B (KISS MARK) 💋

• U+200D (ZERO WIDTH JOINER)

• U+1F468 (MAN) 👨
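
The eight code points listed above can be checked directly. This sketch spells the string out with escapes so that no invisible character gets lost in copy and paste. The last part assumes a runtime with `Intl.Segmenter` and is skipped where it is missing.

```javascript
// MAN, ZWJ, HEAVY BLACK HEART, VARIATION SELECTOR-16, ZWJ, KISS MARK, ZWJ, MAN
const kiss = "\u{1F468}\u200D\u2764\uFE0F\u200D\u{1F48B}\u200D\u{1F468}";

console.log(kiss.length);      // 11 UTF-16 code units
console.log([...kiss].length); // 8 code points

// Intl.Segmenter, where available, counts what a reader perceives as one character.
if (typeof Intl.Segmenter === "function") {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  console.log([...segmenter.segment(kiss)].length); // 1
}
```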