Introduction to Unicode & i18n in Rust

Behnam Esfahbod Software Engineer

Abstract

The Rust Programming Language has native support for Unicode characters (Unicode scalar values, to be exact). The language provides a fast and compact string type with low-level control over memory consumption, while providing a high-level API and enforcing memory and data safety at compile time. The Rust Standard Library covers basic Unicode functionality, and third-party libraries, called crates, are responsible for the rest. UNIC (Unicode and Internationalization Crates for Rust) is a project to develop a collection of crates for Unicode and internationalization data and algorithms, and tools to build them, designed to have reusable modules and an easy-to-use and efficient API.

In this talk we will cover the basics of Rust's API for characters and strings, and look under the hood at how they are implemented in the compiler and the standard library. Afterwards, we look at UNIC's design model, how it implements various features, and lessons learned from building sharable organic micro components.

The talk is suitable for anyone new to Unicode, or Unicode experts who would like to learn how things are done in the Rust world.

42nd Internationalization & Unicode Conference

September 2018

Santa Clara, CA, USA


Looking for L10n in Rust?

Happening NOW on Track 3!

Fluent 1.0: Next Generation Localization System from Mozilla, by Zibi Braniecki

Localization systems have been largely stagnant over the last 20 years. The last major innovation, ICU MessageFormat, was designed before Unicode 3.0, targeting C++ and Java environments. Several attempts have been made since then to fit the API into modern programming environments, with mixed results.

Fluent is a modern localization system designed over the last 7 years by Mozilla. It builds on top of MessageFormat, ICU and CLDR, bringing integration with modern ICU features, bidirectionality, a user-friendly file format, and bindings for modern programming environments like JavaScript, DOM, React, Rust, Python and others. The system comes with a full localization workflow cycle, command line tools and a CAT tool.

With the release of 1.0 we are ready to offer the new system to the wider community and propose it for standardization.

About me

• Software Engineer @ Quora, Inc.

• Co-Chair of Arabic Layout Task Force @ W3C i18n Activity

• Virgule Typeworks

• Facebook, Inc.

• IRNIC Domain Registry

• Sharif FarsiWeb, Inc.

This talk

• Quick Intro to Rust

• Characters & Strings

• It Gets Complicated!

• On Top of the Language


Quick Intro to Rust

Image credit: XKCD, GOTO [CC BY-NC 2.5]


History

• 2006: Started as a personal project of Graydon Hoare

- Original compiler written in OCaml

• 2009: Mozilla began sponsoring the project

• 2011: Self-hosting compiler, using LLVM as backend

• Pre-2015: Many design changes

- Dropped garbage collection

- Moved memory allocation out of the compiler

• 2015: Rust 1.0, the first stable release

• 2018: First major new edition, Rust 2018


Build System & Tooling

• Cargo

- Package manager

- Resolve dependencies

- Compile

- Build package and upload to crates.io

• Common tooling

- Rustup

- Rustfmt

- Clippy

- Bindgen


Systems Language

• Abstraction without overhead (zero-cost abstractions, ZCA)

- & without hidden costs

• Compiles to machine code

- & runs on microprocessors (no OS/malloc)

• Full control of memory usage

- Even where there’s no memory allocation

• Compiles to WebAssembly

- & runs in your favorite browser


Typed Language


• Statically typed

- All types are known at compile-time

- Generics for data types and code blocks

• Strongly typed

- Harder to write incorrect programs

- No runtime null-pointer failures

Syntax

Similar to C/C++ & Java

Type System

• Algebraic types

- A first among systems programming languages

- Tuples, structs, enums, & unions

- Pattern matching (match) for selection and destructuring

• Some basic types

- Option enum type: Some(value), or None

- Result enum type: Ok(value), or Err(error)

Option (example)
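The code shown on this slide did not survive extraction. As a stand-in, here is a minimal sketch of Option in use; the helper names `first_char` and `describe_first_char` are hypothetical and not taken from the talk.

```rust
/// Returns the first character of `text`, or `None` when the string is empty.
/// (Hypothetical helper, for illustration only.)
fn first_char(text: &str) -> Option<char> {
    text.chars().next()
}

/// `match` forces both the `Some` and the `None` case to be handled,
/// so there is no way to "forget the null check".
fn describe_first_char(text: &str) -> String {
    match first_char(text) {
        Some(ch) => format!("starts with {:?}", ch),
        None => String::from("empty string"),
    }
}
```

Calling `describe_first_char("hi")` yields a description of 'h', while an empty string takes the `None` arm.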

Result (definition)
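The slide's code is missing from this transcript. The standard library's Result is a two-variant generic enum; the sketch below mirrors that shape inside a module named `sketch` (a hypothetical name, used so it does not clash with the real type). The `checked_div` function is also only an illustration.

```rust
mod sketch {
    /// Same shape as `std::result::Result`: either a value or an error.
    #[derive(Debug, PartialEq)]
    pub enum Result<T, E> {
        Ok(T),
        Err(E),
    }

    /// Hypothetical function that uses the enum defined above.
    pub fn checked_div(a: u32, b: u32) -> Result<u32, String> {
        if b == 0 {
            Result::Err(String::from("division by zero"))
        } else {
            Result::Ok(a / b)
        }
    }
}
```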

Result (example)
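The original example is not recoverable from the transcript. A possible sketch, assuming a hypothetical parser for code point notation such as "U+0633": errors travel through Result and are propagated with the `?` operator.

```rust
use std::num::ParseIntError;

/// Parses a hexadecimal code point such as "0633" or "U+0633".
/// (Hypothetical helper, for illustration only.)
fn parse_code_point(text: &str) -> Result<u32, ParseIntError> {
    let digits = text.trim_start_matches("U+");
    u32::from_str_radix(digits, 16)
}

/// Converts the parsed number into a `char` if it is a Unicode scalar value.
fn parse_char(text: &str) -> Result<Option<char>, ParseIntError> {
    let code_point = parse_code_point(text)?; // `?` returns early on Err
    Ok(char::from_u32(code_point))
}
```

Here a malformed number is an `Err`, while a well-formed number that is not a scalar value (a surrogate, for instance) is `Ok(None)`.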


Memory Management & Safety

• No garbage collection

- Strict memory management

• Ownership

- Each piece of memory is owned by exactly one variable

- Memory is released when its owner goes out of scope

• Borrow checker

- Data-race free

- Similar to type checker

- Either any number of read-only references or exactly one read-write reference

• Lifetimes

- ≈ Position in the stack that owns the heap allocation
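The ownership and borrowing rules above can be sketched in a few lines. All function names here are hypothetical; the commented-out line shows the kind of use the compiler rejects.

```rust
/// Takes ownership of `text`; its buffer is freed when this function returns.
fn consume(text: String) -> usize {
    text.len()
}

/// Borrows read-only; any number of such borrows may coexist.
fn byte_len(text: &str) -> usize {
    text.len()
}

/// Borrows read-write; only one such borrow may exist at a time.
fn append_exclamation(text: &mut String) {
    text.push('!');
}

fn ownership_demo() -> (usize, usize) {
    let mut greeting = String::from("Hello");
    let before = byte_len(&greeting); // shared borrow ends after this call
    append_exclamation(&mut greeting); // exclusive borrow ends after this call
    let after = consume(greeting); // ownership moves into `consume`
    // `greeting` is gone now; the next line would not compile
    // ("borrow of moved value"):
    // println!("{}", greeting);
    (before, after)
}
```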


Interfaces & Implementations

• Traits

- Define behavior (can’t own data)

- Inheritance

- Deref

• Impl blocks

- Implement types and traits (can’t own data)

- Composition

• Code blocks

- Functions, methods and closures

• Macros

- assert!(), format!(), print!(), println!()
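A minimal sketch of how traits, impl blocks, generic functions and a macro fit together. The `Describe` trait and the `Script` type are hypothetical names chosen for illustration.

```rust
/// A trait defines behavior only; it has no fields.
trait Describe {
    fn name(&self) -> String;

    /// Default method, shared by every implementor.
    fn describe(&self) -> String {
        format!("<{}>", self.name())
    }
}

/// Data lives in the type, not in the trait or the impl block.
struct Script {
    code: &'static str,
}

/// Inherent impl block: methods of the type itself.
impl Script {
    fn new(code: &'static str) -> Self {
        Script { code }
    }
}

/// Trait impl block: the type opts in to the behavior.
impl Describe for Script {
    fn name(&self) -> String {
        self.code.to_uppercase()
    }
}

/// A generic function bounded by the trait; the closure does the mapping.
fn describe_all<T: Describe>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}
```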


Characters & Strings


Numeric Types


• Signed & unsigned integer types

• Floating-point types

- f32, f64

Character Type

Unicode scalar values

• As defined by The Unicode Standard

- “Any Unicode code point, except high-surrogate and low-surrogate code points.”

- U+0000 to U+D7FF (inclusive)

- U+E000 to U+10FFFF (inclusive)

- Total of 1,112,064 code points

Limited integer type

• No numerical operations on the char type

- What would be the result of `U+D7FF + 1`?
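One way to see why arithmetic is not defined on char: stepping past U+D7FF has to go through u32 and back, and the conversion back can fail. The helper `next_scalar` below is hypothetical, a sketch of how a caller might skip the surrogate gap explicitly.

```rust
/// Returns the scalar value following `ch`, skipping the surrogate gap.
/// Returns `None` past U+10FFFF. (Hypothetical helper, for illustration.)
fn next_scalar(ch: char) -> Option<char> {
    // `ch + 1` does not compile; the arithmetic happens on u32 instead.
    let mut code_point = ch as u32 + 1;
    if (0xD800..=0xDFFF).contains(&code_point) {
        code_point = 0xE000;
    }
    // The checked conversion rejects anything that is not a scalar value.
    char::from_u32(code_point)
}
```

A plain `char::from_u32(0xD7FF + 1)` returns `None`, since U+D800 is a surrogate code point.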

Algebraic types in action

• The compiler knows that not all values of the 4 bytes are used!
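A small sketch of what this buys in practice, assuming a current rustc on a typical target: the unused bit patterns of char leave room to encode `None`, so `Option<char>` needs no extra tag, while `Option<u32>` does. The exact sizes are an observation of the compiler's layout, not something the slide states.

```rust
use std::mem::size_of;

/// Sizes in bytes of `char`, `Option<char>` and `Option<u32>`.
fn char_sizes() -> (usize, usize, usize) {
    (
        size_of::<char>(),         // always 4
        size_of::<Option<char>>(), // `None` fits in an invalid char bit pattern
        size_of::<Option<u32>>(),  // every u32 pattern is valid, so a tag is needed
    )
}
```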


Pointer Types


• Narrow pointers

- Point to Sized types (size is known at compile-time)

- Single usize value

• Fat Pointers

- Point to something with unknown size (at compile-time)

- Single usize value, plus more data
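The difference between narrow and fat pointers can be observed by measuring references in machine words. This is a sketch; the function name is hypothetical.

```rust
use std::mem::size_of;

/// Sizes of four reference types, measured in machine words (usize).
fn pointer_sizes_in_words() -> [usize; 4] {
    let word = size_of::<usize>();
    [
        size_of::<&u8>() / word,       // narrow: address only
        size_of::<&[u8; 16]>() / word, // narrow: the length is part of the type
        size_of::<&[u8]>() / word,     // fat: address + element count
        size_of::<&str>() / word,      // fat: address + byte length
    ]
}
```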


Arrays, Slices & Vectors

• Arrays

- Sized sequence of elements: [T; size]

- Unsized sequence of elements: [T]

• Slice

- A view into a sequence of elements

- &[T]

- On arrays, vectors, …

• Vector

- A dynamic-length sequence of elements

- Lives on the heap
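A short sketch of the three sequence types working together: one function written against a slice accepts an array, a vector, or a part of either. The names `sum` and `sequence_demo` are hypothetical.

```rust
/// Sums any contiguous run of bytes, wherever it is stored.
fn sum(values: &[u8]) -> u32 {
    values.iter().map(|&v| u32::from(v)).sum()
}

fn sequence_demo() -> (u32, u32, u32) {
    let array: [u8; 4] = [1, 2, 3, 4]; // length fixed at compile time
    let mut vector: Vec<u8> = vec![1, 2, 3]; // heap-allocated, growable
    vector.push(4);
    (
        sum(&array),       // &[u8; 4] coerces to &[u8]
        sum(&vector),      // &Vec<u8> derefs to &[u8]
        sum(&array[1..3]), // a view into part of the array
    )
}
```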


String Types

• str

- A special [u8]

- Always a valid UTF-8 sequence

• &str

- A special &[u8]

• String

- A dynamic UTF-8 sequence

- Return type of str functions that cannot guarantee preserving byte length
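A sketch of both points: bytes become a &str only after UTF-8 validation, and case conversion returns a new String because the byte length may change. The helper names are hypothetical; the dotless i (U+0131) is used here as one case where uppercasing shrinks the encoding.

```rust
/// Views bytes as text only if they are valid UTF-8.
fn validate(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Uppercasing may change the byte length, so a new String is allocated.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

/// Byte lengths before and after uppercasing.
fn lengths(text: &str) -> (usize, usize) {
    (text.len(), shout(text).len())
}
```

For example, `lengths("\u{0131}")` reports 2 bytes before and 1 byte after, because U+0131 uppercases to ASCII 'I'.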

String Operations

ASCII-only

• Apply to both &[u8] and &str

Non-ASCII Unicode

• Apply only to &str
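The slide examples were images and are missing here. A sketch of the split, using standard library methods: the ASCII-only family exists on both byte slices and strings and leaves non-ASCII data untouched, while the Unicode-aware family exists only on str. Function names are hypothetical.

```rust
/// ASCII-only operations, available on both `[u8]` and `str`.
fn ascii_ops() -> (bool, bool, Vec<u8>) {
    let bytes: &[u8] = b"Hello";
    let text: &str = "Hello";
    (
        bytes.eq_ignore_ascii_case(b"HELLO"),
        text.eq_ignore_ascii_case("hello"),
        bytes.to_ascii_uppercase(),
    )
}

/// ASCII uppercasing leaves non-ASCII characters such as U+00DF as they are.
fn ascii_upper(text: &str) -> String {
    text.to_ascii_uppercase()
}

/// Unicode-aware operations, available only on `str`.
fn unicode_ops(text: &str) -> (String, String, bool) {
    (
        text.to_uppercase(),
        text.to_lowercase(),
        text.chars().all(char::is_alphabetic),
    )
}
```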

Iterating Strings

• Iterating over characters of a string
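The slide's code is not in the transcript. A sketch of the usual iterators on str: `chars()` decodes scalar values, `char_indices()` pairs each with its byte offset, and `bytes()` walks the raw UTF-8. The function name `iterate` is hypothetical.

```rust
/// Collects the characters, their byte offsets, and the total byte count.
fn iterate(text: &str) -> (Vec<char>, Vec<usize>, usize) {
    let chars: Vec<char> = text.chars().collect();
    let offsets: Vec<usize> = text.char_indices().map(|(index, _)| index).collect();
    (chars, offsets, text.bytes().count())
}

/// Reverses a string by scalar value (not by grapheme cluster).
fn reverse_scalars(text: &str) -> String {
    text.chars().rev().collect()
}
```

Note that these iterate over Unicode scalar values; user-perceived characters (grapheme clusters) need a segmentation crate on top of the standard library.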


It Gets Complicated!


Cross-Platform Encoding Challenges

• OS & environment variables

- File Names

- Environment variables

- Command-line parameters

• Different per system

- Unix: bytes; commonly UTF-8 these days

- Windows: UTF-16, but not always well-formed


Cross-Platform Data Types

• OsStr and OsString

- Internal data depends on OS

- &OsStr is to OsString as &str is to String

• Struct std::ffi::OsStr

- pub fn to_str(&self) -> Option<&str>

• Trait std::os::unix::ffi::OsStrExt

- fn from_bytes(slice: &[u8]) -> &Self

- fn as_bytes(&self) -> &[u8]

• Trait std::os::windows::ffi::OsStrExt

- fn encode_wide(&self) -> EncodeWide
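A sketch of how these APIs behave. `to_str` succeeds only when the OS string is valid Unicode, and the Unix extension trait can build an OsStr from arbitrary bytes. The function names are hypothetical, and the non-UTF-8 case is compiled only on Unix; a Windows counterpart would go through `encode_wide` instead.

```rust
use std::ffi::OsStr;

/// Describes an OS string, falling back to a lossy conversion.
fn describe(os: &OsStr) -> String {
    match os.to_str() {
        Some(text) => format!("valid Unicode: {}", text),
        None => format!("not valid Unicode: {}", os.to_string_lossy()),
    }
}

/// On Unix, any byte sequence is a legal OS string, valid UTF-8 or not.
#[cfg(unix)]
fn lossy_example() -> Option<String> {
    use std::os::unix::ffi::OsStrExt;
    // "café" encoded as Latin-1: the final byte 0xE9 is not valid UTF-8.
    let name = OsStr::from_bytes(b"caf\xE9");
    Some(describe(name))
}

/// Elsewhere this sketch has nothing to show.
#[cfg(not(unix))]
fn lossy_example() -> Option<String> {
    None
}
```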


Working with C APIs

• CStr and CString

- CStr: a borrowed reference to a nul-terminated array of bytes

- &CStr is to CString as &str is to String

• Struct std::ffi::CStr

- pub unsafe fn from_ptr<'a>(ptr: *const c_char) -> &'a CStr

- pub fn to_str(&self) -> Result<&str, Utf8Error>
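A sketch of a round trip through the two functions listed above. No real C library is called; a CString stands in for memory that C code would own, and the helper names are hypothetical.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Reads a C string as `&str` without copying.
///
/// Safety: `ptr` must point to a valid nul-terminated string that stays
/// alive and unmodified for as long as the returned reference is used.
unsafe fn c_to_str<'a>(ptr: *const c_char) -> Option<&'a str> {
    // `to_str` validates UTF-8 and fails with Utf8Error otherwise.
    unsafe { CStr::from_ptr(ptr) }.to_str().ok()
}

/// Rust text -> C string -> Rust text again.
fn round_trip(text: &str) -> Option<String> {
    // CString::new fails if the text contains an interior nul byte.
    let owned = CString::new(text).ok()?;
    let ptr: *const c_char = owned.as_ptr();
    // SAFETY: `owned` is alive and nul-terminated for the whole call,
    // and the result is copied before `owned` is dropped.
    unsafe { c_to_str(ptr) }.map(str::to_owned)
}
```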


On Top of the Language

Unicode & i18n Crates

• Encoding/Charsets

- Firefox is already using a Rust component for that!

• Rust Project

- String algorithms needed for a compiler

• Servo Project

- Basic string algorithms needed for a rendering engine

• Locale-aware API

- Actually not much available yet

- WIP by Mozilla, et al.

UNIC Experiment

UNIC: Unicode and i18n Crates for Rust

• Rust Compiler & Libraries

- Core / core

- Standard Library / std

• UNIC

- Character Utilities / unic::char::*

- Unicode Character Database / unic::ucd::[ bidi, name, age, normal, ident, segment, … ]

- Unicode Bidirectional Algorithm / unic::bidi

- Unicode Normalization Algorithm / unic::normal

- Unicode Segmentation Algorithm / unic::segment

- Unicode IDNA / unic::idna

- Unicode Emoji / unic::emoji

- …

What’s this?

Hello سلام

How about this?

Hello سلام

A Case of Missing Bidi Context

How about Locale Context?

Hello سلام

Hello سلام


Summary: Programming Languages

• Machine language

• Procedural

• GOTO

• Functional

• Garbage collection

• Strict memory management


Summary: Unicode & i18n

• Byte == Char

• Contextual Charset

• Separation of text encoding & font encoding

• Unified encoding

• Contextual Locale

• ???


HOPE

Additional Resources

• Rust Community

- rust-lang.org

- doc.rust-lang.org

- play.rust-lang.org

- users.rust-lang.org

- reddit.com/r/rust/

- rustup.rs

- crates.io

- unicode-rs.github.io

- newrustacean.com

• Servo, the Parallel Browser Engine Project

- servo.org

• UNIC: Unicode and Internationalization Crates for Rust

- https://github.com/open-i18n/rust-unic

Questions?

質問?

שְׁאֵלֹות?

प्रश्न?

질문?

سؤال؟

پرسش؟
