Strings and Characters in Swift Goichi Hirakawa

  • Strings and Characters in Swift Goichi Hirakawa

  • Goichi Hirakawa

    @gooichi

    OS X / iOS

    Objective-CXX

  • String

  • Features of String

    Unicode-correct: fully Unicode compliant

    Locale-insensitive

    Value type

    Four views

    Objective-C bridge: bridged to Objective-C's NSString

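  • Unicode-correct

    Swift strings are fully Unicode compliant

    Contents are accessible as UTF-8, UTF-16, and UTF-32 (Unicode scalars)

    == compares by Unicode canonical equivalence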

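  • Locale-insensitive

    Swift's own String operations follow the Unicode standard and do not depend on the user's locale

    Locale-sensitive operations are available through the NSString API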

  • Value type

    A String declared with let is immutable

    Copies share storage until one of them is mutated (copy-on-write)
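
The value semantics described on this slide can be observed with a small sketch. The variable names are illustrative and do not come from the slides; the code assumes a current Swift toolchain.

```swift
// A String bound with let cannot be mutated.
let original = "Dog"

// Assigning to a var yields an independent value. The underlying storage
// is only duplicated when one of the copies is mutated (copy-on-write).
var copy = original
copy.append("!")

print(original) // Dog
print(copy)     // Dog!
```

Mutating `copy` leaves `original` untouched, which is the behavior a value type guarantees regardless of how storage is shared internally.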

  • Four views

    CharacterView

    UnicodeScalarView

    UTF16View

    UTF8View

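  • Objective-C bridge

    String is bridged to Objective-C's NSString

    NSString APIs are imported into Swift as String

    NSString to String conversion must be explicit (as String) since Swift 1.2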

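  • String views and their elements

    characters: a collection of Character values (extended grapheme clusters)

    unicodeScalars: a collection of UnicodeScalar values (Unicode scalar values)

    utf16: a collection of UTF16.CodeUnit values (UTF-16 code units)

    utf8: a collection of UTF8.CodeUnit values (UTF-8 code units)

    NSString is UTF-16 based: its length and indices count UTF-16 code units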
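
As a rough illustration of how the four views differ, the sketch below counts the elements of one short string in each view. It assumes Swift 4 or later, where the string itself is the Character collection (`count`); in Swift 2, which the slides target, the same value is `characters.count`.

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived character.
let decomposed = "\u{65}\u{301}"

let characterCount = decomposed.count              // extended grapheme clusters
let scalarCount = decomposed.unicodeScalars.count  // Unicode scalars
let utf16Count = decomposed.utf16.count            // UTF-16 code units
let utf8Count = decomposed.utf8.count              // UTF-8 code units

print(characterCount, scalarCount, utf16Count, utf8Count) // 1 2 2 3
```

The UTF-16 count is the figure that NSString's `length` reports, which is why indices taken from NSString do not line up with Character positions.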

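  • Unicode terminology

    Code point: a value in the Unicode code space, U+0000 to U+10FFFF (21 bits)

    Code unit: the fixed-size bit unit of an encoding form (8 bits for UTF-8, 16 bits for UTF-16, 32 bits for UTF-32)

    Unicode scalar: any code point except the surrogates; UnicodeScalar in Swift

    U+0000 to U+D7FF and U+E000 to U+10FFFF

    Each scalar corresponds to one UTF-32 code unit

    Extended grapheme cluster: Character in Swift

    One or more Unicode scalars that form a single user-perceived character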
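
The scalar range and the "one or more scalars per Character" rule can be checked with a short sketch. It assumes Swift 3 or later, where the `UnicodeScalar` initializer taking a `UInt32` is failable; the constant names are illustrative.

```swift
// Surrogate code points (U+D800...U+DFFF) are not Unicode scalars,
// so the failable initializer rejects them.
let surrogate = UnicodeScalar(UInt32(0xD800))
let lastScalar = UnicodeScalar(UInt32(0x10FFFF))
let beyondRange = UnicodeScalar(UInt32(0x110000))

// A single Character (extended grapheme cluster) built from two scalars.
let eAcute: Character = "\u{65}\u{301}"
let scalarsInCharacter = String(eAcute).unicodeScalars.count

print(surrogate == nil, lastScalar != nil, beyondRange == nil, scalarsInCharacter)
// true true true 2
```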

  • Example: "Dog‼🐶"

    Character: D (U+0044), o (U+006F), g (U+0067), ‼ (U+203C), 🐶 (U+1F436)

    Unicode Scalar (UTF-32)

    0x44 0x6F 0x67 0x203C 0x1F436

    0 1 2 3 4

    UTF-16

    0x44 0x6F 0x67 0x203C 0xD83D 0xDC36

    0 1 2 3 4 5

    UTF-8

    0x44 0x6F 0x67 0xE2 0x80 0xBC 0xF0 0x9F 0x90 0xB6

    0 1 2 3 4 5 6 7 8 9
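
The three encodings listed on this slide can be reproduced directly from the string's views. This is a minimal sketch for a current Swift toolchain; `hex` is a hypothetical formatting helper, not part of the standard library.

```swift
let dog = "Dog\u{203C}\u{1F436}"

let scalars = dog.unicodeScalars.map { $0.value } // UTF-32 values
let utf16 = Array(dog.utf16)                      // UTF-16 code units
let utf8 = Array(dog.utf8)                        // UTF-8 code units

// Hypothetical helper: formats integers the way the slide writes them.
func hex<T: BinaryInteger>(_ values: [T]) -> String {
    return values
        .map { "0x" + String($0, radix: 16, uppercase: true) }
        .joined(separator: " ")
}

print(hex(scalars)) // 0x44 0x6F 0x67 0x203C 0x1F436
print(hex(utf16))   // 0x44 0x6F 0x67 0x203C 0xD83D 0xDC36
print(hex(utf8))    // 0x44 0x6F 0x67 0xE2 0x80 0xBC 0xF0 0x9F 0x90 0xB6
```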

  • Creating a String from

    Character(s)

    UnicodeScalar(s)

    UTF16View

    UTF8View

    Number, Streamable, CustomStringConvertible, CustomDebugStringConvertible, etc

    NSString
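
A few of these conversions can be sketched as follows. The example assumes Swift 4 or later and uses `String(decoding:as:)` for the UTF-16 and UTF-8 cases, which works on raw code units rather than on the view types named on the slide; the NSString case is left out because it needs Foundation.

```swift
// From a single Character and from a sequence of Characters
let fromCharacter = String(Character("D"))
let fromCharacters = String([Character("D"), Character("o"), Character("g")])

// From a Unicode scalar
let dogScalar: UnicodeScalar = "\u{1F436}"
let fromScalar = String(dogScalar)

// From UTF-16 and UTF-8 code units
let fromUTF16 = String(decoding: [0xD83D, 0xDC36] as [UInt16], as: UTF16.self)
let fromUTF8 = String(decoding: [0x44, 0x6F, 0x67] as [UInt8], as: UTF8.self)

// From a number, and from any value through its textual description
let fromNumber = String(42)
let fromDescription = String(describing: 3.5)

print(fromCharacters, fromScalar == fromUTF16, fromUTF8, fromNumber, fromDescription)
```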

  • String comparison

    Comparison with NSString

    CJK unified ideographs and CJK compatibility ideographs

  • == and !=

    String and Character values are compared with == and !=

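  • Unicode canonical equivalence

    The equivalence that Swift's == applies to String and Character

    U+00E9 (é) == U+0065 (e) + U+0301 (combining acute accent)

    Compatibility equivalence: a weaker equivalence that Swift's == does not apply

    U+2460 (①) is equivalent to U+0031 (1) only under compatibility equivalence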
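
The difference between the two kinds of equivalence shows up directly in `==`. The following is a minimal sketch using only the standard library; the constant names are illustrative.

```swift
// Canonically equivalent: precomposed é versus e + combining acute accent.
let precomposed = "\u{E9}"
let decomposed = "\u{65}\u{301}"
print(precomposed == decomposed) // true

// The same holds for Character values.
let precomposedCharacter: Character = "\u{E9}"
let decomposedCharacter: Character = "\u{65}\u{301}"
print(precomposedCharacter == decomposedCharacter) // true

// Only compatibility equivalent: circled digit one versus the digit 1.
let circledOne = "\u{2460}"
print(circledOne == "1") // false
```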

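  • Comparing NSString

    -isEqualToString: compares UTF-16 code units literally

    == on NSString calls -isEqual:, which behaves like -isEqualToString:

    compare: treats canonically equivalent strings as equal, but this is undocumented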

  • Example (1)

    Swift

        let string1 = "\u{E9}"          // é
        let string2 = "\u{65}\u{301}"   // e + combining acute accent
        print(string1 == string2)       // true

    Objective-C (NSString)

        let nsString1: NSString = string1
        print(nsString1 == string2)                                              // false
        print(nsString1.isEqualToString(string2))                                // false
        print(nsString1.compare(string2) == NSComparisonResult.OrderedSame)      // true
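
The literal, code-unit comparison that -isEqualToString: performs can be approximated in pure Swift by comparing views element by element. This is a sketch of the idea, not the NSString implementation, and it avoids Foundation so that it runs on any current Swift toolchain.

```swift
let string1 = "\u{E9}"         // precomposed é
let string2 = "\u{65}\u{301}"  // e + combining acute accent

// String's == applies canonical equivalence.
let canonicallyEqual = string1 == string2

// Comparing the views element by element is a literal comparison,
// so the different encodings of é no longer match.
let sameUTF16 = string1.utf16.elementsEqual(string2.utf16)
let sameScalars = string1.unicodeScalars.elementsEqual(string2.unicodeScalars)

print(canonicallyEqual, sameUTF16, sameScalars) // true false false
```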

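  • CJK unified ideographs and CJK compatibility ideographs

    CJK compatibility ideographs exist in Unicode for round-trip compatibility with legacy character sets

    Most of them are canonically equivalent to a CJK unified ideograph

    Their sources include the IBM extensions and JIS X 0213

    Some code points in the CJK Compatibility Ideographs block are in fact CJK unified ideographs

    Those have no equivalent CJK unified ideograph elsewhere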

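  • Example (2)

        let unifiedKanji = "\u{585A}"      // U+585A, CJK unified ideograph
        let compatiKanji = "\u{FA10}"      // U+FA10, CJK compatibility ideograph
        print(unifiedKanji == compatiKanji)       // true

        let unifiedKanji2 = "\u{5D0E}"     // 崎, CJK unified ideograph
        let nonCompatiKanji = "\u{FA11}"   // 﨑, a unified ideograph inside the compatibility block
        print(unifiedKanji2 == nonCompatiKanji)   // false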

  • IVS: Ideographic Variation Sequence/Selector

  • CJK unified ideograph U+585A and CJK compatibility ideograph U+FA10

                  U+585A glyph                 U+FA10 glyph
    IVS           U+585A U+E0100 (*1)          U+585A U+E0101 (*1)
                  U+585A U+E0103 (*2)          U+585A U+E0105 (*2)
    (*3)          U+585A U+FE01                U+585A U+FE00

    *1 Adobe-Japan1   *2 Hanyo-Denshi   *3 Unicode IVS

  • Example (3)

    CJK unified ideograph and CJK compatibility ideograph

        let unifiedKanji = "\u{585A}"           // U+585A
        let compatiKanji = "\u{FA10}"           // U+FA10
        print(unifiedKanji == compatiKanji)     // true

    IVS: Adobe-Japan1 and Hanyo-Denshi

        let adobeKanji = "\u{585A}\u{E0100}"    // U+585A + VS17 (Adobe-Japan1)
        let hanyoKanji = "\u{585A}\u{E0103}"    // U+585A + VS20 (Hanyo-Denshi)
        print(adobeKanji == hanyoKanji)         // false
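
Because == does not ignore variation selectors, two IVS forms of the same base ideograph compare as different. If an application wants to treat them as the same character, one possible approach is to strip the selectors before comparing. `removingVariationSelectors` below is a hypothetical helper written for this sketch, not a standard library or slide API.

```swift
// Hypothetical helper (not from the slides): drops variation selectors
// VS1...VS16 (U+FE00...U+FE0F) and VS17...VS256 (U+E0100...U+E01EF).
func removingVariationSelectors(_ string: String) -> String {
    var result = String.UnicodeScalarView()
    for scalar in string.unicodeScalars {
        let value = scalar.value
        let isSelector = (0xFE00...0xFE0F).contains(value)
            || (0xE0100...0xE01EF).contains(value)
        if !isSelector {
            result.append(scalar)
        }
    }
    return String(result)
}

let adobeKanji = "\u{585A}\u{E0100}"
let hanyoKanji = "\u{585A}\u{E0103}"

print(adobeKanji == hanyoKanji) // false
print(removingVariationSelectors(adobeKanji) == removingVariationSelectors(hanyoKanji)) // true
```

Stripping selectors discards the glyph distinction the IVS was meant to record, so this is only appropriate where variant forms should be searched or matched as one character.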

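  • Summary: String

    Unlike NSString's literal comparison, Swift's String compares by Unicode canonical equivalence

    In Swift 2, the NSString API remains available when it is needed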