How to use LIHKG data to train a Cantonese chatbot

  • Upload
    oursky

  • 9up chatbot

    https://t.me/lihkg_9up_bot

  • 9up bot ?

  • Machine Learning data

    seq2seq (sequence to sequence)

  • ??

    TensorFlow

    implement seq2seq framework

    seq2seq (sequence to sequence)

  • ??

    ??

    seq2seq?

  • seq2seq in use:

    Google Translate, Google Inbox Auto Reply, Google Allo

  • data ?

    ( )

  • Data 1: 19 10

  • ()

    (Input) (Output)

    (Input) (Output)

  • Data 1: 19 10

    100,000

  • Again data ?

    ( )

  • Data 2:

    64,751 post

    10 reply post

    18,290

  • Data 2:

    18,290 post 1,700,000

    program postdata

    :

    1. Quote Reply, reply

    2. Quote Reply, post title+

    3. post, Title
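
The pairing rules above lost most of their wording in extraction, so the following is only a sketch of one plausible reading: a reply that quotes an earlier reply is paired with the quoted text, and a reply with no quote is paired with the post title. The `Reply` class, its field names, and `build_pairs` are hypothetical names introduced for illustration. The extra text that the slide appends to the title in rule 2 ("post title+") and the whole of rule 3 are not recoverable, so neither is modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Reply:
    """One reply in a forum thread (hypothetical structure, not the forum's API)."""
    text: str
    quoted_text: Optional[str] = None  # text of the reply being quoted, if any


def build_pairs(title: str, replies: List[Reply]) -> List[Tuple[str, str]]:
    """Turn one thread into (input, output) training pairs.

    Assumed reading of the slide:
      - reply with a quote    -> (quoted text, reply text)
      - reply without a quote -> (post title, reply text)
    """
    pairs = []
    for reply in replies:
        output = reply.text.strip()
        if not output:
            continue  # skip empty replies
        source = reply.quoted_text if reply.quoted_text else title
        pairs.append((source.strip(), output))
    return pairs


thread = [
    Reply("first reply"),
    Reply("answer to first", quoted_text="first reply"),
]
print(build_pairs("thread title", thread))
```

Each pair then serves as one encoder input and decoder target for the seq2seq model.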

  • training data

    metal? metal

    metal

    metal

    Quote Reply

    Quote Reply, post title+

  • 9up bot API ? For developers

  • Model Parameters Summary (custom configurations)

    seq2seq model with attention

    5 layers encoder and decoder

    vocabulary size: 63000

    state vector size: 256

    learning rate: 0.5

    learning decay factor: 0.99

    batch size: 64

    bucket sizes (encoder length, decoder length):

    (10, 10), (20, 30), (40, 30), (60, 30)
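
As an illustration of how these settings might be used, the sketch below collects the listed values in a plain dictionary and shows bucket selection, padding, and learning-rate decay. The dictionary keys and the helpers `pick_bucket`, `pad_to_bucket`, and `decayed_learning_rate` are hypothetical names, not the author's code or a TensorFlow API. The smallest-fitting-bucket rule, the padding id of 0, and the multiplicative decay are assumptions about a typical bucketed seq2seq setup.

```python
from typing import List, Optional, Tuple

# Values copied from the parameter summary; the key names are illustrative.
CONFIG = {
    "num_layers": 5,
    "vocab_size": 63000,
    "state_size": 256,
    "learning_rate": 0.5,
    "learning_rate_decay_factor": 0.99,
    "batch_size": 64,
    "buckets": [(10, 10), (20, 30), (40, 30), (60, 30)],
}


def pick_bucket(encoder_len: int, decoder_len: int,
                buckets: List[Tuple[int, int]]) -> Optional[int]:
    """Return the index of the first bucket that fits both lengths, else None."""
    for index, (encoder_max, decoder_max) in enumerate(buckets):
        if encoder_len <= encoder_max and decoder_len <= decoder_max:
            return index
    return None  # the pair is too long for every bucket


def pad_to_bucket(encoder_ids: List[int], decoder_ids: List[int],
                  bucket: Tuple[int, int], pad_id: int = 0):
    """Right-pad both token-id lists to the bucket's lengths (pad_id is assumed)."""
    encoder_max, decoder_max = bucket
    encoder = encoder_ids + [pad_id] * (encoder_max - len(encoder_ids))
    decoder = decoder_ids + [pad_id] * (decoder_max - len(decoder_ids))
    return encoder, decoder


def decayed_learning_rate(num_decays: int) -> float:
    """Learning rate after the decay factor has been applied num_decays times."""
    return CONFIG["learning_rate"] * CONFIG["learning_rate_decay_factor"] ** num_decays


bucket_index = pick_bucket(12, 8, CONFIG["buckets"])
print(bucket_index, CONFIG["buckets"][bucket_index])
```

Bucketing keeps padding waste low: a 12-token input with an 8-token reply is padded to (20, 30) rather than to the largest bucket, and pairs that fit no bucket would be dropped or truncated.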