May 19th, 2015 NLIE Native Language Identification Engine Lee Levi and Yael Avinun


Problem Description

Companies with support and sales websites and services rely on user location or manual selection to direct users to service in the relevant language.

This process could be automated and made more precise, which would speed up service and increase sales.


Problem Background

✤ Size: 100,000+ companies of all sizes

✤ Severity: Medium. The current approach is less personal, which decreases sales and customer retention.

✤ Consumers: Any company that offers online sales, support, or multi-language services


Current Solution

✤ Manual selection of language preference

✤ User-location detection

✤ User-initiated request for specific language-speaking agent


Our Solution - General Overview

✤ An API for companies with multi-language online sales and support that identifies, in real time and from English text, the customer's native language.

✤ Business Model: Pay-per-service. Free to detect English/non-English, Premium to detect specific languages (priced depending on the number of languages).
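
The two tiers could surface in an API response roughly as follows. This is a minimal sketch, not the actual NLIE interface: the names `Tier`, `NlieResult`, and `shape_response` are hypothetical, and the classifier's prediction is passed in as a plain string rather than computed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    FREE = "free"        # English / non-English only
    PREMIUM = "premium"  # specific native language


@dataclass
class NlieResult:
    is_native_english: bool
    native_language: Optional[str]  # None on the free tier


def shape_response(predicted_language: str, tier: Tier) -> NlieResult:
    """Hypothetical helper: hide the specific language from free-tier callers."""
    is_english = predicted_language.lower() == "english"
    if tier is Tier.FREE:
        return NlieResult(is_native_english=is_english, native_language=None)
    return NlieResult(is_native_english=is_english,
                      native_language=predicted_language)
```

With this shape, a free-tier caller learns only whether the writer appears to be a native English speaker, while a premium caller also receives the predicted language.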


Our Solution - GUI

✤ The GUI will be a website for managing the API, where the company receives the code and documentation and, potentially, analytics of its own usage.


Our Solution - Data

✤ Lang-8 English Learner Corpus


Our Solution - High-Level Design

✤ Offline: Corpus → POS Tagger → N-Gram Tagger → Classifier Training → Model

✤ Online: User Input → NLIE Engine (Trained Model, Classifier) → Output
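
The feature-extraction step of the offline pipeline can be sketched as below. One plausible reading of the diagram is that n-grams are built over the POS tags of each learner sentence; that reading, the function name `pos_ngrams`, and the choice of n are assumptions, not something the slides state. The tags are written by hand here in Penn Treebank style; in the real pipeline a POS tagger would produce them.

```python
from collections import Counter
from typing import List


def pos_ngrams(tags: List[str], n_max: int = 3) -> Counter:
    """Count all POS-tag n-grams of length 1..n_max in one tagged sentence."""
    counts: Counter = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    return counts


# Illustrative, hand-written tags for a five-word learner sentence.
tags = ["PRP", "VBP", "VB", "IN", "PRP"]
features = pos_ngrams(tags, n_max=2)
```

Each sentence becomes a bag of tag sequences, which a classifier can then use as input features during training and, with the same extraction code, at prediction time.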


Our Solution - Off-the-Shelf Libraries

✤ NLTK

✤ SciKit-Learn

✤ Pickle
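
Pickle presumably bridges the offline and online halves of the design: the model is serialized once after training and loaded when the engine starts. A minimal sketch of that round trip follows. The dictionary standing in for the trained model, the file name `nlie_model.pkl`, and the helpers `save_model` and `load_model` are all hypothetical; a real model would more likely be a trained scikit-learn estimator.

```python
import os
import pickle
import tempfile
from collections import Counter


def save_model(model, path: str) -> None:
    """Offline step: write the trained model to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path: str):
    """Online step: read the model back when the engine starts."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a trained model: per-language n-gram counts (made-up values).
model = {
    "Japanese": Counter({("PRP", "VBP"): 3}),
    "Spanish": Counter({("DT", "NN"): 2}),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nlie_model.pkl")
    save_model(model, path)
    restored = load_model(path)
```

Because unpickling can execute arbitrary code, the engine should only load model files it produced itself.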


Our Solution

Milestones:

1. Offline Work (until 31.5): Choose a machine learning approach, then implement the classifier and test its abilities.

2. API Implementation (until 7.6): Create an API for the engine that a company could use on top of its own code.

3. GUI (until 21.6): Implement a multi-user website which includes documentation, billing, information, and potentially, user-specific analytics.