Machine Translation in Academia and in the Commercial World: a Contrastive Perspective

Alon Lavie

Research Professor – LTI, Carnegie Mellon University

Co-founder, President and CTO – Safaba Translation Solutions

WMT-2014, June 26, 2014




LTI Education Committee

• Standing LTI faculty committee mandated to review, discuss, and propose changes to the LTI education programs and course offerings

• Meets about once a month over lunch

• Primary activities include:

– Reviewing new course proposals from faculty

– Assisting with speaker recruitment for the LTI colloquium

– Special tasks and projects related to our educational programs

• Current members: Bob Frederking, Carolyn Rose, Noah Smith, Alan Black, Eric Nyberg, Teruko Mitamura, Ralf Brown, Alon Lavie

December 8, 2011 11-711: Algorithms for NLP 3


LTI Curriculum Review

• Special project the committee took upon itself in the fall

• Goals:

– Develop a more comprehensive understanding of the current state of our curriculum and how it has evolved over the years

– Are our current course offerings appropriate and necessary for our graduate programs?

– Do we have significant gaps that need to be filled?

– Analyze student enrollment in our courses, how it has changed over the years, and draw conclusions

– Draw conclusions regarding potential changes in our course offerings, their scheduling, frequency, and/or sequencing

– Look at the LTI teaching requirements and salary compensation model and whether it should be tweaked or modified


LTI Curriculum Review

• So far mostly a fact- and information-gathering exercise, with a limited amount of analysis performed by individual committee members

• Three main sub-tasks:

– A comparison of our LTI course offerings with similar course offerings at major competing peer institutions.

– An analysis of student enrollment data in our courses over the past 15 years.

– A basic-level comparison of the teaching requirements and teaching compensation model used across the various departments and units within SCS


LTI Curriculum Review

• A full report of findings from these three activities was circulated by email yesterday

• I will present highlights from the findings

• Faculty discussion and guidance:

– What other information should we be gathering?

– What kinds of analyses would you like to see on this data?

– Goal is to come up with some recommendations regarding changes to our courses, our programs and/or our teaching salary compensation model.

– Full faculty will get to discuss any proposed changes


Comparison of LTI Course Offerings with Peer Institutions

• Compiled by Noah Smith and Ralf Brown

• Looked at course offerings at Edinburgh, JHU and Stanford and attempted to map these to equivalent courses at LTI/SCS

• Departmental structures are somewhat different

• Table of LTI courses and their corresponding equivalents

• Table of SCS courses typically taken by LTI students and their corresponding equivalents

• Table of courses offered by peers that we don’t have


Comparison of LTI Course Offerings with Peer Institutions

• General Findings:

– We are very strong on speech offerings, maybe rivaled by JHU.

– We are stronger than these peers in information retrieval offerings.

– We are relatively weak on linguistics offerings.

• Courses that make the LTI special, compared to this set of peers:

– Grammars and lexicons (721) has a “grammar engineering” analogue at Stanford, but is unique in being an LT-oriented introduction to the phenomena of human language.

– Machine translation (731) as a full-on course.

– Structured prediction (763), an advanced statistical NLP course (this course combines two older courses, Language and Statistics 2 (762) and Information Extraction (748)).

– Social media analysis (772).

– Software engineering courses (791 and 792) that emphasize language technologies.

– Inventing future services (794).

– Summarization and personal information management (899).


Comparison of LTI Course Offerings with Peer Institutions

• Obvious ideas for courses offered by peers but not by LTI:

– Intro to programming for language technologies, for new MLTs who lack a CS background. This could become a service course for CS masters and PhD students from other applied SCS departments who need to catch up on programming skills quickly.

– Bioinformatics. Should discuss with faculty in the Lane Center for Computational Biology.

– Cognitive science of language. Should discuss with faculty in Psychology.

– Data mining (and text mining); likely of interest to some students in Tepper and Heinz.

– Corpus linguistics. Should discuss with Linguistics faculty in Modern Languages, English, and Philosophy.


Enrollment Data Analysis

• Compiled by Bob Frederking

• Based on a spreadsheet generated from a database dump containing every registration for an 11-xxx course since Fall 1996.

• There is a line in the spreadsheet for each student in each class each semester, for a total of 7328 raw data points.

• Note that this total includes 119xx research registrations and 11700 LTI Colloquium registrations. These have been filtered out of the following charts, except where explicitly shown.
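The tallying described above can be sketched roughly as follows. This is a minimal illustration, not the committee's actual procedure: the row layout (student, course number, semester), the sample rows, and the function names `is_regular_course` and `cumulative_enrollments` are all assumptions made for the example, since the slides do not describe the spreadsheet's columns.

```python
from collections import Counter

# Hypothetical registration rows: (student_id, course_number, semester).
# The real spreadsheet's column layout is not given in the slides.
registrations = [
    ("s1", "11711", "F1996"),
    ("s2", "11711", "F1996"),
    ("s1", "11731", "S1997"),
    ("s3", "11910", "F1996"),  # research registration (119xx)
    ("s3", "11700", "F1996"),  # LTI Colloquium
    ("s2", "11721", "S1997"),
    ("s3", "11711", "F1997"),
]


def is_regular_course(course: str) -> bool:
    """False for 119xx research registrations and the 11700 colloquium."""
    return not (course.startswith("119") or course == "11700")


def cumulative_enrollments(rows, include_all=False):
    """Count registrations per course, sorted by size, then course number."""
    counts = Counter(
        course
        for _student, course, _semester in rows
        if include_all or is_regular_course(course)
    )
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


totals = cumulative_enrollments(registrations)
for course, n in totals:
    print(course, n)
```

Passing `include_all=True` keeps the research and colloquium registrations, which corresponds to the charts where they are "explicitly shown".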

Course Enrollments

[Enrollment charts, slides 11–18; chart images not reproduced]

Total cumulative course enrollments sorted by size

Course   Enrollments
11319    1
11521    1
11552    1
11554    1
11561    1
11590    1
11592    1
11724    1
11727    1
11135    2
11390    2
11512    2
11746    2
11513    3
11541    3
11717    3
11747    3
11531    4
11691    4
11735    4
11795    4
11726    5
11783    5
11611    6
11695    6
11773    6
11683    7
11490    8
11511    8
11718    9
11749    9
11782    9
11693    10
11441    11
11728    11
11736    11
11755    11
11793    11
11120    12
11617    12
11725    13
11928    13
11765    15
11767    15
11763    18
11794    18
11929    19
11723    20
11744    20
11796    20
11756    21
11719    24
11733    24
11899    25
11716    27
11780    30
11753    32
11743    35
11734    36
11772    37
11682    38
11344    39
11713    40
11745    40
11754    50
11762    51
11742    54
11748    54
11732    57
11722    68
11752    71
11920    78
11731    94
11925    94
11411    95
11935    118
11751    148
11712    164
11792    213
11721    239
11761    292
11741    294
11711    313
11930    451
11791    484
11700    496
11910    2519


Discussion