Empirical Analysis of Programming Language Adoption

Why some programming languages succeed and others fail. OOPSLA 2013, Best Paper.

Leo A. Meyerovich, UC Berkeley

Ariel S. Rabkin, Princeton

October 2013

EMPIRICAL ANALYSIS OF PROGRAMMING LANGUAGE ADOPTION

Why Adoption?

Confession of a Language Salesman

Change Function threshold to adopt [P. Coburn]:

perceived adoption need / perceived adoption pain > 1

FP!!! / new language

“From now on, my goal in life would be to also drive the denominator down to zero”

- Erik Meijer, Confessions of a Used Programming Language Salesman

Confession of a Language Salesman

Change Function threshold to adopt [P. Coburn]:

perceived adoption need / perceived adoption pain > 1

FP!!! / new language

FP!! / familiar language
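
The change function threshold can be read as a simple ratio test. The sketch below is only an illustration: the function names and the numeric scores are hypothetical, and the slides do not say how need or pain would be measured.

```python
def change_function(perceived_need: float, perceived_pain: float) -> float:
    """Ratio of perceived adoption need to perceived adoption pain."""
    if perceived_pain <= 0:
        # Limit case of driving the denominator to zero:
        # any positive need then exceeds the threshold.
        return float("inf") if perceived_need > 0 else 0.0
    return perceived_need / perceived_pain


def would_adopt(perceived_need: float, perceived_pain: float) -> bool:
    """Adoption is predicted when the ratio exceeds 1."""
    return change_function(perceived_need, perceived_pain) > 1


# Hypothetical scores on an arbitrary 0-10 scale (not taken from the talk).
# Strong need for FP, but the pain of switching to a new language is higher.
new_language = would_adopt(perceived_need=6.0, perceived_pain=8.0)
# Slightly weaker need, but far less pain in a familiar language.
familiar_language = would_adopt(perceived_need=5.0, perceived_pain=2.0)
```

Under these made-up scores, only the familiar-language case crosses the threshold, which is the contrast the two ratios on the slide appear to draw.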

Confession of a Language Salesman

Science?

Adoption literature: change function is switching costs

Data analysis: growth, decision making, acquisition

TODAY

Our Data Sets

2-year-long web survey: 13,271 respondents [McIver]

Massive open online course (MOOC) survey: 1,142 respondents [Patterson & Fox]

Software repositories: 217,368 projects

2-week web survey: 1,679 respondents (viral campaign)

Demographics

Age: ~30

Degree: ~BS in CS

Employment: ~programmer

How do languages grow?

Ecological model of adoption

Use language in a niche

Grow libraries and user base

Spread language to more niches

Popular Languages CDF (Ohloh data)

[Figure: cumulative distribution. x-axis: Language (xml, html, css, javascript, java, shell, cmake, python, c++, php, ruby, c#, sql, bat); y-axis: Cumulative Use, 0% to 100%.]

Half the projects use 5 languages

DSLs dominate
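
As a minimal sketch of how a cumulative-use curve of this kind could be computed, assume per-language usage counts are available. The counts below are hypothetical and are not the Ohloh data, and the helper names are made up for illustration.

```python
from itertools import accumulate


def cumulative_share(counts):
    """Return [(language, cumulative fraction)] ordered by decreasing count."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(counts.values())
    running = accumulate(n for _, n in ranked)
    return [(lang, cum / total) for (lang, _), cum in zip(ranked, running)]


def languages_to_cover(counts, fraction):
    """Smallest number of top-ranked languages whose share reaches `fraction`."""
    for i, (_, share) in enumerate(cumulative_share(counts), start=1):
        if share >= fraction:
            return i
    return len(counts)


# Hypothetical usage counts (NOT the Ohloh data).
usage = {"xml": 40, "html": 30, "css": 10, "javascript": 10, "java": 6, "shell": 4}
curve = cumulative_share(usage)
top_half = languages_to_cover(usage, 0.5)
```

With real repository data, `languages_to_cover(usage, 0.5)` would give the number of languages needed to account for half of all use, which is the kind of statement the slide makes.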

Odds for Most Languages? (PDF)

[Figure: both axes logarithmic. x-axis: Language Rank (Decreasing); y-axis: Proportion of Projects for Language.]

Long tail! Supports designing for niches and then growing

Java for 16% of projects

Processing for 0.09% of projects

Popularity Across Niches

Projects (2000-2010): 200K+ [PLATEAU 2013]

[Figure: two panels, Java and Scheme. x-axis: Project categories (223); y-axis: Popularity.]

Annotated categories: blogging: 9%; search: 29%; build tools: 1%

Panel annotations: high dispersion, low dispersion

Dispersion Decreases as Popularity Increases

[Figure: x-axis: Dispersion across niches (σ / μ), 0 to 4.5; y-axis: Popularity. Labeled languages: Prolog, VBScript, Scheme, Fortran, PL/SQL, Assembly, C#, Java.]

Languages grow niche by niche
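
The dispersion measure σ / μ is the coefficient of variation of a language's popularity across project categories. The sketch below assumes a population standard deviation (the slides do not say which estimator was used) and uses hypothetical per-niche shares rather than the study's data.

```python
from statistics import mean, pstdev


def dispersion(popularity_by_niche):
    """Coefficient of variation (sigma / mu) of a language's share across niches."""
    shares = list(popularity_by_niche.values())
    mu = mean(shares)
    if mu == 0:
        raise ValueError("language is unused in every niche")
    # Assumption: population standard deviation over all niches.
    return pstdev(shares) / mu


# Hypothetical shares of projects per niche (not from the talk).
general_purpose = {"blogging": 0.20, "search": 0.25, "build tools": 0.15}
niche_language = {"blogging": 0.00, "search": 0.03, "build tools": 0.00}

general_cv = dispersion(general_purpose)
niche_cv = dispersion(niche_language)
```

A language used evenly everywhere gets a low value, while one concentrated in a single niche gets a high value, matching the high/low dispersion contrast on the preceding slides.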

How Do Programmers Pick Languages?

P(L’ | L)

p(popular): 75%

p(repeat): 30%

Shows importance of familiarity
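
A conditional probability such as P(L’ | L) can be estimated by counting consecutive project pairs. This is a sketch under assumptions: it supposes each developer's projects are available as an ordered list of languages, and it treats a "repeat" as reusing the previous project's language. The slides do not define either quantity precisely, and the histories below are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(histories):
    """Estimate P(next language | previous language) from ordered project lists."""
    counts = defaultdict(Counter)
    for languages in histories:
        for prev, nxt in zip(languages, languages[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for prev, nexts in counts.items()
    }


def repeat_rate(histories):
    """Fraction of consecutive project pairs that reuse the same language."""
    pairs = [(a, b) for h in histories for a, b in zip(h, h[1:])]
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 0.0


# Hypothetical developer histories (not from the study).
histories = [
    ["java", "java", "python"],
    ["c", "python", "python"],
    ["java", "c"],
]
probs = transition_probabilities(histories)
repeats = repeat_rate(histories)
```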

How Do Languages Get Picked?

strongly agree / neutral / strongly disagree

Development speed?

Performance?

Open source libraries

Group legacy

Project legacy

Self familiarity

Team familiarity

Target platform

Performance

Tooling

Development speed

Hiring

Individual feature(s)

Correctness

Simplicity

Commercial libraries

Relative Importance of Language Aspects (Med-Strong), x-axis: 0% to 80%

Slashdot survey, companies with 1-19 employees

Intrinsics: performance, correctness, ...

Extrinsic niche-specific factors dominate!

Be Positive: Design Guides & Opportunities

Learning: Shelf Life of a Programmer?

“Baby Boomers and Gen Xers tend to know C# and SQL. Gen Y knows Python… and Hadoop”

- Recruiter

Language Users are Age-Invariant

Languages are learned and forgotten

Programmers have a working set that they refresh!

Median reported time required to “learn a language well”

Time to learn is short compared to career

Probability of Knowing a Language

                               All   CS Major   Not CS Major   Taught in school   Not taught in school
Functional (Scheme, ML, ...)   22%   24%        19%            40%                15%
Assembly (MIPS, ...)           14%   14%        14%            20%                10%
Mathematical (Matlab, R, ...)  11%   10%        11%            31%                7%

CS degree unimportant but coursework matters
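
Each cell in a table like this is a conditional proportion over survey respondents. The sketch below shows one way such a cell could be computed. The record layout (`knows` and `taught` sets per respondent) and the sample records are hypothetical, not the survey's actual schema or data.

```python
def p_knows(respondents, family, taught=None):
    """Fraction of respondents who know a language in `family`.

    If `taught` is True or False, restrict to respondents who were
    (or were not) taught that family in school.
    """
    pool = [
        r for r in respondents
        if taught is None or (family in r["taught"]) == taught
    ]
    if not pool:
        return None
    return sum(family in r["knows"] for r in pool) / len(pool)


# Hypothetical respondents (not survey data).
respondents = [
    {"knows": {"functional"}, "taught": {"functional"}},
    {"knows": set(), "taught": {"functional"}},
    {"knows": {"functional"}, "taught": set()},
    {"knows": set(), "taught": set()},
    {"knows": set(), "taught": set()},
    {"knows": set(), "taught": set()},
]

overall = p_knows(respondents, "functional")
if_taught = p_knows(respondents, "functional", taught=True)
if_not_taught = p_knows(respondents, "functional", taught=False)
```

Comparing `if_taught` with `if_not_taught` is the same contrast the last two columns of the table draw.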

Conclusions

Extrinsics dominate: Libraries and familiarity!

Model: Niche-by-niche growth

Intrinsics secondary: Performance, semantics, IDEs

Fluidity = Hope: Programmers know few languages but can refresh within 6 months.

Looking Ahead

Language Sociology: Programming is done by groups; big knowledge gaps

Streamline Empiricism: Surveys, experiments (mining already active). Exploit MOOCs!

Social Language Design: Improve sharing and utilize networks

Socio-PLT

www.eecs.berkeley.edu/~lmeyerov
