The Hack programming language: Types for PHP
Andrew Kennedy, Facebook
Facebook’s PHP Codebase
▪ 350,000 files
▪ >10,000,000 LoC (www.facebook.com & internally)
▪ 1000s of commits per day, 2 releases per day
▪ Anecdotally, good engineers are really productive in PHP
▪ And yet…
"1" + "2"
3
How far does this go?
"1ne" + "2wo"
3
How far does this go?
"15" + "0xF"
15
How far does this go?
"15" == "0xF"
true
What if it’s not a numeric string?
"hagfish" + "2"
2
OK, it treats non-numeric strings as zero.
"hagfish" | "9000000"
"yqwvysx"
It gets worse.
$n = "hagfish"; $n++;
"hagfisi"
It gets even worse. It never ends.
$n = "z"; $n++;
"aa"
What to do? Types!*
*And quite a lot of other things: remove features (e.g. “variable variables”), add async, typed XML syntax, …
Some history
▪ First focus for Facebook was performance of PHP
▪ 2009: HipHop: translation from PHP to C++
▪ 2013: HHVM: highly performant runtime used by FB, Wikipedia, …
▪ Dynamic checking of type ‘hints’ => static type system => Hack (Julien Verlaguet). Really two projects:
  ▪ A programming language project (this talk)
  ▪ A systems project (make it scale to 10M lines, with parallelism, background incremental type checking, etc.)
▪ Also see Flow (Avik Chaudhuri), a similar effort for JavaScript
▪ Types matter at Facebook!
Hack: types for PHP
• Object-oriented type system with generics in the style of Java, C# or Scala
• Some structural subtyping (tuples, shapes, functions)
• ML-like type inference based on unification
• Flow-sensitive typing for locals
• Type refinement for isnull/type tests
• Internal use of union types and recursive types
• ML-style abstract types
• Gradual typing for mixed code (PHP, Hack)
Pragmatism
• We’re typing code that already exists!
• Lots of special casing for common PHP idioms
• Driven by need to convert millions of lines of code & convert hundreds of reluctant developers.
• => Language has a “materials in the room” feel to it.
• Materials being drawn from many years of programming-language research
Rich object-oriented type system
▪ Primitive types, named classes, interfaces, and traits, with static and virtual methods
▪ Generic type parameters on types and methods, with variance annotations and lower/upper bounds

abstract class ChunkIterable<Tk, +Tv> {
  abstract public function getIterator(): AsyncIterator<ResultChunk<Tk, Tv>>;

  abstract protected function getIteratorWithCursor(
    ?ChunkCursor $from,
    ?ChunkCursor $to,
    bool $iterate_backwards
  ): AsyncIterator<(ChunkCursorMaker<Tk>, ResultChunk<Tk, Tv>)>;

  final public function filter<Tu super Tv>(IChunqPredicate<Tu> $predicate): ChunkIterable<Tk, Tu> {
    return new FilteredChunkIterable($this, $predicate);
  }
}

Callouts on the code: named class; type parameter; covariant type parameter (+Tv); maybe/option type (?ChunkCursor); lower bound (Tu super Tv)
More inference than Java, C# or Scala
▪ Type annotation on function arguments and results only
▪ Types inferred for locals; type parameters inferred for new and generic methods
▪ Note: no type-based overloading! (contrast Java, C#)

class List<T> { … }
function MakeSingleton<T>(T $x): List<T> { ... }
function foo(int $b): void {
  $y = new List();
  $y->Add($b);
  $z = new List();
  $z->Add($y);             // $z is inferred to be List<List<int>>
  $s = MakeSingleton($z);  // type parameter is inferred
  …
}
Flow-sensitive typing of locals
▪ Locals aren’t even declared in PHP

function f($b) {
  if ($b) {
    $x = 'b';
    bar($x);
    $x = 12;
  } else {
    $x = 'a';
  }
  return $x;
}

What types can we write on parameter and result?

function f(bool $b): mixed {
  if ($b) {
    $x = 'b';
    bar($x);
    $x = 12;
  } else {
    $x = 'a';
  }
  return $x;
}
Flow-sensitive refinement
▪ Types in Hack do not contain null by default. Must write “?type” to include null.
▪ At last! Tony Hoare’s billion-dollar mistake, rectified
▪ Null tests in conditionals refine the type inside the branch

function foo(?int $xopt): int {
  if ($xopt == null) {
    return 42;
  } else {
    return $xopt;  // type of $xopt is now int
  }
}

▪ Similarly, can test dynamic class using instanceof

function bar(Widget $a): void {
  if ($a instanceof Button) {
    $a->Click();
  } else {
    $a->DoSomethingGeneric();
  }
}
Flow-sensitive refinement: expressions
▪ Types of some expressions can be refined. But care needed!

class C {
  private ?int $f = 0;
  function setToNull(): void {
    $this->f = null;
  }
  function get(): int {
    if ($this->f == null) {
      return 0;
    } else {
      return $this->f;  // type of $this->f is now int
    }
  }
}

With a method call between the test and the use:

class C {
  private ?int $f = 0;
  function setToNull(): void {
    $this->f = null;
  }
  function get(): int {
    if ($this->f == null) {
      return 0;
    } else {
      $this->setToNull();  // type is invalidated by function call
      return $this->f;     // TYPE ERROR!
    }
  }
}
Internal types
▪ Internally, Hack uses a kind of “union” type for flow-sensitive typing

class A { function Foo(): int { ... } }
class B { function Foo(): string { ... } }

function foo(bool $b, A $x, B $y): mixed {
  if ($b) {
    $obj = $x;
  } else {
    $obj = $y;
  }                       // Hack gives $obj the type A | B
  $result = $obj->Foo();  // Hack gives $result the type int | string
  return $result;         // int | string is a subtype of mixed
}
Structural typing
▪ PHP uses arrays a lot. Arrays can be indexed by integer or string; they’re extensible; and values are dynamically typed.
▪ PHP arrays are often used in idiomatic ways; these are reflected in structural types in Hack:
▪ Tuples e.g. tuple(int, string, MyClass)
▪ Associative maps e.g. array<string,Item>
▪ Records a.k.a. shapes e.g. shape('id' => int, 'name' => string)
▪ Also: function types, with proper co/contra-variant subtyping
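As a rough illustration of the structural types listed on this slide, the sketch below uses a tuple, an associative array, a shape and a function type together. The names Item, ItemInfo, describe, total_price and names are invented for this example and do not come from the talk. The syntax assumes the Hack of the period (a `<?hh` file with PHP-style arrays); newer HHVM releases may prefer other container types.

```hack
<?hh

// Hypothetical record type: a shape with two fixed, typed fields.
type ItemInfo = shape('id' => int, 'name' => string);

// Hypothetical class used as the value type of the map below.
final class Item {
  public function __construct(public ItemInfo $info, public float $price) {}
}

// Tuples are fixed-length and typed per position.
function describe((int, string) $pair): string {
  list($id, $name) = $pair;
  return $name.' (#'.(string)$id.')';
}

// An associative map from string keys to Item values.
function total_price(array<string, Item> $items): float {
  $total = 0.0;
  foreach ($items as $item) {
    $total += $item->price;
  }
  return $total;
}

// A function type: any function from Item to string may be passed as $f.
function names(
  array<string, Item> $items,
  (function(Item): string) $f
): array<string> {
  $out = array();
  foreach ($items as $item) {
    $out[] = $f($item);
  }
  return $out;
}
```

A caller might build an item with `new Item(shape('id' => 1, 'name' => 'widget'), 2.5)` and pass `function(Item $i): string { return $i->info['name']; }` as the function argument.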
Type abstraction
▪ Where they are used, types are surprisingly strong
▪ Abstraction is enforced on enumeration types (contrast C#)

enum NodeColour : int { Red = 0; Black = 1; }

No type compatibility between NodeColour and int

▪ Opaque types, with optional supertype

newtype UserId = string;

Outside file, no compatibility between UserId and string
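Since the opaque type is only transparent inside its defining file, that file would normally also export the functions that convert to and from it. A minimal sketch, assuming both helper functions (whose names are made up here) live in the same file as the newtype declaration:

```hack
<?hh

newtype UserId = string;

// Inside this file UserId and string are interchangeable,
// so both conversions type-check without a cast.
function make_user_id(string $raw): UserId {
  return $raw;
}

function user_id_to_string(UserId $id): string {
  return $id;
}
```

Code in any other file can then hold and pass around a UserId, but can only get at the underlying string through these two functions.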
Gradual typing
▪ Hack code is marked as strict, partial or decl
▪ Strict code has full type annotations and is fully type checked. It cannot call into legacy PHP code
▪ Partial code has optional annotations and is type checked as much as it can. It can call into legacy PHP code
▪ Decl code is not checked; but type annotations are processed for use by other files
▪ Where types are omitted, Hack assumes an “any” type that is compatible with all other types
▪ Contrast mixed which is the “top” type w.r.t. subtyping
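To make the contrast between the implicit “any” and the explicit mixed concrete, here is a minimal sketch. It assumes a partial-mode file as described on this slide (a plain `<?hh` header), and both function names are invented for illustration. Newer HHVM releases are stricter about unannotated code, so treat this as period syntax.

```hack
<?hh

// No annotations: $x is treated as "any", which is compatible with
// every type, so the checker accepts the arithmetic unchecked.
function add_one_unchecked($x) {
  return $x + 1;
}

// mixed is the top type: the checker rejects arithmetic on $x
// until a test such as is_int() refines it.
function add_one_checked(mixed $x): int {
  if (is_int($x)) {
    return $x + 1;  // $x is int in this branch
  }
  return 0;
}
```

The first function is where legacy behaviour can leak in; the second forces the caller's value to be inspected before use.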
Type safety
▪ The intention is that strict mode code is type safe
▪ But no soundness theorem; and what would it say about mixed code?
▪ Also, plenty of back doors e.g. invariant construct
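One reading of the invariant “back door” is sketched below: the checker takes the asserted condition on trust and refines the type after it, while the actual check only happens at run time. The function name as_int is hypothetical.

```hack
<?hh

// invariant() throws at run time if the condition is false.
// Statically, the checker assumes it holds and refines $x to int.
function as_int(mixed $x): int {
  invariant(is_int($x), 'expected an int');
  return $x;
}
```

The function type-checks whatever the caller passes, so a wrong assumption surfaces as a run-time failure and not as a type error.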
Implementation
▪ Hack is implemented in OCaml
▪ Core of type checker is purely functional
▪ Open sourced on GitHub: see http://hacklang.org
That's all folks. Questions?