Building a Better PHP — Part 3: Getting Started with Hack

We briefly looked at Hack in part one of this series, but there is a lot more to it than type hints.

Why Hack?

The main reason for Hack is not to eliminate the many small bugs that can creep into PHP code because of its lack of strong typing, but to provide new language features and tooling that make developers' lives better.

A primary goal of Hack is to avoid negatively impacting the developer's workflow, especially the REPL-like cycle in which we edit code and refresh the browser to see changes immediately.

Using Hack

Note: Hack is only included with HHVM 3.0 and later, or with nightly builds after March 20th, 2014.

Hack support is built into HHVM; it is simply an alternative syntax. To use Hack with HHVM, you run your Hack files in the same way as PHP ones. The opening tag (<?hh) identifies the file as Hack.

In addition to running Hack files with the HHVM runtime, there are several command line tools for use during development and during migration of existing code bases.

The primary tool is the static analyzer, which watches your files and reports errors in near real time (~200ms). In practice, the static analyzer is Hack.

File Semantics

There are several important differences from PHP that you should know about when writing Hack:

  • To promote best practices, you cannot mix Hack with HTML (or other non-code text)
  • All Hack files must start with <?hh
  • There is no closing tag for Hack files
  • Hack also has multiple modes
  • XHTML can be embedded using XHP, which makes XHTML tags into language constructs

Hack Modes

There are three “modes” in which you can run Hack, each of which is denoted by a comment after the opening tag:

Partial Mode

Partial is the default mode. In this mode Hack checks every type that is specified, but you are not required to specify types for everything. Additionally, you may call into non-Hack code (standard PHP code) without errors.

Because partial is the default mode, it is recommended that you do not specify it via comment:

<?hh

// Code goes here 

Strict Mode

Strict mode will ensure that all type checks are met, and will not let you call into non-Hack code. In strict mode, you must type annotate all code.

While you can include strict files into non-strict files, you cannot go the other way.

In particular, you cannot call classes that are defined in non-Hack code.

Additionally, you cannot have any top-level code except for require statements. This means all code must be inside classes or functions. By implication, your entry-point code (e.g. your bootstrap, in Zend Framework 2 lexicon) cannot be strict.

As mentioned, modes are enabled by adding a comment after the opening tag:

<?hh // strict

// Code goes here

One other restriction in strict mode is that regular arrays must be type hinted, and using the other collection types (Vector, Set, Map, Tuple) is preferred.

While strict mode is a great goal, in reality it is unlikely to be reached while migrating legacy code bases.

Decl Mode

Decl mode, or Declarations mode, means that the type checker will trust any type hints that are specified when calling functions, but will not require them.

<?hh // decl

// Code goes here

Ignoring Errors

In addition to the three modes, you can intentionally disable type checking by marking code as UNSAFE.

This is done by adding a // UNSAFE comment above the code to ignore. The comment must be uppercase, and applies from the comment to the end of the current block (typically the next closing brace }).

Using UNSAFE should be a last resort, as it may result in run-time errors due to the lack of validation.
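As a minimal sketch of how this might look (the functions findPort and lookupPort are hypothetical, invented for this illustration), the second function below returns a ?int where an int is declared. That would normally be reported by the static analyzer, but the // UNSAFE comment tells it to skip everything up to the closing brace:

```hack
<?hh

// Hypothetical helper: returns null when the service is unknown
function findPort(string $service): ?int {
    $ports = array('http' => 80, 'https' => 443);
    return array_key_exists($service, $ports) ? $ports[$service] : null;
}

function lookupPort(string $service): int {
    // UNSAFE
    // From here to the closing brace the static analyzer does not check
    // types, so the ?int returned below is not reported as an error.
    return findPort($service);
}
```

The cost is exactly the one described above: if lookupPort() is called with an unknown service, the problem only surfaces at run time.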

Type Hints

The major change in Hack is its type hinting, both in terms of its impact on behavior and its changes to syntax.

It’s important to understand that all type-hint information is thrown away at run time (this is known as type erasure, and happens in Java with Generics also). This means it has no performance impact compared to PHP at runtime, though it can assist in creating more optimized machine code at compile time.

Hack supports all standard PHP types, as well as a few additions:

Type                  Description
bool                  Boolean true/false
int                   Integer numbers
float                 Floating point numbers
string                Strings
array                 Arrays (that can be typed)
resource              Resources (e.g. file streams)
Class/Interface Name  An object type-hint
mixed                 Any (not recommended)
Vector                Numerical contiguously indexed arrays
Map                   A typed (both keys and values) associative array
Set                   An unindexed collection of typed unique values
Tuple                 A fixed size set of typed values
Pair                  A fixed size set of typed values restricted to two values, indexed as 0 and 1

In addition to these, there are two other special types, used for return values only:

  • void: for functions that do not return a value
  • this: for methods that return an instance of the object itself; this is late-binding.

Essentially, returning $this is always valid. More precisely, whatever is returned must be of the same class that $this would be, including in child classes. For example, return new static(); would be valid.
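A fluent interface is one place where the this return type is useful. The sketch below uses two hypothetical classes, QueryBuilder and SelectBuilder, that are not part of Hack or of this series; they only illustrate that a method declared in the parent still returns the child type when called on a child instance:

```hack
<?hh

class QueryBuilder {
    protected array<string> $parts = array();

    // `this` means "the class the method was called on", not QueryBuilder
    public function add(string $part): this {
        $this->parts[] = $part;
        return $this;
    }

    public function build(): string {
        return implode(' ', $this->parts);
    }
}

class SelectBuilder extends QueryBuilder {
    public function fromTable(string $table): this {
        return $this->add('FROM ' . $table);
    }
}

function buildQuery(): string {
    // add() is declared on the parent, yet fromTable() can still be chained
    // after it because add() returns `this`.
    return (new SelectBuilder())
        ->add('SELECT *')
        ->fromTable('users')
        ->build();
}
```

Had add() been declared as returning QueryBuilder, the type checker would have no way to know that fromTable() exists on the returned object.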

Nulls

To denote that an argument is nullable, the type should be preceded with a ?, e.g. ?int or ?\DateTime.

Note that this is not the same as setting a default of null: the argument is still required, but it may be set to null.

<?hh
class DBAdapter {
    public function connect(?string $dsn): ?\PDO
    {
        // $dsn may be null
        //  may return an instance of \PDO or null on error
    }
}
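To make the difference between "nullable" and "optional" concrete, here is a small sketch with two hypothetical functions, greet and greetOptional (names chosen only for this example). The first requires an argument that may be null; the second also allows the argument to be left out:

```hack
<?hh

// Nullable but required: the caller must pass something, even if it is null
function greet(?string $name): string {
    if ($name === null) {
        return 'Hello, stranger';
    }
    return 'Hello, ' . $name;
}

// Nullable with a default: the argument may also be omitted entirely
function greetOptional(?string $name = null): string {
    return greet($name);
}
```

Calling greet(null) and greetOptional() are both fine, whereas calling greet() with no arguments is still an error because the parameter has no default.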

Soft Types

To denote that a type-hint mismatch is not fatal, you precede the type with an @, e.g. @int or @\DateTime.

If you fail to provide the correct type, HHVM will emit a warning, rather than a fatal error:

<?hh
class Calc {
    public function add(@int $a, @int $b): @int
    {
        // Both $a and $b may not be ints
        // May not return an int
    }
}

$calc = new Calc();
$calc->add("1", "2");

Running this results in the following warnings:

Warning: Argument 1 to Calc::add() must be of type @int, string given

Warning: Argument 2 to Calc::add() must be of type @int, string given

Strict Mode and Internal Classes

Update: As of HHVM 3.2, hhi files are no longer needed.

When using strict mode, because of the requirement that all code must be in Hack, even internal classes are not valid type hints unless you load an hhi file. These files are bundled with HHVM and currently must be copied into your project root. They are not 100% complete, but you can create your own or modify the bundled ones.

For example, to allow our DBAdapter class above to work in strict mode, we need to copy /usr/share/hhvm/hack/hhi/stdlib/builtins_pdo.idl.hhi into our project root. This file will load Hack-compatible class definitions.

Constructor Argument Promotion

One of the simplest changes added to Hack is constructor argument promotion. With this feature, we can shortcut one of the most common operations: defining object properties and assigning constructor arguments to them.

This is done by dropping the explicit declaration, and simply preceding the argument itself with the visibility keyword. Values passed in will then be automatically assigned to an object property with the same name.

In the example below, we precede our two constructor arguments, $left and $right with both a visibility keyword, protected, and a type int.

Values passed in for these two arguments will then be available as $this->left and $this->right without an explicit assignment.

<?hh
class Adder {
    public function __construct(protected int $left, protected int $right): void { }

    public function get(): int {
        return $this->left + $this->right;
    }
}

$adder = new Adder(1, 3);
$result = $adder->get(); // 4
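For comparison, this is roughly what the promoted constructor saves us from writing. The class name LonghandAdder is hypothetical and is used only so that it does not clash with Adder; the behavior is intended to be the same:

```hack
<?hh
class LonghandAdder {
    // Without promotion, each property is declared explicitly...
    protected int $left;
    protected int $right;

    public function __construct(int $left, int $right): void {
        // ...and assigned by hand in the constructor body
        $this->left = $left;
        $this->right = $right;
    }

    public function get(): int {
        return $this->left + $this->right;
    }
}
```

Promotion collapses the two property declarations and the two assignments into the constructor signature itself.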

Collections

While arrays may be the workhorse of PHP, Hack introduces several concrete collection types in an effort to be more efficient and to enforce more data integrity.

Additionally — except for tuples — these collections can be considered objects, with an API for working with the collection. They also have pass-by-reference semantics, like regular objects.

You can still use standard PHP arrays in Hack, although in strict mode they must be typed for all arguments and return values. In addition, when you create a new array without explicit keys, all values must be of the same type; failing to do this will result in a Hack error.

<?hh // strict
function createArray(): void {
    $foo = array(1, 2, "3");
}

This will result in the following error from the static analyzer:


<file>:3:5,16: Invalid assignment
  <file>:3:12,16: This array has heterogeneous elements, you should use a tuple instead

While this error recommends a tuple, a tuple will only work if the size and element types of your array are fixed.

While it is recommended that you use a more appropriate collection type, arrays can be used in strict mode when correctly type hinted, but then can only contain one type of data:

<?hh // strict
function createArray(): array<int> {
    return array(1, 2, 3);
}

There are three ways to hint an array:

  • array: untyped (only allowed in partial mode)
  • array<[type]>: [type] typed values, with integer keys
  • array<[type1], [type2]>: [type1] typed keys, with [type2] typed values

If you wanted to mimic standard PHP arrays, you could type-hint with array<mixed, mixed>, but this is not recommended.
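The two typed forms can be sketched as follows. The function names getNames and getPorts, and the data they return, are made up for this example:

```hack
<?hh

// array<[type]>: string values with integer keys
function getNames(): array<string> {
    return array('Davey', 'PJ');
}

// array<[type1], [type2]>: string keys with int values
function getPorts(): array<string, int> {
    return array('http' => 80, 'https' => 443);
}
```

At run time these are still ordinary PHP arrays; the hints only give the static analyzer enough information to check how the keys and values are used.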

Tuples

Tuples in Hack are identical to arrays internally. While it will (probably) work at runtime, Hack's static analyzer will report an error if you try to insert a value of an invalid type, or assign to an invalid key. To use a tuple, just replace the array() keyword with tuple().

Tuples are fixed in size based on their instantiation.

<?hh // strict
function createArray(): void {
    $foo = tuple(1, 2, "3");
    // Do more with the tuple
}

In the above code, errors will be emitted if you try to assign a non-integer to key 0 or 1, a non-string to key 2, or anything at all to any other key.

If you want to type hint a tuple, you must use the tuple literal syntax:

<?hh // strict
function createArray(): (int, int, string) {
    return tuple(1, 2, "3");
}
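Because a tuple is an array internally, it can be unpacked positionally with list(). The following is a small sketch in partial mode; getVersion and describeVersion are hypothetical names used only for illustration:

```hack
<?hh

function getVersion(): (int, int, string) {
    return tuple(3, 0, 'stable');
}

function describeVersion(): string {
    // list() unpacks the tuple by position, just as it would an array
    list($major, $minor, $label) = getVersion();
    return $major . '.' . $minor . ' (' . $label . ')';
}
```

Since the return type spells out each position, the static analyzer knows that $major and $minor are ints and $label is a string after the list() assignment.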

Pairs

Note: (HHVM < 3.2 only) To use Pairs in strict mode, you will need to copy the /usr/share/hhvm/hack/hhi/Pair.hhi file to your project root.

Pairs, as the name implies, can only contain two pieces of data. These have the integer keys 0 and 1. Attempting to access or assign values to any other key will result in an error.

<?hh // strict
function getTask(): Pair<string, string> {
    return Pair { "C039D17D", "checkPing" };
}

Pairs are immutable, however if you store mutable collections or objects within a pair they can be modified.

For example, if you have a Pair containing an object, the object's properties can be modified, but the object itself cannot be removed from or replaced in the Pair.
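A short sketch of that distinction, using a Vector as the mutable value stored in the Pair (the function name pairDemo and the sample values are assumptions made for this example):

```hack
<?hh

function pairDemo(): int {
    $tags = Vector {'php'};
    $pair = Pair {'article', $tags};

    // Allowed: the Vector held by the Pair is itself mutable, and
    // $inner refers to the same Vector object that the Pair holds.
    $inner = $pair[1];
    $inner[] = 'hack';

    // Not allowed: replacing one of the Pair's own elements
    // $pair[1] = Vector {};

    return $pair[1]->count();
}
```

The Pair still holds the same two values it was created with; only the contents of the Vector have changed.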

Vectors

Note: (HHVM < 3.2 only) To use Vectors in strict mode, you will need to copy the /usr/share/hhvm/hack/hhi/Vector.hhi file to your project root.

Vectors can only use integer keys. Also, all keys must be consecutive, starting at 0.

Because vectors can only have integer keys, to type hint a vector you need only specify the type of its values.

<?hh // strict
function getCommunityEngineers(): Vector<string> {
    return Vector {"Davey", "PJ", "You?"};
}

Maps

Note: (HHVM < 3.2 only) To use Maps in strict mode, you will need to copy the /usr/share/hhvm/hack/hhi/Map.hhi file to your project root.

Maps are an ordered dictionary, which can have integer or string keys.

Because keys can be of multiple types, we must specify both the key and value types when type hinting.

<?hh // strict
function getTags(): Map<string, string> {
    return Map {"php" => "PHP", "hack" => "Hack"};
}

Sets

Note: (HHVM < 3.2 only) To use Sets in strict mode, you will need to copy the /usr/share/hhvm/hack/hhi/Set.hhi file to your project root.

Sets are an unordered collection of unique values. Sets do not have keys and therefore, like vectors, only need the type of their values hinted.

Additionally, because sets do not have keys, you cannot access the values by key, nor can you use the foreach ($set as $key => $value) syntax when iterating.

<?hh // strict
function getTags(): Set<string> {
    return Set { "php", "hack", "hhvm" };
}
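Since values cannot be looked up by key, membership checks are the usual way to query a Set. The sketch below assumes the contains(), add() and count() methods of the collection classes; hasTag and uniqueTagCount are hypothetical helper names:

```hack
<?hh

function hasTag(Set<string> $tags, string $tag): bool {
    // Ask the Set whether it holds the value, rather than indexing into it
    return $tags->contains($tag);
}

function uniqueTagCount(): int {
    $tags = Set {'php', 'hack'};
    // 'php' is already present, so adding it again leaves the Set unchanged
    $tags->add('php');
    return $tags->count();
}
```

When iterating, use the value-only form, foreach ($tags as $tag), as noted above.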

Immutability

Hack provides immutable variants of Vectors, Maps, and Sets (Pairs are always immutable).

To use these, simply use the Imm* variant, e.g. ImmMap.

Trying to then change the collection will result in an error:

<?hh // strict
function getTags(): ImmMap<string, string> {
    $map = ImmMap {"php" => "PHP", "hack" => "Hack"};
    $map["hhvm"] = "HHVM";
}

This will result in the following errors from the static analyzer:

<file>|13 col 6 error|  You cannot mutate this
<file>|12 col 13 error|  This is an object of type ImmMap

And from HHVM at runtime:

Fatal error: Uncaught exception 'RuntimeException' with message 'Cannot assign to an element of a ImmMap' in <file>

Note: (HHVM < 3.2 only) To use the immutable variants in strict mode, you will need to copy the appropriate /usr/share/hhvm/hack/hhi/Imm*.hhi files to your project root.

Appending Elements

Standard PHP arrays allow you to append new elements using the square bracket syntax:

<?php
$array = [ "foo", "bar" ];
// Append new value:
$array[] = "baz";
?>

Hack collections, on the other hand, do not typically allow this.

Pairs and Tuples do not allow additional values to be added.

Maps allow you to append using this syntax, but only allow you to assign a Pair: the first value is the desired key, and the second is its value. However, there is a bug in HHVM 3.0 that causes the static analyzer to show an error, even though this works correctly at runtime.

<?hh // strict
function appendToMap(): void {
        $m = Map { 0 => "foo", 1 => "bar" };
        $m[] = Pair { 5, "bat" };
        var_dump($m);
}

This will output the following Map:

object(HH\Map)#1 (3) {
  [0]=>
  string(3) "foo"
  [1]=>
  string(3) "bar"
  [5]=>
  string(3) "bat"
}

Vectors work as expected:

<?hh // strict
function appendToVector(): void {
        $v = Vector { "foo", "bar" };
        $v[] = "bat";
        var_dump($v);
}

appendToVector();

With the result being:

object(HH\Vector)#1 (3) {
  [0]=>
  string(3) "foo"
  [1]=>
  string(3) "bar"
  [2]=>
  string(3) "bat"
}

Collection Objects Interface

Except for tuples, all Hack collections are built on top of common object interfaces.

These interfaces provide PHP-like functionality; for example, Iterable and Traversable are used for iteration, just like the standard SPL interfaces.

Additionally, there are other interfaces for adding, removing, and manipulating the contents of collections. This allows you to work with collections as objects, instead of as “better” arrays.
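As a rough illustration of working with a collection as an object, the sketch below chains the filter() and map() methods on a Vector. It assumes those methods are available in your HHVM version, and the function name shout is invented for this example:

```hack
<?hh

function shout(Vector<string> $names): Vector<string> {
    return $names
        // Keep only the non-empty names
        ->filter(function (string $name): bool {
            return $name !== '';
        })
        // Then build a new Vector of upper-cased names
        ->map(function (string $name): string {
            return strtoupper($name);
        });
}
```

Both calls return a new Vector rather than changing $names, so the original collection is left untouched.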

Breaking Iteration

Because collections are objects, they have standard object pass-by-reference behavior. This means that if you iterate over a collection and modify that same collection during the loop, the behavior is undefined. HHVM handles this by stopping iteration and throwing an exception:

Fatal error: Uncaught exception 'InvalidOperationException' with message 'Collection was modified during iteration' in <file>
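One way to avoid this, sketched below under the assumption that you want to append to the collection you are looping over, is to gather the changes in a second collection and apply them once the loop has finished. The function name duplicateAll is hypothetical:

```hack
<?hh

function duplicateAll(Vector<string> $items): Vector<string> {
    // Appending to $items inside this first loop is the kind of change
    // that can trigger the exception above, so the new values are
    // collected in a separate Vector instead.
    $additions = Vector {};
    foreach ($items as $item) {
        $additions[] = $item;
    }

    // $items is no longer being iterated, so it is safe to modify now
    foreach ($additions as $item) {
        $items[] = $item;
    }

    return $items;
}
```

Because collections are passed by reference, the Vector handed to duplicateAll() is the same one that is returned, now with each value appended a second time.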

Coming up next…

Now that you’ve gotten your feet wet with Hack, you can begin to take advantage of its strong typing and forced cleanliness to write better code.

In the next installment, we will look at more advanced features of Hack, including shapes, custom types, async functionality, and XHP.

Will you be using Hack any time soon? Have you started to code with it already? How do you like it so far? Let us know in the comments.

About Davey Shafik

Davey Shafik is a full time PHP Developer with 12 years experience in PHP and related technologies. A Community Engineer for Engine Yard, he has written three books (so far!), numerous articles and spoken at conferences the globe over.

Davey is best known for his books, the Zend PHP 5 Certification Study Guide and PHP Master: Write Cutting Edge Code, and as the originator of PHP Archive (PHAR) for PHP 5.3.