Introduction

Suffete is a standalone PHP type system, written in Rust, designed to be the type-system core of a static analyzer.

It is not an analyzer. It does not parse PHP, walk control flow, infer types from expressions, or report diagnostics. It is the layer underneath all of those: a representation of every PHP type a real-world analyzer needs to express, plus the operations that move types around the lattice — subtype check, union, intersection, set difference, narrowing under assertions, generic substitution, generic inference.

If you are building an analyzer, suffete answers the questions you would otherwise have to answer yourself, and it answers them with a stable contract: same inputs, same answers, no spooky action across the codebase.

What this book covers

The book is organised in eight parts. Each part stands on its own; you don't need to read them in order, but the earlier parts assume less.

Part I — Foundations — what suffete is, why it exists as a separate crate, a quick guided tour, and the notation we use throughout.
Part II — The Type Universe — every PHP type expressible in suffete: scalars, special elements, objects, arrays, lists, iterables, callables, resources, class-like strings, refinement axes, the Negated and Intersected wrappers, unresolved elements, and templates.
Part III — The Lattice — the operations that relate and combine types: refines, overlaps, meet, join, subtract, narrow. Plus the algebraic laws every operation is required to satisfy.
Part IV — Generics — template parameters, variance, capture-free substitution, the standin inference round, and the inheritance specialisation that resolves what a class supplies to a parameter of one of its ancestors.
Part V — Public API — the Rust crate surface: TypeId, ElementId, TypeBuilder, predicates, inspection, transformation, the World trait, casting, expansion, serialization.
Part VI — Cookbook — recipes that compose the API into the operations a real analyzer needs.
Part VII — Internals — optional reading, for when you want to know how the sausage is made: interning, the bit layout of a handle, the SIMD scans, the performance philosophy.
Part VIII — Reference — terse, scannable reference material: every element kind, every lattice option, every prelude constant.

Who this book is for

You know PHP. You know Rust. You know what a static analyzer is and why one would want a type system to lean on. You may not know much programming-language theory, but you know what a tree is, what a set is, what a partial order is.

The book does not try to teach PL theory from scratch. Where we use a term like "lattice", "subtype", "variance", or "least upper bound", we define it the first time it shows up and assume it after that.

A note on naming

The PL-theory term for the indivisible pieces a type is built from is atom. Suffete code calls them Elements. This book uses Element throughout to match the code, but you will occasionally see "atom" in references to the underlying theory. Treat the words as interchangeable.

The crate also exposes a separately-named type called mago_atom::Atom (re-exported as the atom! macro). That is unrelated — it is an interned string handle (a name pool), not a type-system element.

A note on stability

Suffete is pre-1.0 and the API is unstable. The book documents the API as it stands at the latest commit on main. If you are pinning to a published version, prefer cargo doc --open for the version-correct API; this book is the conceptual reference, not a frozen release artifact.

How to read examples

PHP source is shown as:

function f(int|string $x): bool { /* ... */ }

Rust against the suffete crate is shown as:

use suffete::{TypeBuilder, TypeId, prelude::{INT, STRING}};

let union: TypeId = TypeBuilder::new()
    .push(INT)
    .push(STRING)
    .build();

Mathematical notation uses MathJax. The subtype relation is $\tau \mathrel{<:} \sigma$ ; least upper bound is $\tau \sqcup \sigma$ ; greatest lower bound is $\tau \sqcap \sigma$ ; set difference is $\tau \setminus \sigma$ ; bottom is $\bot$ ; top is $\top$. The full notation table is in the glossary.

Diagrams use mermaid:

graph LR
    Never[never] --> Int[int]
    Never --> String[string]
    Int --> Scalar[scalar]
    String --> Scalar
    Scalar --> Mixed[mixed]

When you find something wrong

If a definition contradicts the code, the code wins and the book is buggy. Please open an issue or a PR; the book lives in book/ of the suffete repository and accepts the same kind of contributions as the code.

What suffete is

Suffete is a Rust crate that implements the PHP type system as a self-contained, queryable data structure with a comprehensive set of operations.

It has three deliverables:

A representation. A handle-based, intern-deduplicated, content-addressed data model that can express every PHP type a real-world analyzer needs to express.
Operations on that representation. Subtyping, overlap, intersection, union, set difference, narrowing, generic substitution, generic inference, expansion of unresolved forms, structural transformations.
A World trait. The single abstraction by which the type system asks questions about the user's codebase: "does class D extend class C?", "what does class C declare for property p?", "what is the upper bound of template parameter T on class C?". An analyzer plugs in its codebase model behind this trait; suffete itself stays codebase-agnostic.

That is the entire surface. There is no parser. There is no AST. There is no notion of a file, a statement, a scope, a control-flow graph, a diagnostic, or a configuration.

What suffete does

Concretely, given two TypeIds and a World, suffete answers:

$\tau \mathrel{<:} \sigma$ — does every value of type $\tau$ also have type $\sigma$? (refines)
$\tau \mathrel{\#} \sigma$ — are $\tau$ and $\sigma$ disjoint? (overlaps)
$\tau \sqcap \sigma$ — what is the greatest lower bound? (meet)
$\tau \sqcup \sigma$ — what is the least upper bound? (join)
$\tau \setminus \sigma$ — what remains of $\tau$ after removing $\sigma$? (subtract)
$\mathit{narrow}(\tau, \pi)$ — given an assertion $\pi$ that holds on top of $\tau$, what is the refined type? (narrow)

It also answers, given a type, simpler structural questions: is this guaranteed truthy? does this contain null? is this a single literal value the analyzer can constant-fold? does this contain a free template parameter anywhere in its tree? — and many more. These are the predicates.

And it offers the build-side primitives: making a new type from elements, walking a type and rebuilding it under a transformation, applying a generic substitution, expanding an alias.

What suffete does not do

It does not parse PHP. There is no lexer, no parser, no docblock interpreter. The analyzer brings types in from somewhere — a parser, a serialised cache, a hand-built fixture — and constructs TypeIds through suffete's builders and prelude.

It does not run a control-flow analysis. The lattice operations are pure functions of their inputs; they do not know which line of code they came from, what branch they are on, or which assertions led to them being asked. The analyzer asks; suffete answers.

It does not produce diagnostics. When refines returns false, the analyzer chooses what message to display, where to point the user, and at what severity. Suffete returns a boolean and a LatticeReport carrying structured side information; the message-writing is the analyzer's job.

It does not maintain a codebase model. Class hierarchies, declared properties, declared methods, template parameter bounds — all of those live in the analyzer's data structures and are queried through World on demand.

What this buys you

Three things, in roughly decreasing order of value to a downstream consumer:

Single source of truth for hard semantics. Subtyping in PHP is not simple. The interaction of generics and intersections, of nullability and the truthiness axes on mixed, of literal collapse with refined ranges, of object shape with HasMethod — these are conditions that every analyzer has to get right and few of them do. If suffete is correct, every analyzer that depends on suffete is correct on those conditions, automatically.

A property-test battery, run for everyone. Suffete is verified against an algebraic-law battery: idempotence, commutativity, associativity, identity, absorption, the GLB and LUB bounds, the soundness interlock between the operations. Every PR runs that battery. When suffete says $\tau \sqcap \sigma$ is some value, you can rely on it being a lower bound — that is checked on thousands of cases per CI run.

Performance you don't have to think about. The interner, the SIMD scans, the canonical-form fast paths, the singleton caches — they are tuned, and they run on every analyzer that depends on suffete, without the analyzer needing to know how. You write refines(t1, t2, world, opts, &mut report); it returns in nanoseconds for typical inputs.

How it relates to the rest of Carthage

Suffete is being designed in isolation so that the type-system contract can be specified, exercised, and benchmarked without an analyzer in the loop. Once it is stable, it is intended to replace the type-system core inside Mago, the Carthage Software PHP toolchain.

Mago is one consumer. Anyone else writing a PHP analyzer in Rust is invited to be another. Suffete will not change its API on Mago's behalf; if it works for Mago, it works for everyone.

See also: Why a separate type system — the rationale for not building this inside an analyzer in the first place.

Why a separate type system

The PHP type system is a serious piece of work. It is not as small as it looks from the docs. There are at least a dozen design decisions a real-world analyzer has to make about how types interact, and most of them are the difference between a useful tool and a frustrating one.

This chapter is the case for solving those problems exactly once.

The problem with bundling

Most static analyzers grow a type system organically. The analyzer is built first, the type system is whatever the analyzer needed at the moment, and the seams between the two stop existing. Three things happen.

Specification rots. The type system never becomes a thing that can be written down; its rules live in the analyzer's source as comments-near-callsites and inferred-from-tests. New contributors patch existing rules without seeing the whole; the rules drift; nobody has a complete picture; and the only way to find out what the rules are is to read every callsite.

Cross-tool divergence. If you and your colleagues use a different analyzer for different reasons — one for IDE feedback, one for CI, one for refactoring — the answers you get to the same question differ. Each analyzer disagrees with the others on at least some of: literal-int range merging, the truthiness collapse on mixed, what happens to HasMethod under intersection with a class that lacks it, whether int refines float, whether a sealed shape with all-optional keys equals an empty shape, what variance Traversable<K, V> has on each parameter. In an ideal world all three tools agree. In practice they don't, and there is no shared spec to negotiate against.

Performance is invisible. The type system is sprinkled throughout the analyzer. There is no single hot loop to optimise. The cost of a subtype_of call is one HashMap allocation here, one clone() there, one quadratic loop somewhere else, none of which look bad locally. The only people who can move the needle are the ones with the entire architecture in their head, and they are usually busy.

The bet

The bet suffete makes is that all three of those problems get easier if the type system is outside the analyzer.

Specification becomes the artifact. You can run property tests on it. You can write a book about it. You can argue about whether a particular law should hold without simultaneously arguing about which line of which analyzer file should change.

Convergence becomes possible. Two analyzers that share the same type-system crate cannot disagree on the type-system answers. They can still disagree on flow analysis, on diagnostic policy, on what the user actually meant — those are analyzer concerns. But "is int<0,10> a subtype of int?" has one answer in Rust, and every consumer gets it.

Performance becomes a single target. The hot loops are visible. The benchmarks are runnable. The optimisations live in one place and benefit everyone.

What this costs

There are three real costs.

An API surface. Bundled type systems have no API; they have callsites. A standalone type system has a public Rust API that is permanent in the sense that breaking it breaks every consumer at once. Suffete acknowledges this: the API is currently unstable, broken often, and will stay that way until the design has settled. Once stable, breaking it requires a major version.

A World indirection. The type system needs to ask the analyzer about the codebase: class hierarchies, declared methods, template parameter bounds. Bundled type systems can reach directly into the analyzer's HashMaps. Suffete cannot; it goes through the World trait. The analyzer pays for one virtual dispatch per query. Suffete amortises this with caching where it can and goes for free wherever the question can be answered without the world.

Vocabulary alignment. The analyzer and the type-system crate have to agree on what a type is. If the analyzer wants to express a type the crate cannot represent, the analyzer is stuck. The mitigation is the completeness mandate: any PHP type that a real-world analyzer expresses — Mago, PHPStan, Psalm, Hack — must be representable in suffete. If you find one that isn't, that is a bug.

Why now, why Rust

Mago is a Rust toolchain. Doing the same work in two languages would be untenable; doing it in not-Rust would lose Mago's performance baseline. The same reasoning that pulled mago into Rust pulls suffete with it.

Beyond Mago, the long-term story is that other Rust-based PHP tools — IDE servers, refactoring tools, test runners that need to reason about types — should not have to re-implement this. There aren't many of them today. Suffete bets there will be more, and that they would all rather build on top of a settled spec than write their own.

See also: What suffete is — the surface area suffete deliberately covers (and what it deliberately doesn't).

Quick tour

This chapter walks through suffete by example. It is not a tutorial — the assumed audience already knows what a type system does — but a 10-minute look at the API surface so the rest of the book has a concrete shape to point at.

We will build types, ask the lattice questions about them, narrow under an assertion, and substitute through a generic. Every snippet in this chapter compiles against the suffete crate at HEAD.

Setting up

use suffete::{
    TypeBuilder, TypeId,
    prelude::{INT, STRING, NULL, FALSE, TYPE_MIXED, TYPE_NEVER},
    lattice::{self, LatticeOptions, LatticeReport},
    world::NullWorld,
};

TypeBuilder is the mutable builder for a type. prelude::* exposes the well-known constants — INT is the ElementId for the unconstrained integer; STRING is the unconstrained string; TYPE_MIXED is the TypeId of the universal top; TYPE_NEVER is the TypeId of the empty bottom. NullWorld is the trivial World that knows nothing about classes ; useful for examples that don't touch the codebase.

Constructing a union

The PHP type int|string is a two-element union:

let int_or_string: TypeId = TypeBuilder::new()
    .push(INT)
    .push(STRING)
    .build();

TypeBuilder::build interns the result, sorting and deduplicating the elements. Call it twice with the same elements (in any order) and you get the same TypeId back — handle equality is content equality.

let a = TypeBuilder::new().push(INT).push(STRING).build();
let b = TypeBuilder::new().push(STRING).push(INT).build();
assert_eq!(a, b);

Asking the lattice

Is int|string a subtype of int|string|null? Yes:

let nullable = TypeBuilder::new().push(INT).push(STRING).push(NULL).build();

let world = NullWorld;
let opts = LatticeOptions::default();
let mut report = LatticeReport::new();

assert!(lattice::refines(int_or_string, nullable, &world, opts, &mut report));

The reverse direction does not hold — null is not in int|string:

assert!(!lattice::refines(nullable, int_or_string, &world, opts, &mut report));

refines returns a boolean; the &mut LatticeReport collects structured side information about why a particular answer was reached (coercion edges, template defaults, etc.). For most questions you will ignore the report.

Overlap and disjointness

refines is one-directional. To ask "is there any value in both types?" use overlaps:

let int_only = TypeBuilder::new().push(INT).build();
let string_only = TypeBuilder::new().push(STRING).build();

assert!(!lattice::overlaps(int_only, string_only, &world, opts, &mut report));
assert!(lattice::overlaps(int_or_string, int_only, &world, opts, &mut report));

Meet, join, subtract

The three combinators return TypeId:

// meet (greatest lower bound, ⊓): the values both types share.
let common = lattice::meet(int_or_string, nullable, &world, opts, &mut report);
// common == int|string

// join (least upper bound, ⊔): the smallest type containing both.
let either = lattice::join(int_only, string_only, &world, opts, &mut report);
// either == int|string

// subtract: what remains after removing the second from the first.
let only_string = lattice::subtract(int_or_string, int_only, &world, opts, &mut report);
// only_string == string

Narrowing under an assertion

Suppose you have a value of type int|string|null and the analyzer has just observed an assertion that excludes null. The result is int|string:

let after_null_check = lattice::narrow(
    nullable,         // input type
    int_or_string,    // assertion: the value is one of these
    &world, opts, &mut report,
);
// after_null_check == int|string

Narrowing is the operation that consumes the assertions an analyzer extracts from if, instanceof, comparisons against constants, and so on.

Predicates

For top-level structural questions — "is every element in this type guaranteed truthy?", "does this type contain any object element?", "is this type a single literal that can be constant-folded?" — you call into the predicates module:

use suffete::predicates::{is_truthy, contains_null, is_constant_foldable};

assert!(!contains_null(int_or_string));
assert!(contains_null(nullable));

let one = TypeBuilder::new().push(suffete::ElementId::int_literal(1)).build();
assert!(is_truthy(one));
assert!(is_constant_foldable(one));

These do not need a World; they ask only about the structure of the type as you've handed it to them.

A generic, briefly

Generics get their own Part IV. One snippet to show the shape:

use suffete::{ElementId, template::substitute};
use suffete::element::payload::GenericParameterInfo;

// Suppose `T` is bound on class `Box<T>`. Substitute `int` for it everywhere
// in `T|null`.
let t = ElementId::generic_parameter("T", "Box", /* upper bound: */ TYPE_MIXED);
let t_or_null = TypeBuilder::new().push(t).push(NULL).build();

let int_t = TypeBuilder::new().push(INT).build();
let result = substitute(t_or_null, &|info: &GenericParameterInfo| {
    if info.name.as_str() == "T" { Some(int_t) } else { None }
});
// result == int|null

The substitution is capture-free. Recursion into nested types (object type-args, callable parameters, conditional then/else, etc.) is handled by the walker; you supply only the leaf decision.

Where to look next

If you want the data model first: Part II — The Type Universe starts with the element kinds and works through every payload variant.

If you want the operations first: Part III — The Lattice covers refines, overlaps, meet, join, subtract, narrow in order.

If you want to start writing code: Part V — Public API is the by-module API reference, and Part VI — Cookbook shows the API composed into common analyzer recipes.

See also: What suffete is and Glossary and notation.

Glossary and notation

The book uses a small but consistent set of mathematical and PL-theory terms. They are defined here once. Subsequent chapters use them without re-defining.

Notation

Symbol	Meaning
$\tau, \sigma, \rho$	Range over types.
$T, U, V$	Range over template parameters.
$\Gamma$	The program environment: the function from class, interface, trait, enum, function, and constant names to their declarations. In the API, this is the `World` trait.
$\Theta$	A template environment: a partial function from template parameters (qualified by their defining entity) to types.
$\Delta$	A defining entity: a class-like, function, method, or closure that introduces template parameters.
$\tau \mathrel{<:} \sigma$	$\tau$ refines $\sigma$: every value of type $\tau$ is also a value of type $\sigma$.
$\tau \equiv \sigma$	$\tau \mathrel{<:} \sigma$ and $\sigma \mathrel{<:} \tau$.
$\tau \sqcup \sigma$	Join: the least upper bound (union, with absorption rules applied).
$\tau \sqcap \sigma$	Meet: the greatest lower bound (intersection, with literal/range collapse).
$\tau \setminus \sigma$	Set difference: values in $\tau$ that are not in $\sigma$.
$\tau \mathrel{\#} \sigma$	$\tau$ and $\sigma$ are disjoint: $\tau \sqcap \sigma \equiv \bot$.
$\tau \Rightarrow \sigma$	A coercion edge: an admissible non-subtype edge in non-strict positions. PHP's runtime, on the boundary, allows e.g. `int` to flow into a `float` parameter. The coercion edge is recorded on the `LatticeReport`.
$\bot$	Bottom: the empty type, no values. PHP's `never`.
$\top$	Top: the universal type, all values. PHP's `mixed`.
$\Gamma \vdash D \prec C$	$\Gamma$ records that class-like $D$ extends, implements, or uses-as-trait $C$ (transitively).
$\sigma[T \mapsto \rho]$	Capture-free substitution of $T$ by $\rho$ in $\sigma$.
$\sigma\Theta$	Simultaneous substitution of $\sigma$ under the environment $\Theta$.
$\Theta_1 \circ \Theta_2$	Composition of template environments.
$\mathit{tparam}_C$	The indexed list of template parameters of class $C$, in declaration order.
$\mathit{ext}_{D \to C}$	The extension binding: the type arguments $D$ supplies to $C$ along the inheritance chain.
$\mathit{specialise}(C, T, D\langle\bar\rho\rangle)$	The type of $C$'s template parameter $T$ in the context of an instantiated descendant $D\langle\bar\rho\rangle$.
$\mathit{expand}(\tau)$	Resolution of non-structural forms in $\tau$ (aliases, references, derived types, conditionals).
$\mathit{standin}(\sigma, \rho, \Theta)$	An inference round: walk parameter $\sigma$ against argument $\rho$, accumulating bounds into $\Theta$.
$\mathit{infer}(\sigma, \Theta)$	Inferred-replacement of templates in $\sigma$ using a fully-determined $\Theta$.
$\mathit{narrow}(\tau, \pi)$	Apply the assertion $\pi$ to refine the type $\tau$.

Vocabulary

The terms below are used throughout the book in their PL-theory sense. Where a term has a more colloquial meaning that might confuse, the technical sense is noted.

Atom. The PL-theory name for the indivisible piece a type is built from. Suffete calls these Elements. The two terms are interchangeable; this book prefers "Element" everywhere except in references to literature.

Element. The indivisible unit a type is built from: a single integer literal, the unconstrained string, the named class Foo, the negation of null. See Elements.

Type. A set of values, expressed as a finite union of Elements. A singleton Element is a one-element type.

Element kind. The family or shape of an Element — int, string, named-class object, list, negation, intersection, conditional, and so on.

Refines / refinement / subtype. "Refines" is the verb form of "is a subtype of". $\tau$ refines $\sigma$ ($\tau \mathrel{<:} \sigma$) iff every value of $\tau$ is also a value of $\sigma$. "Refinement" is also used for the refinement axes that can be carried on certain element kinds (the truthiness, non-empty, non-null, isset-from-loop axes on mixed; the casing/numeric/non-empty axes on string); context disambiguates.

Overlap / disjoint. Two types overlap if their meet is inhabited (contains at least one value). They are disjoint if their meet is empty (uninhabited).

Inhabited / uninhabited. A type is inhabited if it contains at least one value. $\bot$ is the canonical uninhabited type, but some sealed shapes (e.g. an empty sealed object shape with required properties) and some Intersected wrappers (e.g. Foo & !Foo) are also uninhabited despite not being syntactically $\bot$.

Meet ($\sqcap$). The greatest lower bound of two types. Operationally: the largest type that refines both. PHP's intersection.

Join ($\sqcup$). The least upper bound of two types. Operationally: the smallest type that both refine into. PHP's union, with literal collapse and range merging applied.

Subtract ($\setminus$). Set-theoretic difference: $\tau \setminus \sigma$ is the values in $\tau$ that are not in $\sigma$. Used by the analyzer to eliminate a positive case (e.g. after !is_int($x)).

Narrow. The composite operation that combines an input type with an assertion type. Often equivalent to meet (positive assertions) or subtract (negative assertions) but with extra rules for cross-cutting concerns like the truthiness axes. See narrow.

Lattice. The set of types ordered by $\mathrel{<:}$, equipped with $\sqcap$ and $\sqcup$ as meet and join, with $\bot$ at the bottom and $\top$ at the top. The PHP type lattice is not a complete lattice in the strict mathematical sense — there are infinite ascending chains in literal-int ranges, for example — but the operations behave as a lattice does on the cases that matter.

Variance. The relationship between a generic parameter and the substitution direction. Covariant: a subtype of T in a subtype of Box<T>. Contravariant: a supertype of T. Invariant: only T itself. See variance.

Substitution. Replacing a template parameter with a concrete type, capture-free. Written $\sigma[T \mapsto \rho]$. See substitute.

Inference (standin). A round of bound-collection: walking a parameter type against an argument type and recording, for each free template, the bounds the argument imposes. Repeated inference rounds against multiple call-site arguments accumulate into a template environment $\Theta$. See standin.

Inferred-replacement. Applying a fully-determined template environment to a type to produce the inferred concrete result. Sometimes called "infer". See standin.

Inheritance specialisation. Computing what a descendant supplies to an ancestor's template parameter, given the descendant's instantiation. See specialise.

Expansion. Resolving the non-structural forms — aliases, references, member references, global references, conditionals, derived types — into structural ones. The lattice operations work on already-expanded inputs in most cases; expansion is the analyzer's responsibility. See expand.

Wrapper. A composite Element that contains other Elements: a negation (set complement) and an intersection (head plus a list of conjuncts). Wrappers exist in the Element universe; the union itself is in the Type universe. See wrappers.

Refinement axis. A boolean or low-cardinality property carried on top of an Element. The narrowed-mixed kind carries non-null, truthiness, empty, and isset-from-loop axes. The string kind carries casing, numeric-shape, non-empty axes. Axes never multiply the element count — they are bits on the same Element. See refinement axes.

Coercion edge. A non-subtype, non-overlap relationship that PHP's runtime allows on the parameter boundary. The classic example is int flowing into a float parameter. The lattice records the use of a coercion edge so the analyzer can warn about it.

Sealed. A keyed-array, list, or object shape is sealed when it asserts no entries beyond those it lists. Unsealed shapes admit additional entries.

Defining entity. The class-like, function, method, or closure that introduces a template parameter. Every template parameter is keyed by (name, defining_entity) so substitutions know which parameters they apply to.

World. The analyzer's codebase model, exposed to suffete as the source of class hierarchies, declared methods, declared properties, template parameter bounds. Suffete itself stores none of this.

See also: Quick tour for the notation in action, and Element kinds for the exhaustive list of element-kind names referenced throughout the book.

Elements: the indivisible types

Every type in suffete is a finite union of Elements. An Element is the smallest unit a type is built from — a single integer, the unconstrained string, the named class Foo, the negation of null. A type with one Element is a singleton; a type with several is a union of those.

The PL-theory term for an Element is atom. The two terms are interchangeable; the book uses "Element" to match the code. (See the glossary and the introduction note on naming.)

The split

A type lives at two levels:

Element — one indivisible piece. int is an Element. null is an Element. list<int> is an Element. Foo & Bar is an Element (an intersection wrapping a head and a conjunct).
Type — a union of Elements, plus a small bag of flow-flag metadata. int|string is a type with two Elements. list<int> is a type with one Element.

Singletons are still types: int (the type) is a one-element type containing the int Element. The Type/Element distinction is what lets the union machinery be straightforwardly set-of-elements without every operation having to handle "what if this is one element" specially.

graph TD
    T["Type<br/>(union of Elements)"]
    T --> E1["Element #1"]
    T --> E2["Element #2"]
    T --> E3["Element #3<br/>(may itself wrap further<br/>Elements via negation/intersection)"]

Element kinds, by family

The universe groups into families. The next chapters of Part II cover each family in detail; this section is a one-screen index.

Scalars

The PHP scalar primitives, plus their unions and refinements.

int, float, string, bool, true, false
numeric (the disjoint union int | float | numeric-string)
scalar (the disjoint union bool | int | float | string)
array-key (the disjoint union int | string)
class-string, interface-string, enum-string, trait-string

See scalars and class-like strings.

Special elements

The well-known landmarks that are not values themselves but anchors in the lattice.

null (the value null)
void (the absence of a return; falsy and null-equivalent at value sites — a void function returns null to its caller, so void collapses to null in any non-degenerate union)
never (the empty type, $\bot$ — a function that never returns has return type never)
mixed (the unconstrained universe top, $\top$, plus its narrowed variants carrying truthiness/non-null/non-empty axes)
placeholder (an inference hole)

See special elements.

Object family

The forms that describe an object reference at runtime.

Named classes (Foo, Foo<int>, with optional static / $this modality)
The unconstrained object
Enums (whole Status, or a single case Status::Active)
Structural shapes (object{name: string, age: int})
has-method<m> and has-property predicates

See objects, enums, and structural object types.

Array / list family

PHP's two array views, in their generic and shape forms.

Keyed arrays: array<K, V>, sealed array{a: int, ...}, the empty array{}
Lists: list<T>, non-empty-list<T>, sealed list{0: int, ...}

See arrays and lists.

Iterable / callable

iterable<K, V>
callable, plus signatures and closures

See iterables and callables.

Resources

resource, open-resource, closed-resource, plus optional named kinds (resource<curl>)

See resources.

Wrappers

The two combinators that contain other Elements.

Negation: $\neg \tau$
Intersection: $H \sqcap C_1 \sqcap \dots \sqcap C_n$

See wrappers.

Unresolved elements

Forms that name a future resolution rather than carrying its result. Lattice operations on these route through expansion.

Aliases (a type alias, name-only)
References (T referenced by name, before binding)
Member references (Foo::T)
Global references
Conditional types (T extends U ? X : Y)
Derived types (key-of<T>, value-of<T>, T[K], ...)
Inference variables (analyser-introduced placeholders)

See unresolved elements.

Generics

Free template parameters (T inside a generic class declaration, with its constraint, defining entity, and qualifier)

See templates and Part IV.

Closure

The Element universe is closed under the lattice operations in the following sense: every well-formed application of meet, join, subtract, or narrow to two types produces another well-formed type. The result may be $\bot$ (never) or $\top$ (mixed), or it may be a wrapper (negation / intersection) — but it is always a type that can itself be the input to further operations.

The unresolved kinds break this closure on the lattice side: the lattice is not directly defined on aliases, references, conditionals, etc. The contract is that those are expanded out before the lattice is asked anything important. See expand for the operation, and unresolved elements for the kinds.

A worked example

The PHP type:

non-empty-list<int>|null

is a type with two Elements: the non-empty-list<int> element and the null element. The order in which they were constructed is irrelevant; the universe canonicalises unions so that any two paths to the same value-set produce the same type.

See also: Element kinds for the exhaustive list with one-line descriptions; Constructing types for the API used to build them.

Scalars

The scalar family is PHP's primitive value space: integers, floats, strings, booleans, plus the cross-cutting unions numeric, scalar, and array-key. Every scalar admits refinement: a literal value, a bounded range, a string with declared casing or non-empty flag, a boolean specialised to true or false.

Integers

int is the family for some integer. Four refinement levels:

PHP-side	Denotes
`int`	Any integer.
`literal-int`	Some integer literal (the value is unknown, but the analyser knows it is a literal, not a computed result).
`int(7)`, `7`	The single integer `7`.
`int<0, 100>`, `int<-∞, 0>`, `int<0, ∞>`	A closed integer range (bounds may be open on either side).

Subtyping among `int` variants

graph BT
    L7["int(7)"] --> R0_10["int<0, 10>"]
    L7 --> UL["literal-int"]
    R0_10 --> Pos["int<0, ∞>"]
    R0_10 --> U["int"]
    Pos --> U
    UL --> U

int(n) refines int<lo, hi> iff $n \in [lo, hi]$.
int<lo', hi'> refines int<lo, hi> iff $[lo', hi'] \subseteq [lo, hi]$.
Every int variant refines plain int.
int(n) refines literal-int; literal-int refines int.

Range arithmetic

The lattice operations on ranges are interval arithmetic with optional infinite bounds. Meet of two ranges is their intersection (becomes never if disjoint); join is the smallest range containing both. Ranges of one literal collapse to a literal; ranges that grow past a threshold widen to int.

Floats

Three refinement levels:

PHP-side	Denotes
`float`	Any float.
`float(3.14)`, `3.14`	The single float literal. `0.0` and `-0.0` are distinct under PHP's identity comparison.
(analyser-internal)	A float that is provably non-zero. Used by the truthiness machinery rather than appearing in user-written types.

There is no PHP-side range form for floats — the precision questions involved (denormals, NaN, signed zero) make ranges more trouble than they are worth.

Strings

The richest scalar. A string carries a literal slot plus a small bag of refinement flags:

Flag	PHP-side	Meaning
`non-empty`	`non-empty-string`	The string is at least one character.
`truthy`	`truthy-string`	Truthy at runtime (excludes `""` and `"0"`).
`lowercase`	`lowercase-string`	Every character is lowercase.
`uppercase`	`uppercase-string`	Every character is uppercase.
`numeric`	`numeric-string`	Parses as an integer or float.

These are axes — combinations are valid: non-empty-lowercase-string carries both bits.

Subtyping among `string` variants

graph BT
    L["literal 'foo'"] --> NEL["non-empty-lowercase-string"]
    NEL --> NES["non-empty-string"]
    NEL --> LS["lowercase-string"]
    NES --> S["string"]
    LS --> S

A literal refines an unspecified string with flags iff the literal value satisfies every flag (e.g. literal "foo" refines non-empty-lowercase-string).
Unspecified-with-flags $F'$ refines unspecified-with-flags $F$ iff $F \subseteq F'$ (more flags = stricter).
truthy implies non-empty (any non-"" non-"0" string is both).

Numeric strings

numeric-string is the dominator that takes a string into the same family as int and float. It has two refining ancestors: string (every numeric-string is a string) and numeric (every numeric-string is numeric). The lattice handles both directions.

Booleans

Three landmarks:

PHP-side	Denotes
`true`	The value `true`.
`false`	The value `false`.
`bool`	Either of `true` or `false`.

Subtyping:

true refines bool.
false refines bool.
true and false are disjoint.

The lattice canonicalises true | false to bool (join), and decomposes bool into true | false when subtracting (e.g. bool \ true = false).

The cross-cutting unions

Three forms are true unions — disjoint covers over multiple primitive families, treated as a single name for compactness, and decomposed by the lattice when a question requires it.

`scalar`

$$\mathit{scalar} = \mathit{bool} \cup \mathit{int} \cup \mathit{float} \cup \mathit{string}$$

Every bool, int, float, or string refines scalar.

`numeric`

$$\mathit{numeric} = \mathit{int} \cup \mathit{float} \cup \mathit{numeric{-}string}$$

Same dominator behaviour as scalar, but over the smaller cover. numeric is not a subset of scalar directly — they overlap on int and float but numeric-string is a refinement of string rather than a direct member of scalar.

`array-key`

$$\mathit{array{-}key} = \mathit{int} \cup \mathit{string}$$

Used in PHP's array keys and as the bound on most generic key parameters. Subsumes every int and string variant.

A worked subtype query

function f(int|string $x): bool { /* ... */ }

f(7);            // OK:   int(7) <: int|string
f("hello");      // OK:   literal "hello" <: int|string
f(3.14);         // FAIL: float is disjoint from int|string

See also: Refinement axes for the detail on string axes and the analogous mixed axes; class-like strings for the strings that name classes; refines for the subtype rules.

Special elements

A handful of Element kinds are not values in the ordinary sense; they are landmarks. They mark the bottom and top of the lattice, the absence of a value, the presence of a value that has not yet been resolved, and the universe top with optional axis refinements layered on top. They have outsized impact on the lattice rules, so this chapter treats them in one place.

`never` ($\bot$)

The empty type. No values inhabit never. PHP-side: the return type of a function that never returns (always throws or exits, an infinite loop).

The bottom of the lattice. Refines every type.
Disjoint from every type including itself — there is no value that inhabits never, so $\mathit{never} \sqcap \mathit{never} = \bot$ and $\mathit{overlaps}(\mathit{never}, \mathit{never}) = \mathit{false}$.
$\mathit{never} \sqcup \tau = \tau$ for every $\tau$. The join unit.
$\mathit{never} \sqcap \tau = \mathit{never}$ for every $\tau$. The meet annihilator.

`mixed` ($\top$)

The universal top. Some value, no constraints. PHP-side: mixed. It is the type every value-bearing type refines.

mixed carries optional refinement axes that narrow it without changing its kind:

is_non_null — the value is not null. PHP-side: non-null mixed.
truthiness — Truthy, Falsy, or undetermined. PHP-side: truthy mixed / falsy mixed.
is_empty — analyser-internal marker for "the value is falsy".
is_isset_from_loop — analyser-internal marker for "this value flowed through a loop body".

A narrowed mixed (one or more axes set) sits between vanilla mixed and the kinds that imply the axis. non-null mixed sits above every non-null-bearing kind.

graph BT
    Int[int] --> NN["non-null mixed"]
    Str[string] --> NN
    True_[true] --> Truthy["truthy mixed"]
    Truthy --> NN
    NN --> Mixed[mixed]
    Null --> Mixed
    Void --> Mixed

The short version of the rules: a type refines mixed-with-axes iff every axis the container constrains is implied by the input. int refines non-null mixed because int is structurally non-null. string refines truthy mixed only when the string is also known truthy.

`null`

The single value null.

Falsy. Disjoint from every kind that asserts non-null (most kinds — int, string, Foo, etc. — and non-null mixed).
$\mathit{null} \sqcup \tau$ produces a nullable type when $\tau$ does not already admit null.
The lattice's narrow machinery recognises is_null($x) / !is_null($x) assertions specifically.

`void`

The absence of a return value. PHP-side: a function declared with return type void. The runtime returns the value null from such a function, so at every value-flow site void is observationally null ; the distinction is purely a declarative constraint on the function body (a : void return forbids return $expr).

Falsy. Does not refine non-null mixed (because the runtime value is null).
Preserved alone: a singleton union containing only void round-trips as void, so a : void return-type annotation survives interning.
Canonicalised to null in any non-degenerate union: void | T (for any value-bearing T) becomes T | null ; void | null becomes null ; void | never becomes void (the never is dropped first, leaving void alone). This matches PHP runtime semantics: callers of a void function see null, never a phantom void value.

In the original mago bug that prompted this rule, a match arm produced true | void and the analyser collapsed it to true, dropping the null branch entirely. The correct answer is true | null, which is what the lattice now produces.

`placeholder`

The inference-time hole. Marks a position where the analyser knows a type will exist but has not yet committed to one. Conceptually _ in TypeScript or ? in some inference contexts.

The lattice treats placeholder as $\top$ for the purposes of refines and overlaps ; everything refines placeholder and placeholder refines everything. This keeps a partially-inferred type from blocking the analysis until inference completes.
Should not appear in a finalised type. If you see it after expansion, the analyser failed to commit.

Summary

Element	Inhabited?	Position in lattice
`never`	no	$\bot$
`mixed` (vanilla)	yes (every value)	$\top$
`mixed` (narrowed)	yes	between $\top$ and the kinds that imply the axis
`null`	yes (one value)	strict subtype of `mixed`
`void`	yes (the runtime value `null`)	strict subtype of `mixed`; preserved alone, canonicalised to `null` in any union of length > 1
`placeholder`	by convention, treated as $\top$	inference-only

Why `mixed` carries axes instead of being wrapped

A natural alternative would be to express narrowed mixed as mixed & non-null ; an Intersected of mixed and a hypothetical NonNull element. Suffete instead puts the axes directly on mixed for two reasons:

No rule duplication. With axes attached to the type, the mixed-family rules are local. With a wrapper, every other type would need to know how to interact with that wrapper.
Compactness. non-null truthy mixed is one type carrying two bits, not a chain of three nested wrappings.

The trade-off is that mixed has special rules in the lattice. Those rules are tested against the algebraic-law battery so the elaboration does not introduce soundness drift.

See also: Refinement axes for the full semantics of the mixed and string axes; refines for the subtype rules; glossary for the relation $\bot$ to never and $\top$ to mixed.

Objects, enums, and structural object types

PHP objects can be referenced six ways. Five describe a runtime object; one (has-method/has-property) describes only a structural property the runtime object must have.

PHP-side	Denotes
`Foo`, `Foo<int>`, `Foo & Bar`	A named class, with optional template arguments.
`object`	Any object, no class commitment.
`Status`, `Status::Active`	A backed or pure enum, possibly narrowed to a single case.
`object{name: string, age: int}`	A structural object: any object that has these properties.
`has-method<save>`	Anything that declares method `save`.
`has-property<id>`	Anything that declares property `id`.

This chapter describes what each means. The lattice rules (subtyping, meet, etc.) live in refines and the operation chapters.

Named classes (`Foo`)

A reference to a specific class, optionally instantiated with template arguments and/or marked with a late-static modality (static, $this).

Type arguments

When a class is generic (@template T), the user supplies the arguments at the use site. If the class declares more parameters than the user supplies, the missing ones are filled with their declared upper bounds and marked as default-filled so downstream variance checks know to skip them. If the user supplies more than the class declares, the surplus is dropped (truncated to the declared arity). This collapses Box<int, string> and Box<int> to the same form when Box declares one parameter.

Modality flags

Three flags govern the late-static modality:

is_this — $this (the receiver). Strongest.
is_static — the late-static class. Weaker than is_this.
remapped_parameters — analyser-internal marker for an inheritance-binding remap.

is_this implies is_static. The modality check at refinement time:

A container marked is_this accepts only is_this inputs.
A container marked is_static accepts is_static or is_this inputs.
A plain Foo accepts any of the above.

Intersections

Foo & Bar is expressed via the Intersected wrapper, with Foo as head and Bar as a conjunct.

`object` (the unconstrained form)

The "some object, class not committed". Used when the analyser knows the value is an object but has not pinned down a class. Refines mixed; refined by every named class, enum, structural shape, and has-* form.

Enums

PHP enums can be backed (enum Status: string { case Active = 'active'; ... }) or pure (enum Color { case Red; ... }). The type system models them with two forms:

Status — the whole enum (any case).
Status::Active — the single case.

A vanilla Status is the join of all its cases. A specific case is a strict subtype of the enum. Two cases of the same enum are disjoint.

An enum case behaves structurally like an object shape with a name property (always present) and a value property (present only on backed enums). The lattice synthesises that shape on demand, so Status::Active refines object{name: string} automatically.

Structural object types (`object{...}`)

A structural object type is a list of declared properties (each with a name, a type, and an optional flag) plus a sealed bit.

Sealed vs unsealed

An unsealed shape object{name: string, ...} is satisfied by any object that has at least the listed properties (with refining types). Extra properties are allowed.
A sealed shape object{name: string} is satisfied by an object that has exactly those properties.

The sealed-vs-sealed rule mirrors the keyed-array rule (see arrays): every required key in the container must be present in the input shape with a refining value, and the input cannot introduce required keys the container does not list.

Optional properties

A property can be marked optional (name?: string). Required is a strict subtype of optional ; the input is more committal.

Intersections

object{x: int} & has-method<draw> is the Intersected wrapper with the shape as head and has-method<draw> as conjunct.

`has-method<m>`

PHP-side: has-method<m>. Denotes "anything that declares a method called m". No commitment to a specific class.

A refines has-method<m> iff one of:

A is has-method<m> itself (the same method name).
A is a named class C and the analyser confirms C declares (or inherits) m.
A is an enum and the enum declares m.

The lattice asks the analyser's codebase model. If it does not know, the conservative answer is "no" ; better to over-reject than to under-reject.

Multiple has-method constraints chain through the Intersected wrapper: has-method<save> & has-method<delete>.

`has-property`

Symmetric to has-method, with p as the property name. The lattice handles the value property specially for backed enums (see above). Beyond that, the rules are the same.

How the kinds compose

graph BT
    HM["has-method&lt;save&gt;"] --> O[object]
    HP["has-property&lt;id&gt;"] --> O
    Shape["object{x: int}"] --> O
    Foo["Foo"] --> O
    StatusActive["Status::Active"] --> Status["Status (enum)"]
    Status --> O
    O --> Mixed[mixed]

Every member of the object family refines object; object refines mixed.

Within the family:

Foo refines object{x: int} if the analyser records that Foo has a property x whose declared type refines int.
Foo refines has-method<m> if the analyser records that Foo (or an ancestor) declares m.
object{x: int, y: string} refines object{x: int} (sealed-vs-unsealed: extra key allowed).
object{x: int} does not refine object{x: int, y: string} ; the input is missing y.

A worked example

function f(Stringable $x): string {
    return $x->__toString();
}

class Foo {
    public function __toString(): string { return 'foo'; }
}

f(new Foo()); // OK if Foo implements Stringable

The parameter is the named class Stringable. The argument is the named class Foo. The lattice asks the analyser's codebase model: does Foo descend from Stringable? If yes, the call type-checks.

A subtler example with a structural type:

/**
 * @param object{handle: resource} $obj
 */
function process($obj): void { /* ... */ }

The parameter is a structural object{handle: resource}. Passing a Database instance succeeds iff the analyser confirms Database::handle exists with a type that refines resource.

See also: refines for the full subtype rules across the object family; wrappers for the Intersected representation.

Arrays and lists

PHP has one runtime collection type — the array — but the type system distinguishes two views of it: keyed arrays (the general form, with key and value parameters) and lists (int-keyed, contiguous, with a single element type). Both views support generic and shape forms. Shapes can be sealed or unsealed.

PHP-side	Denotes
`array<K, V>`, `array{a: int, b?: string, ...<K, V>}`, `array{}`	Keyed array.
`list<T>`, `non-empty-list<T>`, `list{0: int, 1: string, ...<T>}`	Int-keyed contiguous list.

This chapter describes what each means. The lattice rules for subtyping, meet, join, and so on are in refines, meet, and join.

Keyed arrays

A keyed array can be in one of four states:

State	PHP-side	Notes
Generic	`array<K, V>`, `non-empty-array<K, V>`	`K` and `V` are type parameters. Any entries allowed, all keys constrained by `K`, all values by `V`.
Unsealed shape	`array{a: int, ...<K, V>}`	Specific known entries plus a rest type for unknown entries. The bare `...` is shorthand for `...<array-key, mixed>`.
Sealed shape	`array{a: int}`	Specific entries; no extras.
Empty	`array{}`	The single empty array.

The unsealed shape's rest parameters constrain the additional entries beyond the listed ones ; they say nothing about the listed entries' types. So array{a: int, ...<string, bool>} means: the entry a is an int, and any other entries have string keys and bool values. The bare ... form is the common case where the user does not want to constrain the extras and is exactly equivalent to ...<array-key, mixed>.

Known entries

A shape's entries each carry:

A key (an integer literal like 0, a string literal like "name", or a class constant like Foo::KEY).
A value type.
An optional flag (PHP ?: array{name?: string}).

The lattice's refines chapter has the full required-vs-optional rules.

Sealed vs unsealed

A sealed shape commits to having no entries beyond those listed. An unsealed shape allows additional entries, with their key and value types constrained by the rest parameters (...<K, V>, defaulting to ...<array-key, mixed>). The lattice consequences:

array{a: int, b: string} (sealed) refines array{a: int} (sealed) only if the input has exactly the keys the container does (extras are forbidden).
array{a: int, b: string} (sealed) refines array{a: int, ...} (unsealed) by ignoring the extras.
array{a: int, b: string} refines array{a: int, ...<string, bool>} only if every extra entry beyond a has a string key and a bool value ; here b: string has a string key but a string value, so it fails the rest constraint.
array{} refines every keyed-array container that admits empty (i.e. is not non-empty-array).

`non-empty-array`

The non-empty axis asserts that the array has at least one entry. A shape with at least one required entry is automatically non-empty; the explicit form exists for the generic case.

Lists

A list is structurally an int-keyed array with contiguous keys starting at 0. Suffete carries it as a separate family because PHP analysers commonly distinguish them and many operations are sharper in list-shape (the keys are positional, not arbitrary).

State	PHP-side	Notes
Generic	`list<T>`, `non-empty-list<T>`	Single element type. Any number of entries allowed, all values constrained by `T`.
Unsealed shape	`list{0: int, ...<T>}`	Known prefix plus a rest element type. The bare `list{0: int, ...}` is shorthand for `...<mixed>`.
Sealed shape	`list{0: int, 1: string}`	Fixed-shape list, no extras.
Empty	`list{}`	The single empty list (equivalent to `array{}` viewed as a list).

Known elements

Indices start at 0; required entries must be contiguous from 0. Optional entries can be skipped only at the tail (PHP supports [int, int, ?string] but not [int, ?int, int]).

Cross-family relationships

list<int> refines array<int, int> ; lists are int-keyed arrays.
non-empty-list<int> refines list<int> ; non-empty is a strict refinement.
list{0: int, 1: string} (sealed) refines list<int|string> ; sealed-vs-generic with element-type covering all known types.
An empty sealed list (list{}) is uninhabited if combined with non-empty.

How keyed arrays and lists relate

graph BT
    SealList["list{0: int, 1: int}"] --> NonEmptyList["non-empty-list&lt;int&gt;"]
    NonEmptyList --> List["list&lt;int&gt;"]
    List --> KArray["array&lt;int, int&gt;"]
    SealedKArray["array{0: int, 1: int}"] --> KArray
    KArray --> Empty["array&lt;mixed, mixed&gt;"]

A list refines a keyed-array of compatible parameters when the container's key parameter accepts int and the value parameter accepts the list's element type.

A worked example

/**
 * @param array{name: string, age: int} $u
 */
function f(array $u): void { /* ... */ }

f(['name' => 'Hannibal', 'age' => 24]);  // OK: matches the shape
f(['name' => 'Hannibal']);               // FAIL: missing required 'age'
f(['name' => 'Hannibal', 'age' => 24, 'rank' => 'general']); // FAIL: extra key, sealed

The parameter is a sealed keyed-array shape; the argument types are inferred (or user-annotated) sealed shapes. The sealed-vs-sealed rule fires and decides each case.

Intersections

array<K, V> & SomeConjunct and list<T> & SomeConjunct are expressed via the Intersected wrapper.

See also: refines for the per-pair subtype rules; iterables and callables for how arrays relate to iterable<K, V>; meet for sealed-shape intersection.

Iterables and callables

Two PHP types that are not values themselves but commitments about what a value supports: iterable<K, V> says "this can be foreached over", and callable says "this can be invoked".

`iterable<K, V>`

PHP-side: iterable<K, V>. Denotes any value that can be iterated with foreach: an array, a Traversable, a generator, an IteratorAggregate, and so on.

iterable is its own family because array <: iterable and Traversable <: iterable both hold, but iterable does not commute with arbitrary classes; it must be tracked as its own thing.

Subtyping

iterable is the supertype of every array, list, and Traversable (or descendant). The variance is covariant on K and V: iterable<int, Foo> refines iterable<int, object>.

The lattice handles:

array<K, V> refines iterable<K, V> ; arrays are iterable, with the same key/value parameters.
list<T> refines iterable<int, T> ; lists key by int.
Traversable<K, V> (or any descendant) refines iterable<K, V> ; the analyser supplies the variance via the codebase model.

Note that iterable sits above Traversable<K, V> ; not all iterables are traversables (an array is iterable but not a Traversable).

Intersections

iterable<K, V> & Countable is expressed via the Intersected wrapper.

`callable`

PHP-side: callable, plus the more refined callable(int): string, Closure(int): string, and so on.

Three variants:

Bare callable — no signature commitment. Any callable: a function name string, a Closure, an [$obj, 'method'] array, an __invoke-bearing object.
callable(...): T — the open form with a declared signature. Includes any of the above whose effective signature refines the declared one.
Closure(...): T — strictly the Closure class instance with the declared signature.

Signatures

A signature has parameters, a return type, and a throws clause:

Parameters: each one carries a type, plus optional / variadic / by_reference flags. The optional name is for diagnostic alignment; it does not affect typing.
Return type: covariant.
Throws: declared exception types (PHPDoc).

Variance

The standard rules for callables:

Return type is covariant. A callable returning Foo refines a callable returning Bar iff Foo refines Bar.
Parameter types are contravariant. A callable taking Bar refines a callable taking Foo iff Foo refines Bar.
Optional / variadic / by-reference must align in the obvious way.

How `iterable` relates to `array` and `list`

A partial order:

graph BT
    SealedList["list{0: int}"] --> List["list&lt;int&gt;"]
    SealedArray["array{0: int}"] --> KArr["array&lt;int, int&gt;"]
    List --> KArr
    KArr --> Iter["iterable&lt;int, int&gt;"]

A list refines a keyed-array of compatible parameters; a keyed-array refines an iterable of compatible parameters.

See also: arrays and lists; refines for the variance rules.

Resources

PHP's resource is a runtime handle to an external resource: a file, a database connection, a stream, a curl handle. The type system models this as the resource family, with optional refinements for the resource's open-or-closed state and its named kind.

The two axes

A resource carries two refinement axes:

State

Three values:

Unspecified — the analyser has no knowledge of openness. PHP-side: resource.
Open — the resource is open. PHP-side: open-resource.
Closed — the resource is closed. PHP-side: closed-resource.

Open and Closed are disjoint. Unspecified is the join.

The state tracks the result of fclose($f) and similar calls in the analyser: after fclose($f), the analyser narrows $f from open-resource to closed-resource.

Kind

An optional named kind — resource<curl>, resource<gd image>, resource<stream>. The name corresponds to PHP's get_resource_type($r) return value.

A resource without a named kind admits any kind. With a name, it admits only that specific kind.

Subtyping

graph BT
    OpenCurl["open-resource&lt;curl&gt;"] --> Open["open-resource"]
    OpenCurl --> Curl["resource&lt;curl&gt;"]
    Open --> R[resource]
    Curl --> R
    Closed["closed-resource"] --> R

Open refines Unspecified ; Closed refines Unspecified ; both refine the unrefined.
A named kind refines the unnamed; open-resource<curl> (state = Open, kind = curl) refines both open-resource and resource<curl>.

A worked example

/**
 * @param open-resource $r
 */
function read($r): string { /* ... */ }

$f = fopen('/tmp/x', 'r');     // open-resource<stream>
read($f);                      // OK: open-resource<stream> <: open-resource
fclose($f);                    // narrows $f to closed-resource<stream>
read($f);                      // analyser warns: passed a closed resource

The lattice answers each refines query straightforwardly. The narrowing on fclose is the narrow operation with an analyser-supplied assertion that the resource is now Closed.

See also: refines for the per-pair rules; narrow for state-flip narrowing.

Class-like strings

A class-like string is a string value whose runtime content is the fully-qualified name of a class, interface, trait, or enum. PHP-side: class-string, class-string<Foo>, interface-string, enum-string, trait-string. The type system uses these to track the connection between a string-as-name and the class it refers to.

The two axes

Every class-like string has two axes:

Kind

Five values, one per family:

class — only class names (excluding interfaces, traits, enums).
interface — only interface names.
trait — only trait names.
enum — only enum names.
unspecified — any of the above.

unspecified is the join of the others. The lattice's refines chapter has the per-pair rules.

Specifier

How specific the class-name is:

Unspecified — the class-string form, no commitment to a particular class.
Literal — the class-string<Foo> form, where the FQN is known.
Of a type — class-string<T> where T is some bounded form (the analyser substitutes through this when T is bound).
Generic — the ::class lookup on a generic parameter, before the parameter is substituted.

Subtyping

A class-like string refines another by both axes:

graph BT
    L["class-string&lt;App\\Foo&gt;"] --> ClassUnspec["class-string"]
    ClassUnspec --> AnyUnspec["any class-like-string"]
    InterfaceUnspec["interface-string"] --> AnyUnspec
    AnyUnspec --> Str[string]

The kind axis: class $\mathrel{<:}$ unspecified, interface $\mathrel{<:}$ unspecified, etc. Different specific kinds are disjoint (a class-string is not an interface-string).
The specifier axis: a literal class-string<Foo> refines a literal class-string<F> iff the analyser confirms Foo descends from F.

The cross-cutting check is that a class-like string is also a string ; every class-like string refines string.

Why a separate form

PHP's class-string<Foo> is not just "a string that happens to spell Foo"; it carries the analyser's knowledge that the string can be passed to new, used in is_a(), paired with instanceof, and so on. Modelling it as a refinement of a plain string would lose the connection. Modelling it as its own form preserves the connection and lets the lattice apply class-hierarchy reasoning where needed.

A worked example

/** @param class-string<Throwable> $cls */
function throw_named(string $cls): never {
    throw new $cls('boom');
}

throw_named(RuntimeException::class);  // OK
throw_named(stdClass::class);          // FAIL: stdClass does not descend from Throwable

The parameter is class-string<Throwable>. The arguments are class-string<RuntimeException> and class-string<stdClass>.

The lattice asks the analyser: does RuntimeException descend from Throwable? (yes). Does stdClass? (no). It returns the matching answer.

Crossing into `string`

A class-string<Foo> refines string directly. A string does not refine class-string<Foo> — most strings are not class names.

The other direction matters for narrowing: an analyser observing class_exists($s) returning true can narrow $s from string down to class-string. The narrow operation handles this when the analyser supplies the assertion.

See also: scalars for the string family; refines for the per-pair rules.

A refinement axis is a boolean (or low-cardinality) property carried directly on a type, not as a separate type. The lattice treats axes as bits on the same form: setting an axis tightens the type without changing what it is.

Axes are how suffete avoids exploding the universe. There is no separate "non-null mixed" form distinct from mixed ; mixed carries a non-null axis. There is no separate "non-empty string" form distinct from string ; string carries a non-empty axis. The lattice rules work over axes by checking that every axis the container constrains is implied by the input.

The general rule

Given a form $F$ that carries an axis $a$, and a container of form $F$ with $a$ set:

An input of the same form $F$ refines the container iff its $a$ is also set.
An input of a different form $F'$ refines the container iff $F'$ structurally guarantees $a$.

The "structurally guarantees" half does the work. int structurally guarantees non-null (every integer is not null), so int refines non-null mixed even though int is not mixed.

Axes on `mixed`

Four axes:

Axis	PHP-side	Refinement
non-null	`non-null mixed`	The value is not `null`.
truthy	`truthy mixed`	The value is truthy at runtime.
falsy	`falsy mixed`	The value is falsy at runtime.
empty	analyser-internal	The value is `falsy` (paired with the `falsy` truthiness).
isset-from-loop	analyser-internal	The value flowed through a loop body.

Combinations are valid. A truthy value cannot be null (PHP), so truthy mixed already implies non-null.

Structural implications

What other types satisfy each axis:

non-null is implied by every form except null, void, and mixed without the non-null axis. So int, string, Foo, array, list<int>, resource, callable all refine non-null mixed.
truthy is implied by:
- true
- any object (objects are truthy in PHP)
- any resource
- non-zero integer literals; integer ranges that exclude 0
- non-zero float literals
- string literals that are not "" or "0"
- truthy-string and non-empty-string (with the truthy axis explicitly set)
- non-empty-array, non-empty-list
falsy is implied by:
- false, null, void
- 0, 0.0 (and -0.0)
- "" and "0"
- the empty array, the empty list
empty is the same as falsy for refinement purposes.
isset-from-loop propagates only on mixed itself; no other form structurally implies it.

Axes on `string`

Five axes:

Axis	PHP-side	Refinement
non-empty	`non-empty-string`	Length ≥ 1.
truthy	`truthy-string`	Truthy at runtime (excludes `""` and `"0"`).
lowercase	`lowercase-string`	Every character is lowercase.
uppercase	`uppercase-string`	Every character is uppercase.
numeric	`numeric-string`	Parses as int or float.

Combinations are valid: non-empty-lowercase-string carries both axes.

Refinement direction: more axes = stricter. non-empty-lowercase-string refines non-empty-string, lowercase-string, and string. The same rule as for mixed: every axis the container constrains must be carried by the input.

Structural implications

A string literal carries every axis it satisfies:

non-empty iff the literal is not "".
truthy iff the literal is not "" and not "0".
lowercase iff every character is lowercase.
uppercase iff every character is uppercase.
numeric iff the literal parses as an int or float.

Axes on resources

resource carries two refinements:

state — Open, Closed, or unspecified.
kind — an optional named kind (curl, gd image, ...).

Subtyping is the conjunction: state must match (or be unspecified on the container) and kind must match (or be unspecified on the container). See resources.

Axes on objects

A named class carries three modality flags:

$this — the object is exactly the receiver.
static — the object is the late-static class.
A class can be analyser-known final; the universe does not flag this on the type itself but the lattice consults the analyser at refinement time.

$this implies static. The modality check at refinement time:

A container marked $this accepts only $this inputs.
A container marked static accepts static or $this inputs.
A plain Foo accepts any of the above.

A structural object shape carries sealed as its main axis (see objects).

Axes on arrays and lists

Both kinds carry:

non-empty — at least one entry.
sealed — the shape commits to having no entries beyond those listed.

Both follow the standard "more axes = stricter" rule.

Why axes, not wrappers

A natural alternative would be to wrap a type in a refining wrapper: an outer "non-null" wrapping a mixed. Suffete uses axes for two reasons:

No rule duplication. With axes attached to the type, the rule for "does this axis hold?" lives in one place. With wrappers, every other type would need to know how to interact with the wrapper.
Compactness. A non-null truthy lowercase string is one type carrying three bits, not a chain of three nested wrappings.

The trade-off is that the forms carrying axes have more elaborate rules. Those rules are tested against the algebraic-law battery (laws) so the elaboration does not introduce soundness drift.

A worked example

The PHP type truthy lowercase non-empty-string carries three string axes: non-empty, truthy, and lowercase.

Refinement:

The literal "foo" refines it ; "foo" is non-empty, truthy, and lowercase.
"FOO" does not — fails lowercase.
"" does not — fails non-empty and truthy.
"0" does not — fails truthy.
non-empty-string (only one axis) does not — missing lowercase and truthy.
lowercase-string (only one axis) does not — missing non-empty and truthy.

See also: scalars for the string family in full; special elements for the mixed axes; refines for the per-axis lattice rules.

Wrappers: negation and intersection

Two forms wrap other types: they exist to compose the universe with set complement and set intersection. They are the universe's two combinators, and they account for most of the lattice's complexity ; whenever the rules need to do something nontrivial, it's usually because a wrapper is involved.

Form	PHP-side	Denotes
Negation	`!T`, `!int`, `!Foo`	Set complement: every value not in $T$.
Intersection	`Foo & Bar & Baz`	Set intersection of two or more types.

Negation

PHP-side: !T. Denotes the set complement of $T$ — every value in the PHP universe except those in $T$.

Subtyping

The standard rules:

$\mathit{never} \mathrel{<:} \neg \tau$ for every $\tau$ ; never is in every set's complement.
$\tau \mathrel{<:} \neg \sigma$ iff $\tau \sqcap \sigma \equiv \bot$ (i.e. $\tau$ and $\sigma$ are disjoint).
$\neg \sigma \mathrel{<:} \tau$ iff $\neg \tau \mathrel{<:} \sigma$ (Boolean duality, when both inner types are well-formed for the rule).
$\neg \neg \tau \equiv \tau$ ; double negation collapses.
$\neg \mathit{never} \equiv \top$ (vanilla mixed); $\neg \top \equiv \mathit{never}$.

Negation in narrowing

The most common use of negation is in narrowing. After !is_int($x), the analyser narrows $x by the type !int. The lattice's subtract operation handles this directly.

Intersection

PHP-side: Foo & Bar & Baz. Denotes the set intersection of all parts.

An intersection is represented as a head and a list of conjuncts. Foo & Bar & Baz has Foo as head and [Bar, Baz] as conjuncts. The head is canonicalised so that Foo & Bar and Bar & Foo are the same intersection.

One representation, always

Every PHP intersection is expressed as the same wrapper, regardless of the head's type. Foo & Bar, int & literal-int, array<int, V> & Countable, T & Foo (a generic parameter conjoined with a class) — they all use the wrapper. There is no in-payload conjunct list on individual types.

Subtyping

The standard rules from PL theory:

$\tau \mathrel{<:} (H \sqcap C_1 \sqcap \dots \sqcap C_n)$ iff $\tau \mathrel{<:} H$ and $\tau \mathrel{<:} C_i$ for every $i$ (intersection on the right is a conjunction of refinements).
$(H \sqcap C_1 \sqcap \dots \sqcap C_n) \mathrel{<:} \sigma$ iff $H \mathrel{<:} \sigma$ or some $C_i \mathrel{<:} \sigma$ (intersection on the left is a disjunction; some side must do the work).

The asymmetry — intersection on the left is a disjunction, intersection on the right is a conjunction — is the standard "Int-L / Int-R" rule.

Uninhabited intersections

An intersection can be uninhabited — Foo & !Foo is empty. The construction phase detects the trivial cases:

H & !H (a conjunct that negates the head) collapses to $\bot$.
A conjunct that is lattice-disjoint with the head also collapses.

The full uninhabited check is part of overlaps.

A worked example

The PHP type:

Stringable & Countable & !Iterator

is an intersection with Stringable as head, and [Countable, !Iterator] as conjuncts.

A class Foo refines this iff:

$\mathit{Foo} \mathrel{<:} \mathit{Stringable}$ (the analyser's codebase confirms Foo implements Stringable).
$\mathit{Foo} \mathrel{<:} \mathit{Countable}$ (similarly for Countable).
$\mathit{Foo} \mathrel{\#} \mathit{Iterator}$ (Foo does not implement Iterator).

The lattice asks the codebase for the first two and uses overlaps for the third.

See also: refines for the Int-L / Int-R rules; overlaps for the uninhabited-detection logic; meet for how intersections are constructed during the meet operation; subtract for how negations are introduced.

Unresolved elements

A handful of element kinds name a future resolution rather than carrying its result. They appear in the type universe because the analyser needs a way to express "the type that you'd get if you looked this up", and they get replaced by structural elements through the expansion operation when the lattice needs to reason about them.

If you call refines(τ, σ, ...) with an unresolved element on either side, the contract is that the analyser has already expanded it. The lattice itself does not invoke expansion; it would loop forever or worse if it did. The expand chapter has the rules for when and how the analyser does this.

Form	PHP-side	Denotes
Alias	`/** @type UserId = int */`	A user-introduced name for a type.
Reference	`T` (a referenced template before binding)	A name that refers to some declared template parameter or alias.
Member reference	`Foo::T`, `self::T`	A reference to a class-member type.
Global reference	`T` declared at the file level	A reference to a global type variable.
Conditional	`T extends U ? X : Y`	A conditional type, resolved by checking subject vs target.
Derived	`key-of<T>`, `value-of<T>`, `T[K]`, ...	A type derived from another by a transformation.
Variable	analyser-introduced	A placeholder for an inferred-but-not-yet-pinned type.

Aliases

A user-introduced name for a type, declared at the package or class level:

/** @type UserId = int */

Aliases are nominal: two distinct aliases with identical underlying types are not equal. UserId and int are not the same alias even when UserId is declared as int. Expansion turns them into the same structural type, but the alias preserves the name for diagnostics and refactoring.

References

A reference is a name that has not yet been bound to a specific declaration. Used during PHPDoc parsing and resolution, before the analyser has determined whether the name refers to a class, an alias, a template parameter, or something else.

Expansion resolves the name through the analyser's symbol table:

If the name is a class-like, becomes a named class.
If the name is an alias, becomes the alias's underlying type.
If the name is a template parameter, becomes a generic parameter.
If the name is unknown, expansion fails and the analyser surfaces a diagnostic.

References can carry their own type arguments: Foo<int> is one reference. Intersections like Foo<int> & Bar use the Intersected wrapper.

Member references

A reference to a type declared as a member of another type:

/** @return self::ItemType */

Expansion looks up the name on the resolved class via the analyser's codebase model, substitutes the class's template environment, and returns the resulting type.

Global references

A reference to a global type variable (rare, but used in some PHPDoc dialects):

/** @global TItem $item */

Expansion looks the name up in the analyser's global type variable table.

Conditional types

PHP-side: T extends U ? X : Y. Has four pieces: a subject, a target, a then branch, and an otherwise branch.

Expansion checks the subject <: target test under the current template environment. If true, the result is then; otherwise, the result is otherwise.

Conditional types are most useful inside generic declarations:

/**
 * @template T
 * @return ($t extends int ? string : bool)
 */
function f($t) { ... }

After substitution T := int(7), the conditional resolves to string.

Derived types

A family of derived-type forms, each describing a transformation:

key-of<T> — the keys of T if it's an array, list, or shape. key-of<array{a: int, b: string}> expands to 'a' | 'b'.
value-of<T> — the values of T. value-of<array{a: int, b: string}> expands to int | string.
T[K] — index access; the type of accessing key K on T. Used for shape lookups.
properties-of<T> — the properties of T as a structural shape.
int-mask-of<T> / int-mask<L> — bitmask types, for flags.
new<T> — the type of new T(...) ; useful when T is a class-string.
template-of-class — the type of template parameter P on class C as instantiated on object O. Resolved by specialise.

Expansion routes each variant to the matching transformation, which itself may produce more unresolved kinds (for example, key-of<T> where T is itself a reference).

Inference variables

The analyser's own placeholder for an inferred-but-not-yet-pinned type. Variables are introduced by the analyser during inference and resolved by the analyser when inference completes. The lattice does not see them in finalised types.

Why they exist in the universe

The alternative would be to keep unresolved forms as a separate AST and only build types for resolved cases. Suffete prefers a single universe for two reasons:

Uniform structure. Every type — resolved or not — is a finite union of elements. Walkers, transforms, and serialisation work on every type without a special case for "is this resolved?".
Lazy expansion. Resolving every reference at construction time would force the analyser to have a complete codebase model before constructing any type. The unresolved forms let the analyser construct types eagerly and resolve them on demand.

The trade-off is the rule above: lattice operations on unresolved forms are not directly defined. The analyser must expand first.

A worked example

/**
 * @template T
 * @param T $item
 * @return ($T extends list<infer U> ? U : T)
 */
function head($item) { ... }

The return type is a conditional with:

subject = T
target = list
then = U
otherwise = T

Calling head(['a', 'b']) gives the analyser an argument type list<string>. Inference (see standin) binds T := list<string>, and U := string is bound by the conditional's match.

Expansion then sees the subject <: target test holds and picks the then branch, yielding string.

Without expansion, the lattice cannot see any of this ; the conditional is opaque to it.

See also: expand for the operation that resolves these; generics for the template machinery.

Templates and generic parameters

A template parameter is a placeholder in a generic declaration: T on a class Box, K/V on a class Map, T on a function id. PHP has no native generics syntax; they are introduced by @template T (or @template-covariant, @template-contravariant) in a PHPDoc block.

This chapter covers what a free template parameter is in the type universe. The operations that act on it — substitution, inference, specialisation — are in Part IV.

What a template parameter carries

A template parameter has four pieces of information:

A name (T, K, V, ...) — the user-visible identifier.
A defining entity — the class-like, function, method, or closure that declared it. Two parameters with the same name on different entities are different parameters: T on class Box is distinct from T on class Foo.
An upper bound (the constraint) — PHP-side, the T extends Foo clause. Defaults to mixed.
An optional qualifier — used at the use site for specific forms like T::class.

The name plus the defining entity together identify the parameter. Substitution, inference, and specialisation all key on this pair.

Subtyping

A free template parameter behaves like its constraint for subtype questions:

$T \mathrel{<:} \tau$ iff $\mathit{constraint}(T) \mathrel{<:} \tau$.
$\tau \mathrel{<:} T$ iff $\tau \mathrel{<:} \mathit{constraint}(T)$ ; with caveats (the variance-aware rules in refines).

Two parameters that share the same (name, defining_entity) are the same thing. Two with different defining entities are distinct even when their names match.

How free parameters appear in types

Inside the body of a generic class, a free T shows up wherever it is used:

/**
 * @template T
 */
class Box {
    /** @var T */
    public mixed $value;

    /** @return T */
    public function get() { return $this->value; }

    /** @param T $value */
    public function set($value): void { $this->value = $value; }
}

The field $value, the return of get, and the parameter of set all reference the same T (same name, same defining entity).

When the user instantiates Box<int> (in a PHPDoc context such as @var Box<int>), the analyser substitutes every T for (name="T", defining_entity=Box) with int, producing the concrete versions of the field, getter, and setter types.

Free vs bound

A template parameter is free until it is substituted. After substitution, the parameter is gone — replaced with the concrete type.

A type that contains a free template is not "wrong" — it just has not been instantiated yet. The lattice still answers questions about it (using the constraint as the upper bound), but the answers are conservative: $T \mathrel{<:} \tau$ holds iff $\tau$ accepts every value the constraint admits.

Variance

The variance of a parameter is declared at the class level, not on the parameter itself. The analyser registers each class's parameter list with the codebase model; the lattice asks for the declared variance when it needs it during a refines query on instantiated classes.

Variance is one of:

Covariant ; subtype of T produces subtype of Box<T>.
Contravariant ; supertype of T produces subtype of Box<T>.
Invariant ; only the same T produces a refinement.

See the variance chapter for the full rules.

Default-filled parameters

When a class declares a T parameter and the user references the class as plain Box (rather than Box<int>), the analyser fills T with the constraint (mixed by default) and marks the resulting type as default-filled. The marker rides along with the type wherever it is later nested.

The variance check at refinement time consults the marker: a default-filled type-argument is allowed to flow either direction (subject to variance) without producing a strict refinement failure ; the lattice records the use of the default on its report so the analyser can warn about the unpinned position.

A worked example

/**
 * @template T
 */
class Box {
    /** @var T */
    public mixed $value;
}

/** @var Box<int> $ints */
$ints = new Box();

Inside the class body, the field $value has type T (free).

After the analyser sees the @var Box<int> annotation, it builds the instantiated Box. It walks the class's stored field types and substitutes each T with int, producing the concrete field type for this instance.

The T parameter itself is not modified ; substitution is a pure function returning new types.

See also: Substitute for the substitution operation; Variance for how variance is declared and used; Standin for the inference round that binds free parameters.

Subtyping: refines

The subtype relation $\tau \mathrel{<:} \sigma$ is suffete's most important operation. Read it as: every value of type $\tau$ is also a value of type $\sigma$.

Reading the rules

The lattice rules read as a dispatch tree: first, the universal axioms ; then, family-specific rules ; then, fall through to false. The dispatch happens at the Element level, not the Type level: the Type-level rule is just "every Element on the left must refine some Element on the right", with a few cross-cutting collapse rules.

Type-level rule

For Types $\tau, \sigma$:

$$\tau \mathrel{<:} \sigma \quad\iff\quad \forall e \in \tau,;; e \mathrel{<:} \sigma$$

(Read $e \mathrel{<:} \sigma$ as: there exists some Element in $\sigma$ that the Element $e$ refines, OR a fan-out coverage rule fires that lets a single Element be covered by a union of Elements on the right.)

The fan-out rules let the lattice handle cases like int<-∞, 0> refining int<-∞, -1> | int(0) ; no single Element on the right covers it, but the union does. The rules are:

Int union coverage — an integer range may be covered by a sorted union of integer ranges and literals.
String union coverage — a string literal is covered by an unspecified string with matching axes, or by another literal.
List union coverage — a sealed list is covered by a union of unsealed lists with element types covering the known positions.
Array union coverage — same for keyed arrays.
Intersected partition coverage — an intersection may be covered by a union if each part of the intersection is.
Negation partition coverage — a negation of a finite-cover universe (such as bool or an enum) may be covered by the union of its complement.

These coverage rules are why the rules are more elaborate than the textbook "for-each-on-the-left, exists-on-the-right" rule.

Universal axioms

Rule	Reading
Reflexivity	$a \mathrel{<:} a$ for every Element.
Bottom	$\bot \mathrel{<:} a$ for every Element. ($\bot$ has no values; the universal "every value of $\bot$ is a value of $a$" holds vacuously.)
Top	$a \mathrel{<:} \top$ for every Element. (Every value is in $\top$.)
Placeholder	placeholder refines and is refined by everything (inference convenience; should not appear in finalised types).

The order is significant: the universal axioms fire first, the family dispatch last.

Family dispatch

Dispatch is by the family of the right-hand side (the container) and then by the family of the left-hand side (the input). One rule set per family (objects, arrays, scalars, etc.) plus the wrappers and unresolved elements.

Container	Input	Rule
`int`	`int`	range / literal subtyping
`float`	`float`	float-family rules
`string`	`string`, class-like-string	string axis subset, literal flags
class-like-string	class-like-string	kind + specifier rules
`bool`, `true`, `false`	`bool`, `true`, `false`	trivial subtype rules
object family	object family	named, shape, has-method, etc.
`array`, `list`	`array`, `list`	sealed, optional, variance
`iterable`	`array`, `list`, named class	`Traversable` etc.
`callable`	`callable`	return covariant, params contravariant
`resource`	`resource`	state + kind axes
`scalar` / `numeric` / `array-key`	(decomposed)	true-union dominator covers
negation	(anything)	the negation rules below
intersection	(anything)	the Int-L / Int-R rules below
`mixed`	(anything)	narrowed-mixed axis rules
generic parameter	(anything)	constraint projection
alias / reference / ...	(anything)	unresolved — caller must expand first

The wrapper rules

Negation on the right

$a \mathrel{<:} \neg b$ iff $a \sqcap b \equiv \bot$ (i.e. $a$ is disjoint from $b$).

Example: int <: !string iff int is disjoint from string ; true.

Negation on the left

$\neg a \mathrel{<:} b$ iff $\neg b \mathrel{<:} a$ (Boolean duality), with caveats. When $a$ has a finite cover (e.g. bool is true | false), the lattice can directly check whether the complement is contained in $b$.

Intersection on the right (Int-R)

$\tau \mathrel{<:} (H \sqcap C_1 \sqcap \dots \sqcap C_n)$ iff $\tau \mathrel{<:} H$ and $\tau \mathrel{<:} C_i$ for every $i$.

Conjunction-of-refinements: the container demands $\tau$ refine every part.

Intersection on the left (Int-L)

$(H \sqcap C_1 \sqcap \dots \sqcap C_n) \mathrel{<:} \sigma$ iff $H \mathrel{<:} \sigma$ or some $C_i \mathrel{<:} \sigma$.

Disjunction-of-refinements: the intersection refines $\sigma$ when some part already does.

The asymmetry is critical: intersection on the left is easier to refine into something (any part can do the work); intersection on the right is harder to refine into (every part must accept).

The mixed-axes rule

If the container is mixed with axes $A$, the input refines it iff every axis in $A$ is implied by the input. The implications are listed in refinement axes.

Example: int <: non-null mixed ; the non-null axis is structurally guaranteed by int.

Generic-parameter projection

Free template parameters are treated as their constraint for the purpose of subtyping.

Left: $T \mathrel{<:} \sigma$ iff $\mathit{constraint}(T) \mathrel{<:} \sigma$.
Right: $\tau \mathrel{<:} T$ iff $\tau \mathrel{<:} \mathit{constraint}(T)$ (with extra checks; see variance).

The same parameter on both sides reflexively refines (rule 1, equality). Two parameters with different defining entities are different parameters; whether they refine each other depends on their constraints.

Coercion edges

PHP allows certain non-subtype coercions on parameter boundaries: int can flow into a float parameter, for example. The lattice has a single toggle to admit these on demand:

Edge	Source	Target
`int` → `float`	int parameter	float parameter
`numeric-string` → `int`	string parameter	int parameter (with cast)
`numeric-string` → `float`	string parameter	float parameter (with cast)

The default is to admit these (matches PHP's loose-types behaviour); analysers in strict-types mode disable them. When a coercion edge fires, it is recorded so the analyser can warn.

Disjointness

refines decides $\tau \mathrel{<:} \sigma$. For the symmetric question "do these two types share any value?", use overlaps.

Visualising the dispatch

flowchart TD
    Start[Start: refines a, b] --> Refl{a == b?}
    Refl -->|yes| True
    Refl -->|no| Top{is b == mixed?}
    Top -->|yes| True
    Top -->|no| Bot{is a == never?}
    Bot -->|yes| True
    Bot -->|no| Phold{placeholder on either side?}
    Phold -->|yes| True
    Phold -->|no| Wrap{a or b a wrapper?}
    Wrap -->|negation| NegRule[Apply negation rule]
    Wrap -->|intersection| IntRule[Apply Int-L / Int-R]
    Wrap -->|no| Generic{a is a generic parameter?}
    Generic -->|yes| Project[Recurse on constraint]
    Generic -->|no| Family[Family dispatch]
    NegRule --> End[Return]
    IntRule --> End
    Project --> End
    Family --> End
    True --> End

A worked example

/** Inputs */
$a = int|string;
$b = int|string|null;

$a <: $b ; every Element of int|string (namely int and string) refines some Element of int|string|null. Trivial.

$b <: $a is false ; the null on the left does not refine any Element on the right. No fan-out applies.

See also: overlaps for the symmetric "share a value?" question; meet for the operation that produces a type rather than a boolean; laws for the algebraic identities the implementation is required to satisfy.

Overlap and disjointness

The overlap relation $\tau \mathrel{\#} \sigma$ asks the symmetric question: do $\tau$ and $\sigma$ share at least one value?. Its negation is disjointness.

Overlap holds iff there exists some value $v$ such that $v \in \tau$ and $v \in \sigma$. Disjointness holds iff no such value exists.

Overlap is the test you want when neither side dominates the other. int<0,10> and int<5,15> overlap on the value 7, but neither is a subtype of the other. Refines does not see this; overlap does.

Why both refines and overlaps

Refines is one-directional: $\tau \mathrel{<:} \sigma$ tells you if every value of $\tau$ is also in $\sigma$. Overlap is symmetric: $\tau \mathrel{\#} \sigma$ tells you if any value is in both.

The two are independent:

$\tau \mathrel{<:} \sigma$ implies $\tau \mathrel{\#} \sigma$ when $\tau$ is inhabited.
$\tau \mathrel{\#} \sigma$ does not imply $\tau \mathrel{<:} \sigma$ or $\sigma \mathrel{<:} \tau$.

In the analyser, refines answers parameter-vs-argument questions; overlap answers questions like "could this instanceof Foo ever succeed?" (yes iff the variable's type overlaps with Foo).

The Type-level rule

For Types $\tau, \sigma$:

$$\tau \mathrel{\#} \sigma \quad\iff\quad \exists e_1 \in \tau, e_2 \in \sigma ;:; e_1 \mathrel{\#} e_2$$

Some pair of Elements, one from each side, must overlap. The Element-level rule does the work.

The Element-level dispatch

The Element-level rule fires its tests in this order:

never axiom — if either side is never, the answer is false. never has no values; the empty set overlaps nothing.
Uninhabited check — if either side is uninhabited (e.g. Foo & int, the empty sealed shape), the answer is false.
Reflexivity — identical inhabited Elements overlap.
Top — mixed overlaps every inhabited Element.
Placeholder — overlaps everything by inference convention.
Generic-parameter projection — a free template parameter overlaps $\sigma$ iff its constraint overlaps $\sigma$.
Subsumption shortcut — if $a \mathrel{<:} b$ or $b \mathrel{<:} a$, they overlap (one inhabits the other).
Family-specific overlap rules — last resort.

The order matters. The never axiom must fire before the reflexivity rule, otherwise never would (wrongly) be reported as overlapping itself.

The `never` axiom

never overlaps with nothing, including itself.

In set-theoretic terms: never is the empty set. The intersection of the empty set with anything is the empty set. So overlap with never is false, always.

Subsumption shortcut

If $a \mathrel{<:} b$ and $a$ is inhabited, then $a$ has some value, and that value is in $b$, so $a \mathrel{\#} b$. The lattice uses this as an early-exit: try refines(a, b) and refines(b, a), and if either holds, return true.

This makes the common cases (one side dominates the other) fast and avoids re-implementing the case analysis the family rules already do.

Family-specific positive rules

When neither subsumption nor the universal axioms fire, the lattice consults the family-specific rules. These add precision in cases where neither side dominates:

Int ranges: int<0,10> overlaps int<5,15> because both contain 5..=10.
String literals and refinements: "hello" overlaps non-empty-string (the literal satisfies the axis).
Class hierarchies: Foo overlaps Bar if the world says they share a descendant, or one is final and structurally satisfies the other.
Iterables and arrays: iterable<int, int> overlaps array<int, int> ; arrays are iterables.
Narrowed mixed: cross-axis overlap rules.

The family rules are incomplete by design: adding a positive rule never weakens correctness (the relation is monotone in true outcomes). Missing rules cost precision but not soundness ; a downstream narrow returns never instead of a real overlap, and the analyser is conservative rather than wrong.

Uninhabited Elements

Some Elements have no values:

never itself.
An intersection whose head and some conjunct are disjoint, or whose head is never, or where two conjuncts are disjoint (e.g. Foo & int).
A sealed object shape with no properties.

Note that array{} (the empty sealed array) is not uninhabited ; it has exactly one inhabitant, the empty array.

A worked example

Consider three integer ranges:

$r_1 = \texttt{int<0, 10>}$
$r_2 = \texttt{int<5, 15>}$
$r_3 = \texttt{int<20, 30>}$

Then:

$r_1 \mathrel{\#} r_2$ ; they share 5..=10.
$r_1$ does not overlap $r_3$ ; the ranges are disjoint.

Visualising overlap vs refines

graph TD
    Int[int]
    Range1["int&lt;0, 10&gt;"]
    Range2["int&lt;5, 15&gt;"]
    Lit5["int(5)"]
    Lit20["int(20)"]
    Range1 --> Int
    Range2 --> Int
    Lit5 --> Range1
    Lit5 --> Range2
    Lit20 --> Int

In this diagram:

int<0,10> overlaps int<5,15> ; they share int(5)..int(10). Neither refines the other.
int(5) refines int<0,10> ; subsumption. Also they overlap.
int<0,10> does not overlap int(20) ; disjoint.

A subtle case: empty types overlap with nothing

never does not overlap itself. There are no values in never, so the question "do they share a value?" is vacuously false. This is correct set-theoretically and is what the analyser expects ; an instanceof against an uninhabited type is unreachable, not always-true.

When overlaps and refines disagree

By construction, when both terms are inhabited:

$\tau \mathrel{<:} \sigma$ implies $\tau \mathrel{\#} \sigma$.
$\sigma \mathrel{<:} \tau$ implies $\tau \mathrel{\#} \sigma$.

But:

$\tau \mathrel{\#} \sigma$ does not imply either direction of refines.

If you find a case where refines holds and overlaps does not (with an inhabited subtype), that is a soundness violation. The laws chapter has the algebraic identity that catches this in CI.

See also: refines for the asymmetric subtype relation; meet for the operation that produces a type representing the overlap; laws for the soundness interlock between refines and overlap.

Greatest lower bound: meet

The meet $\tau \sqcap \sigma$ is the largest type that refines both $\tau$ and $\sigma$. PHP-side: the type-level intersection.

What "greatest lower bound" means

Among all types $\rho$ such that $\rho \mathrel{<:} \tau$ and $\rho \mathrel{<:} \sigma$:

$\rho = \mathit{never}$ always works (vacuously).
The lattice's job is to find the largest such $\rho$.

Operationally: meet is the type whose value-set is the intersection of $\tau$'s and $\sigma$'s value-sets, expressed as concretely as the kind system allows.

Examples:

$\tau$	$\sigma$	$\tau \sqcap \sigma$
`int`	`string`	`never`
`int\|string`	`int\|null`	`int`
`int<0,10>`	`int<5,15>`	`int<5,10>`
`mixed`	`int`	`int`
`Foo`	`Bar` (sibling classes)	`Foo & Bar`
`Foo`	`!Foo`	`never`
`non-null mixed`	`int\|null`	`int`
`array<int, mixed>`	`array<int, string>`	`array<int, string>`

Type-level rule

For every Element pair $(e_1, e_2)$ where $e_1 \in \tau$ and $e_2 \in \sigma$, compute the per-pair meet, and union the results. If every pair is disjoint, the result is never.

$$\tau \sqcap \sigma ;=; \bigsqcup_{e_1 \in \tau,; e_2 \in \sigma} (e_1 \sqcap e_2)$$

Element-level dispatch

For Elements $a, b$, the per-pair meet fires its tests in this order:

Uninhabited check — if either is uninhabited, the meet is never.
Reflexivity — $a \sqcap a = a$.
never axiom — $\bot \sqcap a = \bot$.
Top — $\top \sqcap a = a$.
Placeholder — placeholder $\sqcap a = a$ by inference convention.
PHP runtime separation — int and float are runtime-coercible but not subset-related; treat them as disjoint at the value level.
Wrappers — Negated and Intersected dispatch (below).
Narrowed mixed — axis-AND rules (below).
Subsumption — if one refines the other, return the smaller.
Generic-parameter projection — meet via constraint.
Family-specific intersection rules — last resort.

The wrapper rules

Meet involving negation

The standard set-theoretic identity:

$$a \sqcap \neg b ;=; a \setminus b$$

This routes to subtract. For int $\sqcap$ !int(0), the result is int<-∞, -1> | int<1, ∞>.

The dual case ($\neg a \sqcap b = b \setminus a$) is handled symmetrically.

The case $\neg a \sqcap \neg b$ uses De Morgan: $\neg(a \sqcup b)$. The result is a single negation wrapping the join.

Meet involving intersection

Intersection with anything adds a new conjunct: $(H \sqcap C_1 \sqcap \dots \sqcap C_n) \sqcap X = H \sqcap C_1 \sqcap \dots \sqcap C_n \sqcap X$. The universe canonicalises by sorting and deduplicating conjuncts.

Intersection meet intersection flattens both conjunct lists and re-intersects.

If the new conjunct is incompatible with the head or another conjunct, the meet collapses to never.

The narrowed-mixed meet

Meet of two narrowed mixed Elements ANDs their axes: non-null mixed $\sqcap$ truthy mixed = non-null truthy mixed. If the axes are inconsistent (e.g. truthy mixed $\sqcap$ falsy mixed), the result is never.

Meet of a mixed-with-axes and a non-mixed Element: the non-mixed Element is returned, but only if it satisfies the axes ; otherwise never. non-null mixed $\sqcap$ int = int. non-null mixed $\sqcap$ null = never.

Some narrowings decompose into multiple Elements. truthy mixed $\sqcap$ int produces int<-∞, -1> | int<1, ∞> (the 0 is excluded). The result is still a single type, just with two Elements.

Family-specific intersection rules

Each family has its own intersection logic:

Int / Int: range intersection. int<0,10> $\sqcap$ int<5,15> = int<5,10>. Empty intersection → never.
Float / Float: literal-vs-literal equality, otherwise never.
String / String: literal-vs-literal equality; unspecified-with-axes $\sqcap$ literal returns the literal if it satisfies the axes; axis-AND otherwise.
String / class-like-string: literal "Foo" $\sqcap$ class-string<Foo> returns the literal if it names the class; otherwise never.
Object / Object: same class → per-position parameter meet; different classes → an Foo & Bar intersection.
Object / enum: never (objects and enums are disjoint).
Object / object shape: an object & object{...} intersection, unless the world says the class doesn't have the shape's properties.
Object / has-method, has-property: similarly an intersection.
Array / Array: sealed-vs-sealed key-by-key; sealed-vs-generic; generic-vs-generic per-parameter.
List / List: same idea, by index.
Iterable / Array, List: per-parameter, then return the more specific kind.
Callable / Callable: signature meet (return type meet, parameter type joins for contravariance, throws meet).

Sealed shape meet

A specifically tricky case. Meet of two sealed shapes:

array{a: int, b?: string} ∩ array{a: int<0,100>, c: bool}

The result is a sealed shape with the union of keys, each value being the meet of the per-key types:

a: int $\sqcap$ int<0,100> = int<0,100> ; required (both required).
b: only on the left, optional on the left, missing on the right. A required key on the right would force absence on the left ; here b is optional on the left, so the result has b?: string (still optional).
c: only on the right, required. This is a problem — the right requires c but the left does not have c. The result is never because no value satisfies both.

The full sealed-shape-meet logic checks each key against the other side's known items and the sealed bit, and returns never if the result would be uninhabited.

A worked example

/**
 * @param int<0, 10> $x
 * @param int<5, 15> $y
 */

The meet of these two parameter types is int<5, 10>.

/**
 * @param int|string|null $a
 * @param int|null $b
 */

The meet is int|null ; the string Element on the left has no overlap with the right.

Visualising the result lattice

graph BT
    Bot["never (⊥)"]
    Int["int"]
    Str["string"]
    IS["int|string"]
    ISN["int|string|null"]
    Bot --> Int
    Bot --> Str
    Int --> IS
    Str --> IS
    IS --> ISN

    style Bot fill:#fdd,stroke:#a44
    style ISN fill:#dfd,stroke:#4a4

Meet moves down this lattice; the meet of two types is the highest node that sits below both.

Properties

The laws chapter checks all of these on every CI run:

Idempotence: $\tau \sqcap \tau \equiv \tau$.
Commutativity: $\tau \sqcap \sigma \equiv \sigma \sqcap \tau$.
Associativity: $(\tau \sqcap \sigma) \sqcap \rho \equiv \tau \sqcap (\sigma \sqcap \rho)$.
Identity: $\tau \sqcap \top \equiv \tau$.
Annihilator: $\tau \sqcap \bot \equiv \bot$.
GLB property: For all $\rho$, $\rho \mathrel{<:} \tau$ and $\rho \mathrel{<:} \sigma$ implies $\rho \mathrel{<:} (\tau \sqcap \sigma)$.

The GLB property is the soundness interlock with refines: meet must produce a lower bound, and any other lower bound must refine it.

See also: refines for the per-Element subtyping; join for the dual operation; narrow for the assertion-aware variant; subtract for the difference operation that meet uses internally for negations; laws for the algebraic checks.

Least upper bound: join

The join $\tau \sqcup \sigma$ is the smallest type that both $\tau$ and $\sigma$ refine. PHP-side: the union, with absorption and merging applied.

What "least upper bound" means

Among all types $\rho$ such that $\tau \mathrel{<:} \rho$ and $\sigma \mathrel{<:} \rho$:

$\rho = \mathit{mixed}$ always works.
The lattice's job is to find the smallest such $\rho$.

Operationally: join is the type whose value-set is the union of $\tau$'s and $\sigma$'s value-sets, expressed as concretely as the kind system allows after canonicalisation.

Examples:

$\tau$	$\sigma$	$\tau \sqcup \sigma$
`int`	`string`	`int\|string`
`int(0)`	`int(1)`	`int<0,1>`
`int<0,10>`	`int<5,15>`	`int<0,15>`
`true`	`false`	`bool`
`int(0) \| int(1) \| ... \| int(9)`	--	`int<0,9>` (range collapse threshold)
`Foo`	`Bar`	`Foo\|Bar` (no canonical class union)
`int`	`int\|string`	`int\|string` (subsumption)
`never`	`int`	`int`
`mixed`	`int`	`mixed`

How join is computed

Conceptually: gather every Element from both sides, then canonicalise.

The canonicalisation rules apply in order:

Top wins: if mixed is in the union, the result is mixed and everything else is dropped.
Bottom is absorbed: if never is in the union, drop it (unless it's the only thing).
Void canonicalises to null in non-degenerate unions: in any union of length > 1, void is replaced by null (and not duplicated if null is already present). PHP runtime: a void function returns the value null to its caller, so at every value-flow site void is observationally null. void alone is preserved so a : void return-type annotation round-trips. Examples: void | int collapses to int | null ; void | null collapses to null ; void | never collapses to void (the never is dropped first).
Bool composition: true | false = bool. bool | true = bool. bool | false = bool.
Resource composition: open-resource | closed-resource = resource (when no specific kind).
Same-kind dominator: same-kind Elements may collapse to a single range or merged unspecified form.
Range merging: adjacent or overlapping integer ranges merge.
Literal collapse: many literals of the same kind collapse to a range or unspecified Element when they exceed a threshold.
Family-level absorption: int | int<0,10> = int. string | non-empty-string = string. Subsumption applied symmetrically.
Subtype absorption: any Element subsumed by a structurally larger Element in the same multiset is dropped (Foo | Bar where Bar extends Foo collapses to Foo ; the rule consults the world).

Range merging

Joining adjacent or overlapping integer ranges merges them:

$\tau$	$\sigma$	$\tau \sqcup \sigma$
`int<0,5>`	`int<3,10>`	`int<0,10>` (overlap)
`int<0,5>`	`int<6,10>`	`int<0,10>` (adjacent)
`int<0,5>`	`int<10,15>`	`int<0,5> \| int<10,15>` (gap)
`int(7)`	`int<8,10>`	`int<7,10>`
`int<0,∞>`	`int<-∞,5>`	`int` (unspecified)

The rule applies symmetrically: literals join with ranges that contain or are adjacent to them; ranges that span the entire integer line collapse to int.

Literal collapse

A union of many literals of the same kind collapses. The thresholds are conservative: a small number of literals stays as a union (so the analyser can constant-fold), but a large union widens.

Int literals: ten literals of the same range form a range; many literals form unspecified int.
Float literals: similar, with a higher threshold (floats have less natural "tight" structure).
String literals: more than a small number of literals collapses to unspecified string with the join of their axes (e.g. all literals satisfy lowercase → result is lowercase-string).

The defaults err on the side of widening eagerly; that prevents the analyser from carrying around 1000-element unions that no human will ever look at.

Subtype absorption

If two Elements are in the union and one refines the other, the more specific one is dropped:

int | int<0,10>                    → int
Foo | Bar                          → Foo  (when Bar extends Foo)
literal "hello" | non-empty-string → non-empty-string

The absorption is not applied for coercion-driven refinements. int $\mathrel{<:}$ float via PHP runtime coercion does not drive absorption: int|float stays distinct so the analyser knows the value was originally typed as int.

Object union

Joining two named objects with no subtype relationship gives back the union:

Foo | Bar → Foo|Bar  (when neither extends the other)

If one extends the other, subtype absorption fires:

Foo | Bar → Foo  (when Bar extends Foo)

If both extend a common ancestor, the lattice does not automatically widen to the ancestor. Given class A {} class B extends A {} class C extends A {}, B | C → B|C, not A. This is conservative but correct: widening to the ancestor would lose the information that the value cannot be class D extends A.

The analyser can request widening explicitly via the inspection layer, which does walk up the hierarchy.

Sealed shape join

Joining two sealed shapes:

array{a: int} ⊔ array{b: string} → array{a?: int, b?: string}

Each key on the left becomes optional in the result (it might not be present on the right side); each key on the right becomes optional in the result (similarly). Common keys with different value types use the per-value join.

A sealed shape joined with a generic array decomposes the shape and joins:

array{a: int} ⊔ array<string, bool> → array<string|'a', int|bool>

A worked example

int | string joined with int is int | string ; the right side refines the int Element on the left, which absorbs it.

int<0, 5> joined with int<3, 10> is int<0, 10> ; the ranges overlap and merge.

true joined with false is bool ; the bool-composition rule fires.

Visualising the lattice

graph BT
    Bot["never (⊥)"]
    Int["int"]
    Str["string"]
    Lit3["int(3)"]
    Lit5["int(5)"]
    R35["int<3,5>"]
    IS["int|string"]
    Top["mixed (⊤)"]

    Bot --> Lit3
    Bot --> Lit5
    Lit3 --> R35
    Lit5 --> R35
    R35 --> Int
    Int --> IS
    Str --> IS
    IS --> Top

    style Bot fill:#fdd,stroke:#a44
    style Top fill:#dfd,stroke:#4a4

Join moves up this lattice; the join of two types is the lowest node that sits above both.

Properties

The laws chapter checks:

Idempotence: $\tau \sqcup \tau \equiv \tau$.
Commutativity: $\tau \sqcup \sigma \equiv \sigma \sqcup \tau$.
Associativity: $(\tau \sqcup \sigma) \sqcup \rho \equiv \tau \sqcup (\sigma \sqcup \rho)$.
Identity: $\tau \sqcup \bot \equiv \tau$.
Annihilator: $\tau \sqcup \top \equiv \top$.
LUB property: For all $\rho$, $\tau \mathrel{<:} \rho$ and $\sigma \mathrel{<:} \rho$ implies $(\tau \sqcup \sigma) \mathrel{<:} \rho$.
Absorption: $\tau \sqcap (\tau \sqcup \sigma) \equiv \tau$ and $\tau \sqcup (\tau \sqcap \sigma) \equiv \tau$.

The LUB property is the soundness interlock with refines: join must produce an upper bound, and any other upper bound must contain it.

See also: meet for the dual; refines for the per-Element subtyping that drives subsumption; laws for the algebraic checks.

Set difference: subtract

The subtract $\tau \setminus \sigma$ removes from $\tau$ the values that are in $\sigma$. PHP-side: the type after a negative narrowing — $x of type int|string after is_int($x) returns false becomes string.

What "set difference" means

Subtract is the type whose value-set is $\tau \setminus \sigma$, expressed as concretely as the kind system allows.

Examples:

$\tau$	$\sigma$	$\tau \setminus \sigma$
`int\|string`	`int`	`string`
`int\|string\|null`	`null`	`int\|string`
`int<0,10>`	`int<5,15>`	`int<0,4>`
`bool`	`true`	`false`
`mixed`	`null`	`non-null mixed`
`Foo\|Bar`	`Foo`	`Bar`
`non-empty-list<int>`	`list{}`	`non-empty-list<int>` (no overlap to remove)
`int`	`int`	`never`

How subtract is computed

For each Element on the left, attempt to subtract every Element on the right; collect the results.

Per-pair subtract may split a single Element into multiple pieces:

int<0,10> \ int(5) produces int<0,4> | int<6,10> (range punching).
bool \ true produces false.
mixed \ null produces non-null mixed.

Element-level dispatch

For a per-pair subtract $a \setminus b$:

Reflexivity — $a \setminus a = \emptyset$ (the input is fully removed).
never on the right — $a \setminus \bot = a$ (nothing to remove).
never on the left — $\bot \setminus b = \bot$.
No overlap — if $a$ and $b$ are disjoint, $a \setminus b = a$.
Subsumption — if $a \mathrel{<:} b$, then $a \setminus b = \emptyset$ (everything in $a$ is removed).
Family-specific subtract rules — last resort.

The early exit on disjoint pairs is essential for performance: most subtract calls in an analyser are negative narrowings on small types where most pairs are disjoint.

True-union dominator subtract

The scalar, numeric, and array-key Elements are true unions (disjoint covers). When the right-hand side lands inside one of their members, the left-hand side decomposes:

scalar      = bool | int | float | string
numeric     = int | float | numeric-string
array-key   = int | string

scalar \ int       → bool | float | string
numeric \ float    → int | numeric-string
array-key \ string → int

Family-specific subtract rules

Int

int<0,10> \ int(5) = int<0,4> | int<6,10> ; range punching.
int<0,10> \ int<5,15> = int<0,4> ; the overlap is removed.
int \ literal-int = int (the overlap is one literal among infinitely many; the unspecified set does not measurably shrink).
int \ int = never.

Float

float \ float-literal = float (no special punching for floats; one literal does not measurably change the unspecified set).
float-literal-x \ float-literal-x = never.

String

string \ literal "foo" = string (one literal does not change the unspecified set).
non-empty-string \ "" = non-empty-string (no overlap; no change).
string \ non-empty-string = literal "" (only the empty string remains).
string \ string = never.

Bool, true, false

bool \ true = false.
bool \ false = true.
bool \ bool = never.
true \ false = true (no overlap).

Object

Foo \ Foo = never.
Foo \ Bar = Foo (no overlap if neither extends the other; world-aware).
Foo \ Bar = never if Foo extends Bar (every Foo is a Bar; remove all Foo).
Foo \ Bar = Foo if Bar extends Foo (some Foo is a Bar, but not all; we cannot easily express the difference, so conservatively keep all of Foo).

List, Array

Sealed list / array subtract is shape-by-shape; rarely splits, mostly all-or-nothing.
Generic list / array subtract is per-parameter.
A sealed list minus the empty list returns the original (no overlap; sealed non-empty has no element to remove).

Mixed

mixed \ null = non-null mixed ; the canonical case for nullable narrowing.
mixed \ false = non-falsy mixed (truthy mixed | null).
mixed \ truthy = falsy mixed.

Why subtract may produce many Elements

A single Element minus another may produce zero, one, or many Elements:

Zero: full removal (e.g. int(5) \ int(5)).
One: most cases (the input is unchanged or replaced with a slightly narrower form).
Many: range punching (e.g. int<0,10> \ int(5)), or other splitting.

A worked example

int|string minus int is string ; the Element-by-Element dispatch removes the int Element entirely and leaves string untouched (disjoint from int).

int<0, 10> minus int(5) is int<0, 4> | int<6, 10> ; range punching produces a two-Element result.

bool minus true is false ; the bool family has an exact rule.

When subtract is conservative

PHP types are infinite in some axes (literal strings, literal ints) and finite in others (booleans, enum cases). Subtract is exact on the finite cases (because the kind system can express the complement) and conservative on the infinite cases:

string \ "foo" = string (we can't express "every string except 'foo'" precisely, so we keep string and lose nothing soundness-wise).
int<0,10> \ "foo" = int<0,10> (no overlap; nothing to remove).

Properties

The laws chapter checks:

Bound: $(\tau \setminus \sigma) \mathrel{<:} \tau$ for every $\tau, \sigma$.
Disjoint after: $(\tau \setminus \sigma) \sqcap \sigma \equiv \bot$ when the subtract is exact (and is "best-effort disjoint" when it is conservative).
Identity: $\tau \setminus \bot \equiv \tau$.
Annihilator: $\bot \setminus \sigma \equiv \bot$.
Self: $\tau \setminus \tau \equiv \bot$ (when the subtract can express the full removal).

The Bound property is the most important: subtract must never produce a larger type than the input. If subtract returns $T$ such that $T$ is not a subtype of $\tau$, that is a soundness violation.

See also: meet for the operation that uses subtract internally for negation; narrow for the assertion-aware operation that often boils down to subtract; laws for the algebraic checks.

Narrowing under assertions

The narrow operation $\mathit{narrow}(\tau, \pi)$ refines a type $\tau$ by an assertion $\pi$. PHP-side: the type after a control-flow assertion fires — the type of $x after if ($x instanceof Foo), after if (is_int($x)), after if ($x !== null), and so on.

What "narrow" means

The analyser observes some assertion that holds about a value of type $\tau$. It encodes the assertion as a type $\pi$ — the set of values that satisfy the assertion. Narrow returns the type whose values are exactly the values of $\tau$ that satisfy $\pi$.

For most assertions, narrow is just meet ($\tau \sqcap \pi$). The cases where narrow does extra work:

Negative assertions that can be expressed as a positive assertion are recognised and routed to subtract instead of meet ($\tau$ is int|string and the assertion is "not int" → subtract int from $\tau$).
Mixed-axis assertions (truthiness, non-null, non-empty) propagate the axis even when the input is not mixed, narrowing kinds on the way (mixed after truthy becomes truthy mixed ; int|null after non-null becomes int).
Unsound input cases are detected and the lattice avoids producing nonsense (narrowing Foo by int is never, not Foo & int).

Operation summary

The relationship to the other operations:

Assertion shape	Operation
Positive: "value satisfies $\pi$"	$\tau \sqcap \pi$ (meet)
Negative: "value does not satisfy $\pi$"	$\tau \setminus \pi$ (subtract)
Mixed-axis (truthy, non-null, etc.)	meet with axis-propagation rules
Class assertion (`instanceof Foo`)	meet, with class-hierarchy rules
Equality assertion (`$x === 5`)	meet with the singleton type

In the analyser layer, the assertion-extraction code chooses the right form. narrow is the operation that runs after the choice is made.

Worked examples

Nullable narrowing

function f(?int $x): int {
    if ($x !== null) {
        return $x;  // here $x has type int (not int|null)
    }
    return 0;
}

The assertion is non-null mixed. Meet of int|null with non-null mixed — int satisfies non-null (kept), null does not (dropped). Result: int.

Boolean narrowing

function f(bool $x): true {
    if ($x === true) {
        return $x;  // $x has type true
    }
    throw new RuntimeException();
}

The assertion is true. Meet of bool with true is true (the literal-true Element).

Class narrowing

function f(object $x): Foo {
    if ($x instanceof Foo) {
        return $x;  // narrows to Foo
    }
    throw new RuntimeException();
}

The assertion is Foo. Meet of object (the unconstrained object) with Foo is Foo.

Class anti-narrowing

function f(Foo|Bar $x): Bar {
    if (!($x instanceof Foo)) {
        return $x;  // subtracts Foo, leaves Bar
    }
    throw new RuntimeException();
}

The negative assertion subtracts Foo from Foo|Bar, leaving Bar.

Truthiness narrowing

/**
 * @return non-empty-string
 */
function f(int|string $x): string {
    if (is_string($x) && strlen($x) > 0) {
        return $x;
    }
    throw new RuntimeException();
}

Two narrowings: first is_string narrows to string; then strlen > 0 narrows to non-empty-string.

The is_string assertion is the type string; meet of int|string with string is string. The strlen > 0 assertion translates to a mixed-axis assertion that the value is non-empty-string; meet with string produces non-empty-string.

Negative assertion handling

The lattice recognises a few negative-assertion patterns and routes them to subtract:

$\mathit{narrow}(\tau, \neg \sigma)$ → $\tau \setminus \sigma$.
$\mathit{narrow}(\tau, \sigma)$ where $\sigma$ is non-null mixed → meet with the axis (no different from positive narrowing).
$\mathit{narrow}(\tau, \sigma)$ where $\sigma$ is the empty type → never (a no-overlap assertion narrows to bottom).

The analyser layer typically constructs the assertion type with the negation already in place, and narrow does the right thing.

Mixed-axis propagation

When the assertion is a narrowed mixed, the lattice propagates the axes through the input's kinds:

non-null axis: drop any input Element that is null or void. Keep everything else.
truthy axis: keep input Elements that structurally guarantee truthy (objects, resources, non-zero literals, non-empty truthy strings, non-empty arrays/lists). For ambiguous Elements (general int, general string), narrow them: int → int<-∞,-1> | int<1,∞>, string → non-empty truthy string.
falsy axis: keep null, void, false, int(0), float(0.0), "", "0", empty arrays/lists. For ambiguous Elements, drop or narrow.
empty: same as falsy.
isset-from-loop: only propagates between mixed Elements; non-mixed inputs lose this axis.

Decomposition of narrowed mixed

A mixed-axis narrowing may produce multiple Elements:

int $\sqcap$ truthy mixed → int<-∞,-1> | int<1,∞> (the 0 is excluded).
string $\sqcap$ falsy mixed → "" | "0" (only the falsy strings remain).

The result is still one type, just with multiple Elements.

A worked example: chained narrowings

function classify(mixed $x): string {
    if (is_int($x)) { return "int"; }
    if (is_string($x) && $x !== "") { return "non-empty string"; }
    if (is_array($x) && count($x) > 0) { return "non-empty array"; }
    return "other";
}

The analyser narrows $x differently in each branch:

After is_int: int.
After is_string && != "": non-empty-string.
After is_array && count > 0: non-empty-array<array-key, mixed>.
The "else" branch sees the negation of each prior assertion: mixed \ int \ non-empty-string \ non-empty-array<array-key, mixed>.

Each narrowing is a narrow call in the analyser; the result feeds the next.

Visualising narrow

flowchart LR
    Input["Input τ"] --> Op[narrow]
    Assertion["Assertion π"] --> Op
    Op --> Result["τ ∩ π<br/>(or τ \ π for negative<br/>assertions)"]

Properties

The laws chapter checks:

Bound: $\mathit{narrow}(\tau, \pi) \mathrel{<:} \tau$.
Reflexivity: $\mathit{narrow}(\tau, \tau) \equiv \tau$.
Annihilator: $\mathit{narrow}(\tau, \bot) \equiv \bot$.
Identity (top assertion): $\mathit{narrow}(\tau, \top) \equiv \tau$.
Idempotence: $\mathit{narrow}(\mathit{narrow}(\tau, \pi), \pi) \equiv \mathit{narrow}(\tau, \pi)$.
Soundness vs meet: $\mathit{narrow}(\tau, \pi) \mathrel{<:} (\tau \sqcap \pi)$ when $\pi$ is the positive form ; the narrow result is at most as large as the meet result, sometimes smaller because the axis-propagation rules tighten further.

See also: meet for the operation narrow specialises; subtract for the negative-assertion path; refines and overlaps for the underlying relations; laws for the soundness checks.

Soundness: the algebraic laws

The lattice operations form an algebra. There are identities every implementation of meet, join, subtract, narrow, refines, and overlaps must satisfy. Suffete checks them on every CI run, on a property-test battery generating thousands of distinct type triples per case.

This chapter lists the laws, explains what each one rules out, and shows the test infrastructure that runs them.

Why laws matter

A lattice implementation that fails one of these is not just imprecise — it is wrong. Downstream analysers will report bugs that aren't there, miss bugs that are, or flip-flop on the same code depending on which order the analyser asks questions. Worse, the bugs are not local: a bug in meet shows up as wrong output in unrelated parts of the analyser that happened to call meet.

Suffete catches them by treating the laws as proofs that must hold on every input. The property tests generate random types and check the laws on them; a single failure stops the build.

The laws

Idempotence

Operation	Law
meet	$\tau \sqcap \tau \equiv \tau$
join	$\tau \sqcup \tau \equiv \tau$
narrow	$\mathit{narrow}(\tau, \tau) \equiv \tau$

A type combined with itself is itself. Failure means: meet/join/narrow is producing a type that is structurally different from the input despite the inputs being equal.

Commutativity

Operation	Law
meet	$\tau \sqcap \sigma \equiv \sigma \sqcap \tau$
join	$\tau \sqcup \sigma \equiv \sigma \sqcup \tau$
overlaps	$\mathit{overlaps}(\tau, \sigma) = \mathit{overlaps}(\sigma, \tau)$

The order of arguments to a symmetric operation doesn't matter. Failure means: the operation is privileging the left or right argument.

Associativity

Operation	Law
meet	$(\tau \sqcap \sigma) \sqcap \rho \equiv \tau \sqcap (\sigma \sqcap \rho)$
join	$(\tau \sqcup \sigma) \sqcup \rho \equiv \tau \sqcup (\sigma \sqcup \rho)$

Grouping doesn't matter. Failure means: the canonicalisation step is missing some pair.

Identity

Operation	Law
meet	$\tau \sqcap \top \equiv \tau$
join	$\tau \sqcup \bot \equiv \tau$
subtract	$\tau \setminus \bot \equiv \tau$
narrow	$\mathit{narrow}(\tau, \top) \equiv \tau$

The top is the meet identity, the bottom is the join identity. Subtracting nothing leaves the input alone. Narrowing by no assertion leaves the input alone.

Annihilator

Operation	Law
meet	$\tau \sqcap \bot \equiv \bot$
join	$\tau \sqcup \top \equiv \top$
subtract	$\bot \setminus \sigma \equiv \bot$
narrow	$\mathit{narrow}(\tau, \bot) \equiv \bot$
narrow	$\mathit{narrow}(\bot, \pi) \equiv \bot$

Bottom annihilates meet; top annihilates join. Subtracting from bottom leaves bottom. Narrowing by an impossible assertion (or narrowing the empty type) yields bottom.

Absorption

Operation pair	Law
meet/join	$\tau \sqcap (\tau \sqcup \sigma) \equiv \tau$
join/meet	$\tau \sqcup (\tau \sqcap \sigma) \equiv \tau$

The "lattice absorption" identities. These tie meet and join together: a value that is a meet of $\tau$ and a join with $\tau$ is just $\tau$. Failure means: the operations are defined inconsistently with each other.

Bounds (the GLB and LUB properties)

Operation	Law
meet (GLB)	For every $\rho$ such that $\rho \mathrel{<:} \tau$ and $\rho \mathrel{<:} \sigma$, $\rho \mathrel{<:} (\tau \sqcap \sigma)$.
join (LUB)	For every $\rho$ such that $\tau \mathrel{<:} \rho$ and $\sigma \mathrel{<:} \rho$, $(\tau \sqcup \sigma) \mathrel{<:} \rho$.

Meet is the greatest lower bound — any other lower bound refines it. Join is the least upper bound — it refines any other upper bound. These are the soundness interlocks between meet/join and refines.

Failure of the GLB property means: the lattice is computing a meet that is not actually the greatest lower bound — it's leaving a smaller type that still bounds both inputs from below. Imprecise but sound.

Failure in the other direction — meet returning something that is not a lower bound — would be a soundness violation. The bound checks both ways.

Subsumption interlock

Law
$\tau \mathrel{<:} \sigma \iff \tau \sqcap \sigma \equiv \tau$
$\tau \mathrel{<:} \sigma \iff \tau \sqcup \sigma \equiv \sigma$

Subtype is equivalent to "meet is unchanged" or "join is unchanged". A failure means: refines is disagreeing with what meet/join compute, which is a global inconsistency.

Subtract bound

Law
$(\tau \setminus \sigma) \mathrel{<:} \tau$

Subtract never produces a larger type than the input. Failure is a soundness violation — the analyser would think a value has more refinement than it does.

Subtract disjoint after

Law (when subtract is exact)
$(\tau \setminus \sigma) \sqcap \sigma \equiv \bot$

After removing $\sigma$ from $\tau$, what remains is disjoint from $\sigma$. This holds when subtract is exact (the kind system can express the complement); when subtract is conservative, the law holds in the weaker form "approximately disjoint" and is checked with a is_uninhabited or overlaps == false proxy.

Refines/overlaps interlock

Law
$\tau \mathrel{<:} \sigma$ and $\tau$ inhabited implies $\mathit{overlaps}(\tau, \sigma)$

Subtype implies overlap (assuming the subtype is non-empty). Failure means: refines is saying yes while overlaps is saying no, which is a contradiction.

Narrow vs meet

Law (positive assertions)
$\mathit{narrow}(\tau, \pi) \mathrel{<:} (\tau \sqcap \pi)$

Narrow's positive form is at least as tight as meet. Often equal; sometimes strictly tighter (the axis-propagation rules of narrow do work meet doesn't).

How they're checked

A property-test battery generates random worlds (with random class hierarchies and template parameters) and random types exercising every Element kind family, then asserts every law on every triple. Failures shrink to a minimal counter-example and are recorded so future CI runs re-check the same input.

Two batteries cover the law set:

The pair battery runs every pair-shaped law on $(\tau, \sigma)$. Idempotence, commutativity, identity, annihilator, GLB/LUB, subsumption interlock, subtract bound, refines/overlaps interlock.
The triple battery runs the triple-shaped laws on $(\tau, \sigma, \rho)$. Associativity (meet, join), absorption.

Each property-test case calls both batteries on the generated triples.

Why laws beat tests

Hand-written tests check what the author thought to test. The author writes "meet of int and string is never" because that's the obvious case. They miss the edge cases — the nested intersection with a negated head, the sealed shape with an optional uninhabited entry, the narrowed-mixed with the truthy axis interacting with a literal float.

Property tests check what the laws require, and the random generator finds the cases the author didn't think of. A single CI run on a thousand random triples covers more ground than a year of manually-written test cases.

The trade-off is that property tests find counter-examples but don't tell you why. The investigation work is on the human. The shrunk inputs and an instrumented path through the lattice (showing which rule fired on which Element pair) keep that investigation short.

See also: refines, overlaps, meet, join, subtract, narrow — the operations the laws are about.

Template parameters in depth

The universe chapter on templates covered the template-parameter Element kind. This chapter covers what happens with template parameters: how a class's parameters are declared, what the analyser must register about them, and how the lattice consults that information.

Declaration

A class with template parameters is declared in PHP via PHPDoc:

/**
 * @template T
 * @template-covariant V
 * @template K of array-key
 * @template U of Iterator = ArrayIterator
 */
class Box {
    // ...
}

The analyser parses each @template line and registers, for each parameter:

A name (T, V, K, U).
A defining entity (the class Box).
An upper bound (the of X clause; defaults to mixed).
A variance (covariant, contravariant, invariant; defaults to invariant unless declared otherwise).
An optional default (the = X clause; used when the user supplies fewer arguments than declared).

Suffete itself does not store this information. The analyser registers it with its world implementation, and the lattice queries the world when it needs a parameter's variance, upper bound, or default.

What the world supplies

The world tells the lattice three things about a class's template parameters:

The arity (how many @template lines the class declares).
For each position, the parameter's variance, upper bound, and default.
For each (descendant, ancestor, position), the type the descendant supplies to the ancestor's parameter at that position.

The third one is what makes inheritance work: see specialise.

Instantiation

A use-site instantiation Box<int, string> is a named-object Element that carries its type arguments in declaration order. The first argument fills T, the second fills V. If the class declares more parameters than supplied, the missing ones are filled from the upper bound (or the declared default) and the type is flagged as having received a template default ; the lattice tolerates the default at variance check time.

Inheritance and parameter mapping

When class Bag<X> extends Box<X, int>:

Bag's X corresponds to Box's T.
Box's V is bound to int from Bag's perspective.

The lattice uses this when checking Bag<string> refines Box<string, int>: it asks the world what Bag supplies to Box's parameters, substitutes Bag's actual arguments through, and compares positionally with Box's declared variance.

The full algorithm is in specialise.

Defining entities

Every template parameter is keyed by (name, defining_entity). Two parameters with the same name on the same class are the same parameter ; two with the same name on different classes are different parameters. The defining entity can be:

A class-like (the parameter is declared on a class, interface, trait, or enum).
A function or method.
A closure (analyser-assigned identity).

Capture-free substitution uses the defining entity to know which parameters a substitution applies to.

Free vs bound vs partially-applied

A template-parameter Element is free until the analyser substitutes it. Three states:

Free. The parameter appears in the type with no commitment to a value. Box<T>::value is T (free). The lattice can answer questions about T using its constraint as an upper bound.
Bound. The parameter has been substituted. After Box<int>::value, T := int, the field type is int, no T Element remains.
Partially applied / default-filled. The user wrote Box instead of Box<int>. The analyser fills T with the upper bound (mixed by default), flags the type as carrying a template default, and the lattice tolerates the default at variance check time (recording a coercion cause).

A worked example

/**
 * @template T
 * @template-covariant V
 */
class Map {
    /** @var array<T, V> */
    public array $entries = [];

    /**
     * @param T $key
     * @return V
     */
    public function get($key) { return $this->entries[$key]; }
}

/**
 * @extends Map<string, mixed>
 */
class StringMap extends Map {
    /** @var array<string, mixed> */
    public array $entries = [];
}

Inside Map's body, the field $entries has type array<T, V>, where T and V are template-parameter Elements with defining entity Map.

Inside StringMap's body, the field $entries has type array<string, mixed> ; fully concrete because StringMap extends Map<string, mixed>.

When the analyser checks StringMap refines Map<string, mixed>:

The lattice gets the container's parameters: Map's T (invariant by default) and V (covariant, declared with @template-covariant).
It asks the world what StringMap supplies to Map's position 0 ; the answer is string.
It asks the same for position 1 ; the answer is mixed.
Compare position 0 with invariant: string is equivalent to string. ✓
Compare position 1 with covariant: mixed refines mixed. ✓
Result: StringMap refines Map<string, mixed>. ✓

The variance, the inheritance binding, and the per-position check are all driven by the world. Suffete itself orchestrates the dispatch.

See also: variance for the per-variance refinement rules; substitute for how T is replaced with a concrete type; standin for inferring T from call-site arguments; specialise for the inheritance-binding resolution.

Variance

The variance of a template parameter governs whether a subtype substitution at the parameter position produces a subtype of the parameterised type.

Variance	Rule	Source syntax
Covariant	$\sigma \mathrel{<:} \sigma'$ implies $\mathrm{C}\langle\sigma\rangle \mathrel{<:} \mathrm{C}\langle\sigma'\rangle$	`@template-covariant T`
Contravariant	$\sigma \mathrel{<:} \sigma'$ implies $\mathrm{C}\langle\sigma'\rangle \mathrel{<:} \mathrm{C}\langle\sigma\rangle$	`@template-contravariant T`
Invariant	only $\sigma \equiv \sigma'$ produces $\mathrm{C}\langle\sigma\rangle \equiv \mathrm{C}\langle\sigma'\rangle$	`@template T` (default)

Why variance matters

PHP's type system without variance is rigid: Box<int> $\mathrel{<:}$ Box<mixed> only when the analyser hand-codes the rule. With variance declarations, the lattice answers correctly and uniformly:

Read-only containers (e.g. Iterator<T>) are naturally covariant: an Iterator<int> is an Iterator<mixed> (you can read ints out of it as mixeds). PHP-side: @template-covariant T.
Write-only containers (e.g. a Sink<T> accepting T values) are naturally contravariant: a Sink<mixed> accepts an int (because it accepts everything). So Sink<mixed> $\mathrel{<:}$ Sink<int>. PHP-side: @template-contravariant T.
Read-and-write containers (e.g. Box<T> with both get(): T and set(T): void) are invariant: changing T would break either the getter or the setter. PHP-side: @template T (default).

How the lattice uses variance

When the lattice decides $\mathrm{C}\langle a_1, \dots, a_n\rangle \mathrel{<:} \mathrm{C}\langle b_1, \dots, b_n\rangle$ (same class on both sides), it asks the world for each parameter's variance, and compares positionally:

Covariant: $a_i \mathrel{<:} b_i$.
Contravariant: $b_i \mathrel{<:} a_i$.
Invariant: $a_i \mathrel{<:} b_i$ AND $b_i \mathrel{<:} a_i$.

If every position passes, the whole refines query passes.

Variance for inherited classes

When the input is a descendant of the container's class, the lattice asks the world to resolve what the descendant supplies to each ancestor parameter. The variance check then runs on the resolved arguments.

For example, with class StringList extends ArrayList<string> and class ArrayList<T> implements Iterator<T>:

StringList <: Iterator<string>
  ↓
StringList → ArrayList<string>     (StringList extends ArrayList<string>)
  ↓
ArrayList<string> → Iterator<string>  (ArrayList implements Iterator<T>, T := string)
  ↓
Iterator<string> <: Iterator<string>  ✓ (covariant or invariant, both pass)

The lattice's object family handles this chain.

Variance and substitution

Variance does not affect substitution itself ; substitution always replaces every template-parameter Element matching the substitution key, regardless of variance. Variance affects the use site: when the analyser asks "is this instance a subtype of that?", the variance is consulted to decide which direction of refines to check.

Default-filled type-args and variance

When a class declares a parameter and the user supplies fewer arguments than declared, the missing args are filled with their declared upper bounds (or mixed if undeclared). The resulting type is flagged as carrying a template default.

The variance check at refinement time consults the flag:

A default-filled type-arg flowing into a covariant or invariant position is allowed without strict refinement; the lattice records the coercion cause so the analyser can warn.
A default-filled type-arg flowing into a contravariant position is checked normally.

This rule prevents Box (the parameter-less reference) from being non-refining in Box<int> callsites, while still letting the analyser surface the use of an unpinned parameter in its diagnostics.

A worked example

/**
 * @template-covariant T
 */
interface Iterator {
    /** @return T */
    public function current(): mixed;
}

/**
 * @implements Iterator<int>
 */
class IntIterator implements Iterator { /* ... */ }

function f(Iterator $it): void { /* ... */ }

f(new IntIterator());

Checking IntIterator refines Iterator<numeric>:

IntIterator is not the same class as Iterator, so the lattice walks the inheritance.
IntIterator implements Iterator<int> ; the world resolves position 0 of Iterator from IntIterator to int.
The container's Iterator declares position 0 as covariant.
Covariant: check int $\mathrel{<:}$ numeric. ✓ (int is in the numeric true-union).
Result: IntIterator refines Iterator<numeric>. ✓

If Iterator's T were invariant, the answer would be different: int $\equiv$ numeric is false (numeric admits float and numeric-string, int does not), so the call would fail.

Bivariant / unsafe-covariant

PHP-side, neither is supported. Suffete does not have a fourth variance kind; the standard three are sufficient for what real-world PHP analysers express.

Multiple parameters

When a class has multiple parameters, each has its own variance. Map<K, V> typically declares K as invariant (because keys are used both to read and write) and V as covariant (because the user typically reads values out). The variance check runs per-position with the per-position variance.

flowchart LR
    Container["Map&lt;K, V&gt;<br/>K invariant, V covariant"] --> Pos0[K: invariant check]
    Container --> Pos1[V: covariant check]
    Pos0 --> Result[All positions pass → refines]
    Pos1 --> Result

See also: refines for the object-family rules that consult variance; specialise for inheritance-binding resolution.

Substitution

Substitution replaces a template parameter with a concrete type. PHP-side: when the user writes Box<int> and the analyser instantiates Box's body, every T becomes int.

Capture-free

Substitution is capture-free: it walks the type tree and applies the substitution only to free parameters that match the substitution's keys, identified by (name, defining_entity). A parameter declared on a different entity but sharing the same name is not substituted.

In practice:

A template parameter with defining entity Box and a substitution targeting Box::T is substituted.
A template parameter with defining entity Foo is not substituted by the same substitution, even if it is also called T.

This keeps substitution semantically clean even when types nest other generic uses.

How substitution walks the tree

Substitution is a structural transform. It recurses into every nested type carrier:

An object's type arguments.
A list's element type and known elements.
A keyed array's key parameter, value parameter, and known items.
An iterable's key and value types.
An object shape's known properties.
A callable's parameter types, return type, and throws.
A class-like-string's constraint.
A template parameter's constraint (the parameter Element itself is the substitution target).
An alias or reference's type arguments.
A conditional's subject, target, then, and otherwise branches.
A derived type's nested types.
A negation's inner.
An intersection's head and conjuncts.

At each leaf (every template-parameter Element), the substitution table is consulted. If the table has a binding, the type is substituted in place; otherwise the parameter is kept.

Substitution and unions

A substitution may replace one parameter with a union type. The walker handles this correctly: substituting T := int|string into the type T|null produces int|string|null, with the substitution flat-merged into the parent union (the lattice's join is run to collapse).

Substitution into nested generic uses

When the type contains other generic uses (e.g. class-string<T> or Box<T>), substitution walks into them and replaces the substituted parameter wherever it appears. So substituting T := int into Box<T> | class-string<T> | T produces Box<int> | class-string<int> | int.

Identity short-circuit

If the substitution produces no change (no bound parameter is found in the input), substitution returns the original type ; no re-interning, no allocation. This is the common case in the analyser when a callsite happens not to bind any parameter the substitution targets.

The walker maintains this guarantee even through deep recursion: as long as no leaf changes, the parents propagate the original handles up.

Multi-step substitution

A single substitution call can carry bindings for several parameters at once, keyed by (name, defining_entity). The walker visits each parameter once and looks up its binding in the table.

A worked example

/**
 * @template T
 * @template V
 */
class Map {
    /** @var array<T, V> */
    public array $entries = [];

    /**
     * @param T $key
     * @return V
     */
    public function get($key) { /* ... */ }
}

When the analyser instantiates Map<string, int> and asks for the concrete type of $entries:

The declared field type is array<T, V>.
The analyser builds a substitution: Map::T := string, Map::V := int.
Substitution walks array<T, V>, finds the two parameter Elements, replaces them.
The result is array<string, int>.

A different class with its own T would not be touched by this substitution, because the defining entity differs.

See also: Templates in depth for the parameter Element kind; standin for the inverse direction (collecting bounds from arguments); specialise for inheritance binding.

Inference: standin and infer

When the user calls a generic function or method without supplying explicit type arguments — id($x) rather than id<int>($x) — the analyser must infer the type arguments from the call-site arguments. Two primitives do the work:

standin — a single inference round: walk a parameter type against an argument type, and accumulate the bounds the argument imposes on each free template parameter.
infer — apply a fully-determined template environment to a type.

standin: one round of inference

standin is one round — one parameter, one argument, one walk. The analyser typically runs standin once per call-site argument, against the matching parameter type, before reading the accumulated environment.

The standin environment

The environment collects two kinds of bound for each template parameter:

Lower bounds ($\sigma \mathrel{<:} T$): types the argument provided that the parameter must accept.
Upper bounds ($T \mathrel{<:} \sigma$): types the parameter must produce that the argument is forced to expose.

The "upper" / "lower" terminology comes from the lattice direction:

graph BT
    Lower["argument-side type<br/>(lower bound on T)"] --> T["T (the parameter)"]
    T --> Upper["constraint type<br/>(upper bound on T)"]

A simple call id<T> invoked with id(7) accumulates a lower bound int <: T (the argument forces T to be at least int).

How standin walks

Like substitution, standin is a structural recursion. It walks the parameter type and the argument type together, in lockstep, and at each leaf — every template-parameter Element on the parameter side — it records a bound.

The recursion respects the same nesting substitution does: object type-args, list element types, callable signatures, intersections, conditionals, and so on.

When the parameter and argument structures don't align (a parameter expecting array<K, V> matched against an argument of type int), the walker silently fails to record bounds — there is no way to derive a bound from a structural mismatch. The analyser sees an empty (or partial) environment and reports a type error in its own layer.

infer: applying a determined environment

After all standin rounds complete, the analyser has an environment containing every bound. To produce the inferred type arguments, it:

For each parameter T with one or more lower bounds, joins them: T := lo_1 ⊔ lo_2 ⊔ … ⊔ lo_n.
Verifies the inferred T against any upper bounds: each T <: hi_i must hold.
Verifies the inferred T against the parameter's declared upper bound (the constraint).
Substitutes the inferred values back into the function's return type.

infer is the convenience that walks a type and substitutes every template-parameter Element using the joined lower bounds in the environment. Parameters with no lower bound are filled with their declared upper bound (or mixed).

Worked example: identity function

/**
 * @template T
 * @param T $x
 * @return T
 */
function id($x) { return $x; }

id(7);          // T inferred to int(7) (or generalised to int).
id("hello");    // T inferred to literal "hello" (or generalised to string).
id($foo);       // where $foo: Foo. T inferred to Foo.

For id(7):

The parameter type is T; the argument type is int(7).
standin walks them together; the parameter side is the T Element, so the environment records T → lower bound = int(7).
infer substitutes the function's return type (T) with the joined lower bound, producing int(7).

Worked example: collection

/**
 * @template T
 * @param array<T> $xs
 * @return T
 */
function head(array $xs) { /* ... */ }

head([1, 2, 3]);                  // T → int<1,3>
head(['a', 'b']);                 // T → 'a'|'b' (or generalised to string)
head([new Foo(), new Bar()]);     // T → Foo|Bar

For head([1, 2, 3]):

The parameter type is array<T>; the argument type is array<int, int<1,3>> (the literal array's inferred type).
standin recurses into the array's value parameter: parameter side is T, argument side is int<1, 3>.
The environment records T → lower bound = int<1, 3>.
infer substitutes the return type T with int<1, 3>.

Worked example: multiple arguments

/**
 * @template T
 * @param T $a
 * @param T $b
 * @return array{0: T, 1: T}
 */
function pair($a, $b): array { /* ... */ }

pair(1, "hello");  // T → int(1)|"hello" (generalised to int|string)

The analyser runs standin once per argument:

First argument: parameter T, argument int(1). Environment: T → [int(1)].
Second argument: parameter T, argument "hello". Environment: T → [int(1), "hello"].

infer joins the lower bounds, producing int(1) | "hello".

Variance and standin

Variance affects which side of the lattice direction standin records the bound on:

Covariant position: argument is a lower bound on T.
Contravariant position: argument is an upper bound on T.
Invariant position: argument is both a lower and upper bound (must be exact match).

flowchart LR
    Cov["param: Iterator&lt;T&gt; (T covariant)<br/>arg: Iterator&lt;int&gt;"] --> CovBound["T ≥ int (lower bound)"]
    Contra["param: Sink&lt;T&gt; (T contravariant)<br/>arg: Sink&lt;int&gt;"] --> ContraBound["T ≤ int (upper bound)"]
    Inv["param: Box&lt;T&gt; (T invariant)<br/>arg: Box&lt;int&gt;"] --> InvBound["T = int (both)"]

The standin walker queries the world for the variance of each parameter when descending into a parameterised type.

Why two operations

Conceptually, inference is one operation: "given parameter and arguments, return the inferred environment". The split into standin (per-argument) and infer (final synthesis) lets the analyser:

Run multiple standin rounds across multiple arguments.
Inject extra bounds from other sources (e.g. a hand-written @template T of int on the call site).
Verify intermediate states (e.g. abort inference early if a contradiction is found).
Apply the result to multiple types (the return type, the throws clause, derived types).

A subtle case: variables in arguments

Suppose the analyser is inferring inside a generic function body:

/**
 * @template T
 * @param T $x
 * @return T
 */
function id($x) { return $x; }

/**
 * @template U
 * @param U $y
 * @return U
 */
function wrap($y) { return id($y); }

For wrap's body call id($y):

The argument is $y of type U (a template-parameter Element).
The parameter is id::T (a different template-parameter Element).
standin records: id::T → lower bound = U.

The lower bound is itself a template-parameter Element (since U is still free in wrap's body). The inferred return type for id($y) is U. When wrap is later invoked with a concrete argument, the chain resolves.

See also: substitute for the substitution operation infer uses internally; specialise for inheritance-binding resolution; variance for how variance is consulted during walking.

Inheritance specialisation

When StringList extends ArrayList<string> is checked against Iterator<string>, the lattice has to compute what StringList supplies to Iterator's template parameter. The chain — StringList → ArrayList<string> → Iterator<T> (where ArrayList implements Iterator) — has to be unwound, and T has to be resolved through every step.

This is inheritance specialisation, written formally as $\mathit{specialise}(C, T, D\langle\bar\rho\rangle)$: given that descendant D is instantiated with arguments $\bar\rho$, what is the type of ancestor C's template parameter T?

What the world supplies

The lattice cannot resolve specialisation alone ; it requires codebase knowledge of which classes extend which, with what type arguments. The world answers, for any (descendant, ancestor, position), the type the descendant supplies to that parameter, expressed in the descendant's own template namespace.

When the lattice walks D <: C<\bar\rho>, it asks the world for each position of C's parameters and substitutes through.

What "in the descendant's namespace" means

The world's answer may itself contain template-parameter Elements that refer to the descendant's parameters. For example:

/**
 * @template X
 * @implements Iterator<X>
 */
class Bag implements Iterator { /* ... */ }

Bag says: "I implement Iterator<X>, where X is my parameter." The world's answer for Bag's contribution to Iterator's position 0 is a template-parameter Element for Bag::X.

When the lattice is checking Bag<int> <: Iterator<int>:

Ask the world for Bag's contribution to Iterator's position 0 ; the answer is Bag::X.
Substitute Bag's actual arguments through the answer: Bag::X := int → result is int.
Compare with the container's int, with Iterator's declared variance.

The substitution in step 2 is a substitute call with the binding Bag::X := int.

A multi-step chain

/**
 * @template T
 */
interface Iterator { /* ... */ }

/**
 * @template T
 * @implements Iterator<T>
 */
class ArrayIterator implements Iterator { /* ... */ }

/**
 * @template U
 * @extends ArrayIterator<U>
 */
class TypedList extends ArrayIterator { /* ... */ }

For TypedList<string> <: Iterator<string>:

Walk: TypedList → Iterator is not direct.
The world records: TypedList extends ArrayIterator. So TypedList's contribution to ArrayIterator's position 0 is TypedList::U.
The world records: ArrayIterator implements Iterator<T>. So ArrayIterator's contribution to Iterator's position 0 is ArrayIterator::T.
Compose: TypedList's contribution to Iterator's position 0 is the composed result, TypedList::U. (The world walks the chain on the analyser's behalf.)
The lattice substitutes TypedList's actual arguments: TypedList::U := string.
Result: string.
Compare with the container's Iterator<string> ; string $\equiv$ string (or covariant), passes.

Variance through the chain

Variance is per-parameter on each ancestor. The lattice consults the variance of Iterator's T (covariant), not TypedList's U (which has its own variance, declared independently).

This is correct: when checking TypedList<X> <: Iterator<Y>, the question is whether Iterator accepts the supplied parameter at the supplied variance, not whether TypedList's parameter is somehow compatible with Iterator's.

The same-class case

When the input and container are the same class, no chain walk is needed:

Box<int> <: Box<numeric>

The lattice asks Box's parameter variance and compares positionally. With T covariant: int <: numeric ✓. With T invariant: int $\equiv$ numeric is false (numeric admits float, int does not), the refines fails.

Resolving arity differences

If an ancestor declares more parameters than the descendant, the descendant must supply values for all of them. If an ancestor declares fewer, the descendant cannot supply more.

The world handles this in its inheritance-mapping implementation; the lattice does not enforce arity here.

When the world has no mapping

If the world has no mapping for a given (descendant, ancestor, position), the lattice falls back to using the parameter's upper bound (or mixed) for that position.

This is the conservative answer: the analyser couldn't prove the inheritance, so don't enforce a tight refinement.

A worked example

/**
 * @template T
 * @template-covariant V
 */
interface Map {
    /**
     * @param T $key
     * @return V
     */
    public function get($key);
}

/**
 * @template W
 * @implements Map<string, W>
 */
class StringMap implements Map { /* ... */ }

For StringMap<int> <: Map<string, int>:

Input class StringMap is not the same as container class Map.
The world says StringMap descends from Map.
The world supplies StringMap's contribution to Map's position 0: string (from @implements Map<string, W>, position 0).
The world supplies StringMap's contribution to Map's position 1: StringMap::W (the W from the @implements clause).
The lattice substitutes StringMap's actual args: StringMap::W := int. The position-1 result becomes int.
Compare positions:
- Position 0: string $\equiv$ string (Map's T is invariant; both pass).
- Position 1: int <: int (Map's V is covariant; passes).
Result: StringMap<int> <: Map<string, int>. ✓

Why specialisation lives outside the lattice

Specialisation requires codebase knowledge: which classes extend which, with what type arguments. The lattice itself is codebase-agnostic; it asks the world, which is the analyser's responsibility. Specialisation is the protocol between the lattice and the world for the inheritance case.

This separation is what lets the lattice's correctness be checkable in isolation: the inheritance mapping is a black-box function the world supplies, and the lattice's behaviour is correct given the world's answers. The world's answers themselves are the analyser's correctness concern.

See also: Templates in depth for the parameter Element kind; variance for the rules each position is checked under; substitute for the substitution applied during specialisation.

TypeId, ElementId, and identity

Suffete is a handle-based crate. Every interesting value — a Type, an Element, an interned string of element IDs, a parameter list, a known-items list, a defining entity — is referenced by a small NonZero handle. The actual data lives in process-global arenas. Handles are cheap to compare, hash, copy, and pass around. The arenas are append-only and never freed.

This chapter covers the four handles a user of the public API will see most.

`ElementId`

The interned handle to a single Element. Layout: NonZeroU32, packed as

[kind tag: 6 bits] [arena slot: 26 bits]

The high 6 bits encode the ElementKind (1..=63 ; 64 values reserved, 30+ used). The low 26 bits encode the per-kind arena slot (0..=2^26-1 ≈ 67 million, more than enough).

Two ElementIds compare equal iff they refer to the same canonical Element ; this is the interner's contract. Equality is one u32 compare; hashing is trivial.

use suffete::{ElementId, ElementKind};

// ElementId is Copy, Eq, Hash, Ord
let id: ElementId = suffete::prelude::INT;
assert_eq!(id.kind(), ElementKind::Int);

The kind() method is (id.raw() >> 26) as u8 cast through ElementKind. Constant-time, branch-free.

The view() method resolves the handle to the borrowed Element view, which carries a &'static SomeInfo for payload-bearing kinds:

use suffete::Element;
let elem: Element = id.view();
match elem {
    Element::Int(info) => { /* info: &'static IntInfo */ }
    Element::Object(info) => { /* info: &'static ObjectInfo */ }
    // ...
    _ => {}
}

For trivial kinds (no payload) the view returns a unit-like variant: Element::Null, Element::Never, Element::Mixed(...), etc.

`TypeId`

The interned handle to a Type: a union of Elements plus a small bag of FlowFlags plus 8 bits of caller-defined meta. Layout: NonZeroU64, packed as

[slot: 32 bits] [flags: 16 bits] [meta: 8 bits] [reserved: 8 bits]

slot — the index into the type-content arena. Two TypeIds with the same slot share the same interned Type (the same element-set).
flags — the FlowFlags bitset. Riding on the handle keeps the arena content-keyed; toggling a flag is bit-twiddling, not a re-intern.
meta — 8 bits of consumer-defined storage. Suffete never inspects it. Use it for tag-style metadata (provenance enum, severity, boolean markers); for anything that needs more bits or indexes a side table, the consumer should keep their own HashMap<TypeId, T>.
reserved — reserved for future suffete use; always zero. Not exposed publicly.

Equality and hashing compare all 64 bits: t1 == t2 means same content AND same flags AND same meta.

use suffete::{TypeId, FlowFlags, prelude::TYPE_INT};

// TypeId is Copy, Eq, Hash, Ord
let t: TypeId = TYPE_INT;
assert_eq!(t.flags(), FlowFlags::EMPTY);
assert_eq!(t.meta(), 0);

For comparison ignoring flags / meta:

let t1 = ...;
let t2 = ...;
assert!(t1.content_eq(&t2));   // same elements, ignore flags/meta

For deriving related handles in O(1) without touching the arena:

let with_flag = t.with_flags(t.flags().with_from_template_default(true));
let with_meta = t.with_meta(7);

The with_* methods return new handles in O(1) ; the underlying arena entry is unchanged.

Resolving a `TypeId` to its content

let view = t.as_ref();
let elements: &'static [ElementId] = view.elements;

The returned &'static reference is a real 'static slice into the per-type element-list arena ; safe to hold for the lifetime of the process.

`ElementListId`

A handle to an interned slice &'static [ElementId]. Used as the type of intersection-conjunct lists, of any-kind element lists stored on a payload.

use suffete::ElementListId;
use suffete::interner::interner;

let elements: &[ElementId] = &[suffete::prelude::INT, suffete::prelude::STRING];
let id: ElementListId = interner().intern_element_list(elements);
let resolved: &'static [ElementId] = interner().get_element_list(id);
assert_eq!(resolved, elements);

ElementListId is also NonZeroU32. Two lists with the same content have the same ElementListId.

`TypeListId`

A handle to an interned slice &'static [TypeId]. Used as the type of object type-args, of derived-info type lists.

Same shape as ElementListId. Same interning guarantee.

Construction discipline

The intended pattern for building a type:

Use TypeBuilder to push elements (and set flags) over a sequence of mutations.
Call build() to intern once.

The TypeBuilder chapter covers the API in detail. Direct interner calls are also possible:

let t = suffete::interner::interner().intern_type(
    &[suffete::prelude::INT, suffete::prelude::STRING],
    suffete::FlowFlags::EMPTY,
);

The interner handles canonicalisation (sort + dedup) and dedup against existing entries.

What guarantees handle equality

For TypeId:

Same (slot, flags, meta) triple → same handle.
Same content (sorted+deduped element list) + same flags + same meta → same handle (because the slot dedups by content).

For ElementId:

Same kind + same payload → same handle.

The interner enforces both. Two TypeIds constructed at different points in time, on different threads, with the same logical inputs, will compare ==. Likewise for ElementId.

This is what makes the lattice fast: two-element comparisons are one u32 apiece. Hashing a TypeId is hashing one u64.

Lifetime guarantees

&'static [ElementId] returned from interner methods is real 'static ; the arena is in a OnceLock for the process lifetime.
&'static SomeInfo returned via Element::Int(info) (etc.) is real 'static.
Atom (interned strings via mago_atom) is also real 'static.

The arenas grow over the lifetime of the process; they never shrink. Re-interning the same value is idempotent.

A worked example

use suffete::{TypeBuilder, ElementId, prelude::{INT, STRING}};

let t1 = TypeBuilder::new().push(INT).push(STRING).build();
let t2 = TypeBuilder::new().push(STRING).push(INT).build();
let t3 = TypeBuilder::new().push(INT).push(STRING).push(INT).build();

assert_eq!(t1, t2);  // sort makes order irrelevant
assert_eq!(t1, t3);  // dedup makes duplicates irrelevant

assert_eq!(t1.as_ref().elements.len(), 2);

See also: Constructing types: TypeBuilder and prelude for the user-facing build API; The ElementId tag layout for the bit-level details; Interning and the arenas for how the storage works.

Constructing types: TypeBuilder and prelude

There are two routes to a TypeId:

TypeBuilder — a mutable scratch buffer. Push elements, change flags, then build() to intern once. The intended path for most analyser-side type construction.
prelude — well-known constants for the common cases (INT, STRING, TYPE_NEVER, TYPE_MIXED, EMPTY_ARRAY, etc.). Use these directly when applicable.

For per-Element construction (named objects, sealed shapes, callable signatures, etc.), the ElementId constructors and the interner are the underlying tools; TypeBuilder is a layer on top.

`TypeBuilder`

use suffete::{TypeBuilder, TypeId, ElementId, FlowFlags};

let mut b = TypeBuilder::new();
b.push(suffete::prelude::INT);
b.push(suffete::prelude::STRING);
let t: TypeId = b.build();

TypeBuilder is a Vec<ElementId> + FlowFlags plus an optional origin TypeId. Mutations are direct vec operations; build() calls the interner once at the end.

Construction

TypeBuilder::new();                 // empty buffer, EMPTY flags
TypeBuilder::from_type(some_t);     // start from an existing TypeId; remembers it as origin

from_type enables the origin short-circuit: if build() is reached with the buffer in the same shape and flags as the origin, it returns the original TypeId without re-interning. Useful for transforms that mostly leave the type alone.

Element mutations

b.push(elem);                         // append
b.extend(iter_of_elems);              // extend
b.remove(elem);                       // remove first occurrence
b.remove_all(elem);                   // remove every occurrence
b.retain(|e| /* keep this? */);       // arbitrary filter
b.replace(old, new);                  // replace first occurrence
b.map(|e| /* transform */);           // in-place per-element map
b.flat_map(|e| /* expand */);         // 1-to-N expansion
b.contains(elem);                     // O(n) lookup (SIMD-accelerated)

Mutations preserve the order they happened in. The interner sorts and dedups on build.

Flag mutations

b.set_flags(FlowFlags::EMPTY.with_from_template_default(true));
b.modify_flags(|f| f.with_from_template_default(true));

Flags are FlowFlags ; a 16-bit bitset. The most relevant in user code:

from_template_default — set by suffete when a type-arg was filled with the parameter's upper bound rather than the user's value. Used by the variance check at refinement time.

The full list is in the reference.

Build modes

let t1 = b.build();             // sort + dedup, intern.
let t2 = b.build_canonical();   // sort + dedup + apply canonicalisation rules from `join`, intern.

build() runs the interner's structural canonicalisation: sort the element list by ElementId, dedup, intern. No subsumption, no range merging, no literal collapse. The result is the input verbatim, just canonicalised.
build_canonical() runs the join's canonicalisation: subsumption, range merging, literal collapse, true-union dominators, etc. Equivalent to lattice::join(t1, TYPE_NEVER, ...) for any t1. Use this when you want the analyser's "official" canonical form.

The build() path is used by the analyser when it knows the input is already canonical or when the caller doesn't want collapses (e.g. preserving int|literal-int as two distinct elements). build_canonical() is used when the analyser wants the smallest possible expression.

Origin short-circuit

let original = ...; // some existing TypeId

let t = TypeBuilder::from_type(original)
    .map(|e| if e == suffete::prelude::INT { suffete::prelude::STRING } else { e })
    .build();

// If the map happened to return every element unchanged, t == original.
// If anything changed, t is a new interned TypeId.

The dirty-tracking is conservative: any mutation that could have changed the buffer flips the dirty bit, even if the mutation was a no-op (e.g. remove(elem) for an elem not in the buffer). The dirty bit only affects the short-circuit; build always produces a correct result.

`prelude`

The prelude exposes well-known constants. Every Element kind that has a singleton trivial form has a constant; many of the common payload-bearing forms have one too.

Element constants (`ElementId`)

use suffete::prelude::*;

NEVER, MIXED, NULL, VOID, PLACEHOLDER,           // landmarks
TRUE, FALSE, BOOL,                                // booleans
INT, FLOAT, STRING, NUMERIC, SCALAR, ARRAY_KEY,  // scalars (unrefined)
NUMERIC_STRING,                                   // numeric-string
NON_EMPTY_STRING, EMPTY_STRING, INT_ZERO,         // common refinements + literals
OBJECT_ANY,                                       // any object
ITERABLE_MIXED_MIXED,                             // iterable<mixed, mixed>
CALLABLE,                                         // bare callable
RESOURCE, OPEN_RESOURCE, CLOSED_RESOURCE,         // resources
EMPTY_ARRAY,                                      // array{}

The complete list is in the prelude reference.

Type constants (`TypeId`)

The TYPE_* constants are one-element types wrapping the corresponding Element:

use suffete::prelude::*;

TYPE_NEVER, TYPE_MIXED, TYPE_NULL, TYPE_VOID,
TYPE_TRUE, TYPE_FALSE, TYPE_BOOL,
TYPE_INT, TYPE_FLOAT, TYPE_STRING, TYPE_NUMERIC, TYPE_SCALAR,
// ...

TYPE_INT is TypeBuilder::new().push(INT).build(), but pre-computed at boot time and exposed as a const. Use the TYPE_* constants when you need a TypeId for a single element ; it saves a build call.

Per-element construction

For Elements that need a payload, use the constructors on ElementId:

ElementId::int_literal(42);                     // int(42)
ElementId::string_literal("hello");             // literal "hello"
ElementId::int_range(IntRange::new(Some(0), Some(100)));  // int<0,100>
ElementId::named_object(atom("Foo"));           // Foo
ElementId::named_object_with_args(atom("Box"), &[TYPE_INT]);  // Box<int>
ElementId::enum_case(atom("Status"), atom("Active"));  // Status::Active
ElementId::generic_parameter(name, defining_entity, constraint);
ElementId::intersected(head, &[conjunct1, conjunct2]);  // head & conjunct1 & conjunct2

For more elaborate Elements (callable signatures, sealed shapes), call the interner methods directly:

use suffete::interner::interner;

let info = ObjectShapeInfo {
    known_properties: Some(...),
    flags: ObjectShapeFlags::default().with_sealed(true),
};
let shape = interner().intern_object_shape(info);

The full set of interner methods is in src/interner/store.rs, generated by the element_arena_methods! macro.

A worked example

Build the PHP type non-empty-list<int|string>|null:

use suffete::{TypeBuilder, ElementId};
use suffete::prelude::{INT, STRING, NULL};
use suffete::element::payload::{ListInfo, ListFlags};
use suffete::interner::interner;

// First the inner element type: int|string
let int_or_string = TypeBuilder::new().push(INT).push(STRING).build();

// Then the list element
let list_elem: ElementId = ElementId::list(int_or_string, /* non_empty: */ true);

// Finally the union with null
let result = TypeBuilder::new()
    .push(list_elem)
    .push(NULL)
    .build();

Three intern calls (one for the inner union's element list, one for the list payload, one for the outer union). All idempotent ; calling this code twice with the same inputs produces the same TypeId both times.

Performance notes

TypeBuilder::new() is Vec::new() ; no allocation until pushing.
push is Vec::push; amortised O(1).
build() allocates a temporary sorted vec; the cost is O(n log n) for the sort plus the interner's hash lookup. The interner has a fast path that detects already-sorted-and-unique input (via simd::is_sorted_strict) and skips the temporary allocation entirely.
Singleton types (build() on a one-element buffer) hit a per-Element cache and skip both sort and the dashmap lookup.

For analyser code that builds many types in a loop, reuse a single TypeBuilder across iterations:

let mut b = TypeBuilder::new();
for elem in elements_to_process {
    b.push(elem);
    let _ = b.build();
    b = TypeBuilder::new();   // or b.set_flags(FlowFlags::EMPTY) ; both reset
}

See also: TypeId, ElementId, and identity for the handle types; Prelude constants for the full list of prelude::*; Lattice options and reports for FlowFlags.

Predicates: is_X, contains_X, and friends

The predicates module is the place for single-call structural questions about a TypeId. Each predicate answers one question and returns a bool. They are pure — no World, no options, no report.

use suffete::predicates::{
    is_int, is_string, is_truthy, contains_null, is_singleton, is_constant_foldable,
    contains_template_anywhere, is_fully_resolved,
};

This chapter walks through the naming conventions, the families, and the cost model.

Naming conventions

The predicate names follow a strict scheme:

is_X(ty) — guaranteed: every Element of ty is in family X. Conservative: false when any Element is outside X, including for never (the all-bottom type).
contains_X(ty) — possible at the top level: at least one top-level Element of ty is in family X.
is_truthy(ty) / is_falsy(ty) — every Element guaranteed truthy / falsy at runtime.
could_be_truthy(ty) / could_be_falsy(ty) — at least one Element could be truthy / falsy.
*_anywhere(ty) — recurses into every nested-type carrier (object args, list element types, callable signatures, etc.). Use these for "does this tree contain any unresolved Element?" or "is there a free template anywhere?".

is_X is false on never: is_int(TYPE_NEVER) = false, because the empty type contains no Int Elements (vacuously, also no non-Int Elements). The conservative reading is what most analyser callers want; if you want vacuous-true, check is_X(ty) || ty == TYPE_NEVER.

Kind-family predicates

For each PHP type family, an is_X and (where useful) a contains_X:

`is_X`	`contains_X`	Family
`is_int`	`contains_int`	`Int`
`is_float`	`contains_float`	`Float`
`is_string`	`contains_string`	`String`
`is_bool`	`contains_bool`	`Bool`, `True`, `False`
`is_null`	`contains_null`	`Null`
`is_void`	`contains_void`	`Void`
`is_list`	—	`List`
`is_keyed_array`	—	`Array`
`is_array`	`contains_array`	`Array`, `List`
`is_iterable`	`contains_iterable`	`Iterable` (the `Iterable` kind only ; not arrays/lists)
`is_object`	`contains_object`	`Object`, `Enum`, `ObjectShape`, `HasMethod`, `HasProperty`, `ObjectAny`
`is_resource`	`contains_resource`	`Resource`
`is_callable`	`contains_callable`	`Callable`
`is_array_key`	—	`ArrayKey`
`is_scalar`	—	`Scalar`, `Int`, `Float`, `String`, `Bool`, `True`, `False`, `ClassLikeString`, `Numeric`, `ArrayKey`
`is_numeric`	—	`Numeric`, `Int`, `Float`
—	`contains_mixed`	`Mixed`

is_X is slice.iter().all(|e| e.kind() in family). contains_X is slice.iter().any(|e| e.kind() in family). Single-kind versions of these (is_int, contains_int, etc.) route through SIMD-accelerated scans; multi-kind versions use the scalar matches.

Truthiness predicates

is_truthy(ty)         // every Element guaranteed truthy
is_falsy(ty)          // every Element guaranteed falsy
could_be_truthy(ty)   // at least one Element could be truthy
could_be_falsy(ty)    // at least one Element could be falsy

The is_* variants are vacuously false for never. The could_be_* variants are also false for never (no Elements, so no possibility).

The truthiness implications are the same as the refinement axes chapter: an Object is always truthy, an empty array is always falsy, int(0) is falsy, int(7) is truthy, int<-∞,-1> | int<1,∞> is truthy, an unconstrained int is could_be both, etc.

Literal predicates

is_literal(ty)            // every Element is a literal-shaped value
                          // (specific int/float/string literal, true, false, null, void)
is_constant_foldable(ty)  // is_literal && is_singleton
                          // — most useful "can I constant-fold this?" check

is_literal is true for types like int(7), "foo" | "bar", true | false, int(0) | int(1). It is false for int, non-empty-string, Foo.

is_constant_foldable adds the singleton requirement: exactly one Element. Use this when the analyser wants to know whether the type can be replaced by a concrete value at this program point.

Structural predicates

is_never(ty)         // ty == TYPE_NEVER
is_mixed(ty)         // ty == TYPE_MIXED (vanilla, no narrowing)
is_singleton(ty)     // exactly one Element
is_union(ty)         // more than one Element

is_mixed is the vanilla mixed test ; it returns false for narrowed mixed variants (non-null mixed, truthy mixed, etc.). To detect any mixed variant, use contains_mixed.

Tree-walking predicates

Three predicates recurse into every nested-type carrier (using the inspect walker):

contains_mixed_anywhere(ty)         // any Mixed (vanilla or narrowed) anywhere in the tree
contains_template_anywhere(ty)      // any free GenericParameter anywhere
contains_placeholder_anywhere(ty)   // any Placeholder anywhere
contains_unresolved_anywhere(ty)    // Alias, Reference, MemberReference, GlobalReference, Conditional, Derived
is_fully_resolved(ty)               // negation of contains_unresolved_anywhere

These are the predicates to call before invoking the lattice on a possibly-unresolved type. The lattice's contract is that it works on resolved inputs; the analyser checks is_fully_resolved and calls expand if not.

Cost model

Predicates are cheap. The dispatch:

The kind-family is_X / contains_X (single kind) variants use SIMD scans with thresholds — sub-nanosecond for short slices, O(n / lane_width) for long ones.
The multi-kind variants (is_bool, is_object, etc.) use scalar matches! with early exit.
The truthiness predicates iterate Elements once and dispatch per-kind. Per-kind cost is constant (one to three comparisons). Total cost is O(n).
The literal predicates iterate Elements once with a per-kind kind-only check. O(n).
The tree-walking *_anywhere predicates use the inspect walker with a short-circuiting predicate. Cost is O(tree size) worst case, but typical short-circuit on the first hit gives O(small) in practice.

None of the predicates allocate. None take a &mut argument. They are safe to call from any context.

A worked example

use suffete::{TypeBuilder, prelude::{INT, STRING, NULL, TRUE}, ElementId};
use suffete::predicates::{
    is_int, is_string, is_singleton, contains_null, is_constant_foldable,
    is_truthy, could_be_falsy,
};

let int_or_string = TypeBuilder::new().push(INT).push(STRING).build();
assert!(!is_int(int_or_string));     // mixed kinds, not all int
assert!(!is_string(int_or_string));  // ditto
assert!(!is_singleton(int_or_string));
assert!(!contains_null(int_or_string));

let nullable_int = TypeBuilder::new().push(INT).push(NULL).build();
assert!(contains_null(nullable_int));

let lit = TypeBuilder::new().push(ElementId::int_literal(7)).build();
assert!(is_singleton(lit));
assert!(is_constant_foldable(lit));
assert!(is_truthy(lit));         // 7 is truthy
assert!(!could_be_falsy(lit));   // 7 is never falsy

let true_t = TypeBuilder::new().push(TRUE).build();
assert!(is_truthy(true_t));

When you need more than a predicate

Predicates answer single yes/no questions. For:

"Walk the type and collect data" → use inspect.
"Walk the type and produce a new type" → use transform.
"Compare two types" → use the lattice operations.

See also: Inspection: walking the tree for the underlying walker the *_anywhere predicates use; Refinement axes for the truthiness rules; Special elements for the landmark Elements (is_never, is_mixed).

Inspection: walking the tree

The inspect module is the place for deep, short-circuiting boolean queries on a TypeId. Where predicates answer single questions about top-level structure, inspect recurses into every nested-type carrier in the tree and stops the moment the answer is known.

use suffete::inspect;

inspect::any(ty, |elem| /* predicate */);   // true iff at least one Element in the tree satisfies
inspect::all(ty, |elem| /* predicate */);   // true iff every Element in the tree satisfies

The closure is called at every level: top-level union elements, plus every Element nested inside any payload. It is not called twice on the same Element.

What "every nested-type carrier" means

The walker descends through:

Object's type_args and intersections.
List's element_type and known_elements.
Array's key_param, value_param, known_items, intersections.
Iterable's key_type, value_type, intersections.
ObjectShape's known_properties, intersections.
HasMethod's and HasProperty's intersections.
Callable's Signature's parameters, return_type, throws.
ClassLikeString's OfType / Generic constraint.
GenericParameter's constraint.
Reference's type_args and intersections.
Conditional's 4 operands.
Derived's nested TypeIds (all 8 variants).
Negated's inner.
Intersected's head and conjuncts.

A trivial-kind Element (no payload to recurse into) is visited once and the walker moves on.

`any`

pub fn any<F: FnMut(ElementId) -> bool>(ty: TypeId, mut predicate: F) -> bool;

Returns true iff some Element anywhere in ty's tree satisfies predicate. Short-circuits on the first match.

use suffete::ElementKind;

let has_object = inspect::any(ty, |e| matches!(e.kind(), ElementKind::Object | ElementKind::Enum));

`all`

pub fn all<F: FnMut(ElementId) -> bool>(ty: TypeId, mut predicate: F) -> bool;

Returns true iff every Element anywhere in ty's tree satisfies predicate. Short-circuits on the first failure.

all is the negation of any with a flipped predicate:

inspect::all(ty, |e| /* P(e) */) == !inspect::any(ty, |e| !/* P(e) */)

Both signatures are available because all reads more naturally for the "for-every" case.

How the recursion works

The walker is post-order. For each Element it:

Calls the predicate on the Element itself.
If the predicate did not short-circuit, descends into every nested TypeId carrier (per the kind's payload).
For each nested TypeId, recurses with the same predicate.

The descent is implemented as a per-kind dispatch: the walker has a descend_object, a descend_list, etc. The dispatch is exhaustive ; every payload-bearing kind has a descender, and trivial kinds skip directly to the next.

Cost

The cost of inspect::any(ty, p) is bounded by the size of ty's tree, where the tree size is the number of Element occurrences across all nested types. For a flat type like int|string, the tree size is 2. For a deeply nested generic like Map<Foo, list<array{a: int, b: ?Bar}>>, the tree size is in the tens.

The walker does not allocate. It uses the call stack for recursion and a &mut F for the closure.

The most common use of inspect::any is the *_anywhere predicates (predicates) and the equivalent custom predicates analyser code writes. Performance is rarely a concern for this module ; the closure is the cost driver.

A worked example: collecting class names

The walker is for boolean queries; collecting data requires using a RefCell or owning the closure's environment:

use std::cell::RefCell;
use suffete::{ElementKind, inspect, TypeId};
use suffete::interner::interner;
use mago_atom::Atom;

fn collect_class_names(ty: TypeId) -> Vec<Atom> {
    let names = RefCell::new(Vec::new());
    inspect::any(ty, |e| {
        match e.kind() {
            ElementKind::Object => {
                names.borrow_mut().push(interner().get_object(e).name);
            }
            ElementKind::Enum => {
                names.borrow_mut().push(interner().get_enum(e).name);
            }
            _ => {}
        }
        false  // don't short-circuit; visit every Element
    });
    names.into_inner()
}

Returning false from the predicate keeps the walker going; the caller gets the full enumeration.

For a transformation (build a new type), use transform instead.

A worked example: finding a free template

use suffete::{ElementKind, inspect, TypeId};
use suffete::interner::interner;
use mago_atom::{Atom, atom};

fn contains_template_named(ty: TypeId, target: Atom, target_class: Atom) -> bool {
    inspect::any(ty, |e| {
        if e.kind() == ElementKind::GenericParameter {
            let info = interner().get_generic_parameter(e);
            info.name == target  /* and check the defining_entity matches target_class */
        } else {
            false
        }
    })
}

Short-circuits on the first hit. If the type doesn't contain the target template anywhere, the cost is the full tree walk.

The relationship to `predicates`

Several predicates in the predicates module are thin wrappers around inspect::any:

contains_mixed_anywhere(ty) is inspect::any(ty, |e| e.kind() == ElementKind::Mixed).
contains_template_anywhere(ty) is inspect::any(ty, |e| e.kind() == ElementKind::GenericParameter).
contains_placeholder_anywhere(ty) is inspect::any(ty, |e| e.kind() == ElementKind::Placeholder).

Use the predicate when you want the standard variant. Use inspect::any directly when you have a custom condition.

A subtle case: visiting the same Element twice

The walker is non-deduplicating. If two distinct nested positions reach the same ElementId (because the same Element appears in multiple places in the tree), the predicate is called for each occurrence, not just once per ElementId.

let int_t = TypeBuilder::new().push(suffete::prelude::INT).build();
let nested = TypeBuilder::new()
    .push(ElementId::keyed_array(int_t, int_t))   // array<int, int>
    .build();

let mut count = 0;
inspect::any(nested, |_| { count += 1; false });
// count == 3: the array Element + the two int Elements (one for k, one for v)

This is usually what you want; if you need uniqueness, accumulate into a HashSet<ElementId> in the closure's environment.

See also: Predicates for the standard wrappers; Transformation for the dual operation that produces a new type; Element kinds for the exhaustive list of payload-bearing kinds the walker descends into.

Transformation: map, flat_map, filter

The transform module is the place for building a new type from an existing one by applying a closure at every Element position. Where inspect recurses with a bool-returning predicate, transform recurses with an outcome-returning closure that decides whether to keep, replace, expand, or drop each Element.

use suffete::transform;

let result = transform::map(ty, |elem| /* new ElementId */);
let result = transform::flat_map(ty, |elem| /* iterator of ElementId */);
let result = transform::filter_map(ty, |elem| /* Option<ElementId> */);
let result = transform::filter(ty, |elem| /* bool */);

Each entry-point shape is a thin wrapper over the same underlying walker, which is post-order and recurses through every nested-type carrier (the same set as inspect).

The four entry points

`map`

pub fn map<F>(ty: TypeId, f: F) -> TypeId
where F: FnMut(ElementId) -> ElementId;

Apply f at every Element position; replace each in place. The returned type has the same shape as the input but with each Element substituted.

// Replace every `int` with `int<0, ∞>` (positive ints).
let result = transform::map(ty, |elem| {
    if elem == suffete::prelude::INT {
        ElementId::int_range(IntRange::new(Some(0), None))
    } else {
        elem
    }
});

`flat_map`

pub fn flat_map<I, F>(ty: TypeId, f: F) -> TypeId
where I: IntoIterator<Item = ElementId>, F: FnMut(ElementId) -> I;

Apply f at every Element position; expand each into zero or more new Elements. Useful when one Element decomposes into a union (e.g. an integer range split).

// Decompose every `int` into `int<-∞,-1> | int(0) | int<1,∞>`.
let result = transform::flat_map(ty, |elem| {
    if elem == suffete::prelude::INT {
        vec![negative_int, zero, positive_int]
    } else {
        vec![elem]
    }
});

`filter_map`

pub fn filter_map<F>(ty: TypeId, f: F) -> TypeId
where F: FnMut(ElementId) -> Option<ElementId>;

The combination of map and filter: Some(elem) keeps with replacement, None drops.

// Drop every Mixed Element; keep the rest unchanged.
let result = transform::filter_map(ty, |elem| {
    if elem.kind() == ElementKind::Mixed { None } else { Some(elem) }
});

`filter`

pub fn filter<F>(ty: TypeId, f: F) -> TypeId
where F: FnMut(ElementId) -> bool;

Keep an Element iff f returns true. Equivalent to filter_map with if f(elem) { Some(elem) } else { None }, but reads cleaner for the common case.

let result = transform::filter(ty, |elem| elem.kind() != ElementKind::Null);
// result is `ty` with all Null Elements removed

How the walker works

The walker is post-order:

For each Element at the current level, recurse into every nested TypeId carrier (per the kind's payload), transforming each via the same closure.
If any nested TypeId changed, re-intern the Element with the rebuilt payload.
Run the closure on the (possibly rebuilt) Element. The closure decides drop / replace / expand / leave.

Each level commits with a single intern_type call. Nothing is interned redundantly between levels.

Identity short-circuit

If the closure returns the input Element unchanged (or Some(input) for filter_map, or true for filter) at every level, the walker returns the original TypeId without re-interning. This is the common case when the transform happened to leave the type alone.

The closure can express this naturally:

transform::map(ty, |elem| {
    if elem == suffete::prelude::INT { suffete::prelude::INT } else { elem }
});
// Returns ty unchanged; walker detects no leaf changed.

Recursion stops at non-payload types

The walker descends into every payload-bearing kind, but the closure is not called on the inner TypeIds themselves ; the walker substitutes those recursively using the same closure. The closure sees only ElementIds.

For example, array<int, string> is one Element of kind Array with payload KeyedArrayInfo { key_param: TypeId(int), value_param: TypeId(string) }. The walker:

Recurses into TypeId(int), calling the closure on the Int Element.
Recurses into TypeId(string), calling the closure on the String Element.
If either changed, rebuilds the Array Element with the new key/value.
Calls the closure on the Array Element.
The closure decides what to do with the Array Element.

The closure can replace the entire Array Element with something else, or replace just the key parameter (by examining e.kind() and the kind's payload), or leave it alone.

A worked example: substitute a class name

Substitute every reference to class OldName with class NewName throughout a type:

use suffete::{ElementId, ElementKind, transform};
use suffete::interner::interner;
use mago_atom::{atom, Atom};

fn rename_class(ty: TypeId, from: Atom, to: Atom) -> TypeId {
    transform::map(ty, |elem| {
        match elem.kind() {
            ElementKind::Object => {
                let info = interner().get_object(elem);
                if info.name == from {
                    let mut new_info = *info;
                    new_info.name = to;
                    interner().intern_object(new_info)
                } else {
                    elem
                }
            }
            _ => elem,
        }
    })
}

The walker handles the recursion into Object's type_args, and into nested arrays/lists/iterables/etc. The closure only handles the leaf decision.

A worked example: drop nullability

use suffete::{ElementKind, transform};

fn drop_null(ty: TypeId) -> TypeId {
    transform::filter(ty, |elem| elem.kind() != ElementKind::Null)
}

The walker descends into every nested type and applies the filter. A nested ?int (i.e. int|null) becomes int. A nested array<int, ?Foo> becomes array<int, Foo>.

When to use `transform` vs `lattice` vs `template::substitute`

transform when the change is structural and Element-by-Element. Useful for renames, drops, kind-substitutions.
lattice::* when the question is set-theoretic (subtype, intersection, union, difference). The lattice handles the canonicalization rules; transform does not.
template::substitute when the change is parameter-driven: replacing a free GenericParameter with a concrete type. substitute is transform::map specialised to the parameter case.

Cost

Same as inspect: bounded by the tree size. The interner cost dominates when the transform actually changes things ; one intern call per level that has at least one changed Element. The identity short-circuit makes no-op transforms free.

See also: Inspection for the dual (boolean queries); Substitute for the parameter-substitution variant; TypeBuilder for an alternative when you want a flat, non-recursive build.

Hierarchy and World

Suffete is codebase-agnostic. It does not know which classes the user has declared, what they extend, what methods and properties they declare, what template parameters they have, what their declared variances are, or what type aliases the user has defined. All of that knowledge lives in the analyser and is exposed to suffete through one trait: World.

use suffete::world::World;

This chapter covers the trait surface and the conventions analyser implementations should follow.

The trait

World is a query interface. Every method takes a &self and returns a value (or None). Suffete never mutates the world; it only reads.

Sketch of the surface (the actual trait has more methods; this is the most-used subset):

pub trait World: Sized + Sync {
    // Inheritance
    fn descends_from(&self, descendant: Atom, ancestor: Atom) -> bool;
    fn inherited_template_argument(
        &self,
        descendant: Atom,
        ancestor: Atom,
        position: usize,
    ) -> Option<TypeId>;

    // Class properties and methods
    fn class_has_method(&self, class: Atom, method: Atom) -> bool;
    fn class_has_property(&self, class: Atom, property: Atom) -> bool;
    fn class_property_type(&self, class: Atom, property: Atom) -> Option<TypeId>;
    fn class_constant_type(&self, class: Atom, constant: Atom) -> Option<TypeId>;

    // Class-like classification
    fn is_final(&self, class_like: Atom) -> bool;
    fn is_interface(&self, class_like: Atom) -> bool;
    fn is_trait(&self, class_like: Atom) -> bool;
    fn is_enum(&self, class_like: Atom) -> bool;

    // Templates
    fn template_parameter_arity(&self, class_like: Atom) -> usize;
    fn template_parameter_at(&self, class_like: Atom, position: usize)
        -> Option<TemplateParameter>;
    fn template_parameter_index(&self, class_like: Atom, name: Atom)
        -> Option<usize>;

    // Enums
    fn enum_backing(&self, enum_name: Atom) -> Option<EnumBacking>;

    // Aliases
    fn resolve_alias(&self, name: Atom) -> Option<TypeId>;

    // ... and more
}

The exact method list lives in src/world/mod.rs.

What suffete asks the world

The lattice's family rules consult the world wherever it cannot answer from the type alone:

Object family (refines, meet, overlaps):
- descends_from(D, C) for inheritance checks.
- inherited_template_argument(D, C, i) for variance through inheritance.
- template_parameter_at(C, i) for declared variance.
- class_has_method(C, m) for Foo <: has-method<m>.
- class_has_property(C, p) for Foo <: has-property.
- class_property_type(C, p) for shape-vs-named subtyping.
- is_final(C) for finality-aware refines and meet.
Enum family:
- enum_backing(E) for the structural shape of Status::Active's value property.
Class-like-string family:
- descends_from(C, D) for class-string<C> <: class-string<D>.
- is_interface(C), is_enum(C), is_trait(C) for the kind-axis check.
Conditional and Derived expansion (in expand):
- class_constant_type(C, K) for MemberReference.
- resolve_alias(name) for Alias.
- The full set of derived-type queries.

The `NullWorld`

Suffete ships a trivial implementation: NullWorld. It returns:

descends_from: only descendant == ancestor.
class_has_method, class_has_property: always false.
All the others: None or zero or empty.

NullWorld is useful for:

Examples and tests where the analyser's codebase model is irrelevant.
The lattice's join canonicalisation, which uses NullWorld for the structural-only subsumption check (so that int|float is not collapsed to float via PHP runtime coercion).

use suffete::world::NullWorld;
let world = NullWorld;

Implementing `World`

The analyser implements World by reading from its codebase model. A sketch:

use suffete::world::{World, TemplateParameter, EnumBacking, Variance};
use suffete::TypeId;
use mago_atom::Atom;

pub struct AnalyserWorld {
    classes: HashMap<Atom, ClassInfo>,
    aliases: HashMap<Atom, TypeId>,
    // ... other tables ...
}

impl World for AnalyserWorld {
    fn descends_from(&self, d: Atom, a: Atom) -> bool {
        if d == a { return true; }
        let mut current = self.classes.get(&d);
        while let Some(info) = current {
            if info.parents.contains(&a) { return true; }
            current = info.parents.first().and_then(|p| self.classes.get(p));
        }
        false
    }

    fn class_has_method(&self, c: Atom, m: Atom) -> bool {
        self.classes.get(&c)
            .map(|info| info.methods.contains_key(&m))
            .unwrap_or(false)
    }

    fn template_parameter_at(&self, c: Atom, i: usize) -> Option<TemplateParameter> {
        let info = self.classes.get(&c)?;
        info.template_parameters.get(i).copied()
    }

    // ... and so on ...
}

The implementation can use any storage strategy ; HashMap, indexed slices, B-trees, lazy-loading from disk. Suffete only cares about the answers.

Performance contract

Suffete calls World methods frequently during lattice operations on object-family types. A refines(IntList, Iterable<int>) involves at least:

One descends_from (to confirm IntList implements Iterable).
One inherited_template_argument (to get IntList's contribution to Iterable's parameter).
One template_parameter_at (for the declared variance).

For an analyser checking thousands of refines calls per file, a slow World is a bottleneck. The conventions:

descends_from should be O(1) amortised. Pre-compute the transitive closure of inheritance; cache the answer.
class_has_method and class_has_property should be O(1). Use a HashMap.
inherited_template_argument should be O(1) amortised. Pre-compute the inheritance bindings.
template_parameter_at should be O(1). Index by position into a small vector.

The analyser pays the up-front cost when it ingests the codebase; suffete pays the per-query cost on every lattice call.

Threading

The trait requires Sync. Lattice operations are pure functions and can be called from multiple threads simultaneously, all sharing one &World. The analyser's implementation must allow concurrent reads (typical: read-only after ingestion).

A worked example: descendant lookup

use suffete::world::World;
use suffete::{TypeBuilder, ElementId};
use mago_atom::atom;

struct DemoWorld;

impl World for DemoWorld {
    fn descends_from(&self, d: mago_atom::Atom, a: mago_atom::Atom) -> bool {
        // For demo: class B descends from class A; everything else is itself only.
        if d == a { return true; }
        if d.as_str() == "B" && a.as_str() == "A" { return true; }
        false
    }

    // ... other methods stub out to None / false / 0 ...

    fn template_parameter_arity(&self, _: mago_atom::Atom) -> usize { 0 }
    fn template_parameter_at(&self, _: mago_atom::Atom, _: usize)
        -> Option<suffete::world::TemplateParameter> { None }
    fn template_parameter_index(&self, _: mago_atom::Atom, _: mago_atom::Atom)
        -> Option<usize> { None }
    fn inherited_template_argument(&self, _: mago_atom::Atom, _: mago_atom::Atom, _: usize)
        -> Option<suffete::TypeId> { None }
    fn class_has_method(&self, _: mago_atom::Atom, _: mago_atom::Atom) -> bool { false }
    fn class_has_property(&self, _: mago_atom::Atom, _: mago_atom::Atom) -> bool { false }
    fn class_property_type(&self, _: mago_atom::Atom, _: mago_atom::Atom) -> Option<suffete::TypeId> { None }
    fn class_constant_type(&self, _: mago_atom::Atom, _: mago_atom::Atom) -> Option<suffete::TypeId> { None }
    fn is_final(&self, _: mago_atom::Atom) -> bool { false }
    fn is_interface(&self, _: mago_atom::Atom) -> bool { false }
    fn is_trait(&self, _: mago_atom::Atom) -> bool { false }
    fn is_enum(&self, _: mago_atom::Atom) -> bool { false }
    fn enum_backing(&self, _: mago_atom::Atom) -> Option<suffete::world::EnumBacking> { None }
    fn resolve_alias(&self, _: mago_atom::Atom) -> Option<suffete::TypeId> { None }
}

let world = DemoWorld;
let class_a = TypeBuilder::new().push(ElementId::named_object(atom("A"))).build();
let class_b = TypeBuilder::new().push(ElementId::named_object(atom("B"))).build();

let mut report = suffete::lattice::LatticeReport::new();
let opts = suffete::lattice::LatticeOptions::default();

assert!(suffete::lattice::refines(class_b, class_a, &world, opts, &mut report));

Hierarchy module

Suffete ships a hierarchy module with helpers for managing class-hierarchy data structures. It's a convenience for analyser implementations that don't already have one. See src/hierarchy/mod.rs for the API.

The hierarchy module is not a World implementation by itself ; it's a kit you can compose into your World. Most analysers will have their own.

See also: Templates in depth for the template-related world methods; Specialise for the inheritance-binding protocol; Expand for the resolution methods on the world.

Casting and runtime compatibility

Two related but distinct operations live in their own modules:

cast — produces a new type that the source type would coerce to in a given target context.
compatibility — answers a boolean: does the source type, in any of its parts, share runtime compatibility with the target?

Both exist because PHP's runtime has rules that the static lattice doesn't fully capture. int does not refine float in the static sense (int(0) is not a float), but int can be passed to a float parameter (PHP coerces). The cast and compatibility operations let the analyser model those.

`cast`

use suffete::cast;

let result: TypeId = cast::to(input, target_kind, &world, options);

The exact API has multiple cast variants depending on the target:

cast::to_int(ty) — what is the type after (int)$x?
cast::to_float(ty) — what is the type after (float)$x?
cast::to_string(ty) — what is the type after (string)$x?
cast::to_bool(ty) — what is the type after (bool)$x?
cast::to_array(ty) — what is the type after (array)$x?

Each follows PHP's runtime cast semantics. The result is typically a single Element of the target kind, with as much refinement as can be preserved (e.g. casting int(0) to bool gives false; casting int<-∞,-1>|int<1,∞> to bool gives true).

use suffete::{TypeBuilder, prelude::{INT, INT_ZERO}};

let zero = TypeBuilder::new().push(INT_ZERO).build();
let one  = TypeBuilder::new().push(suffete::ElementId::int_literal(1)).build();
let any_int = TypeBuilder::new().push(INT).build();

let zero_bool = cast::to_bool(zero);    // false
let one_bool  = cast::to_bool(one);     // true
let any_bool  = cast::to_bool(any_int); // bool (could be either)

The cast operation is deterministic for refined inputs (a literal cast to a type produces a literal output) and conservative for unrefined inputs (an int cast to bool produces bool, since the analyser can't statically know which way it'll go).

`compatibility`

use suffete::compatibility;

let compatible: bool = compatibility::runtime_compatible(a, b, &world, options);

Asks: is there some pair of runtime values, one in a and one in b, that PHP would consider compatible? The answer is a boolean.

This is not the same as overlaps. overlaps is the static type-set intersection: do a and b share a value in the kind sense? runtime_compatible is broader: does PHP's runtime allow a value of a to be used where a value of b is expected?

Examples where runtime_compatible differs from overlaps:

int and float: overlaps returns false (no integer is a float); runtime_compatible returns true (PHP coerces).
numeric-string and int: overlaps returns false; runtime_compatible returns true (PHP coerces).
int and bool: overlaps returns false; runtime_compatible returns true in non-strict mode (PHP coerces 0 to false, non-zero to true).
Foo and class-string<Foo>: overlaps returns false (one is an object, the other a string); runtime_compatible returns true if the analyser is checking parameter passing where a class-string can produce an instance.

Use cases:

The analyser is checking a function call boundary in non-strict mode and wants to know whether to warn about the argument type.
The analyser is checking a switch statement against case values and wants to know which cases are reachable under PHP's loose comparison.
The analyser is checking an == (loose equality) operator and wants to know whether the comparison can possibly return true.

For strict questions ("could this value, statically, be both?"), use overlaps instead.

How the two relate

The runtime_compatible operation is a superset of overlaps:

overlaps(a, b) → runtime_compatible(a, b). (If they share a value, they're compatible.)
runtime_compatible(a, b) does not imply overlaps(a, b). (Compatibility includes coercion edges.)

refines(a, b) → runtime_compatible(a, b) (assuming a is inhabited).

Lattice options that affect cast / compatibility

The LatticeOptions php_runtime_coerce toggle controls whether the lattice itself admits coercion edges in refines. Cast and compatibility are always coercion-aware ; they exist to model the runtime. The php_runtime_coerce option toggles whether refines also models the runtime (in non-strict mode) or stays strict.

In strict-types mode (declare(strict_types=1)), refines should be called with php_runtime_coerce = false. The cast and compatibility operations are unchanged ; they always reflect runtime behaviour.

A worked example

use suffete::{TypeBuilder, prelude::{INT, FLOAT}, lattice::{self, LatticeOptions, LatticeReport}, world::NullWorld, compatibility, cast};

let world = NullWorld;
let opts = LatticeOptions::default();
let mut rep = LatticeReport::new();

let int_t = TypeBuilder::new().push(INT).build();
let float_t = TypeBuilder::new().push(FLOAT).build();

// Static refines: int does not refine float.
assert!(!lattice::refines(int_t, float_t, &world, opts, &mut rep));

// But runtime allows the coercion.
assert!(compatibility::runtime_compatible(int_t, float_t, &world, opts));

// Casting int to float gives float.
let casted = cast::to_float(int_t);
assert_eq!(casted, float_t);

When to use which

Question	Operation
Is `a` a strict subtype of `b`?	`lattice::refines`
Do `a` and `b` share a value statically?	`lattice::overlaps`
Is `a` runtime-compatible with `b` (including coercions)?	`compatibility::runtime_compatible`
What does `a` coerce to when forced to type `b`?	`cast::to_X`
What is the smallest type containing both `a` and `b`?	`lattice::join`
What is the type of `a`'s and `b`'s shared values?	`lattice::meet`

A subtle case: refines, overlaps, and runtime_compatible disagree

For int and float:

refines(int, float) = false (strictly: an int is not a float).
overlaps(int, float) = false (no integer is a float as a value).
runtime_compatible(int, float) = true (PHP coerces).
meet(int, float) = never (no shared values).
join(int, float) = int|float (no canonical merge).

The strict static analysis (refines, overlaps, meet) treats them as disjoint. The runtime model (runtime_compatible, cast) acknowledges the coercion. The analyser chooses which to consult based on the diagnostic it's producing.

See also: refines, overlaps for the strict static answers; Lattice options and reports for the php_runtime_coerce toggle.

Expansion: resolving unresolved elements

The expand module resolves the unresolved Element kinds — Alias, Reference, MemberReference, GlobalReference, Conditional, Derived, Variable — into structural types the lattice can reason about.

use suffete::expand;

let resolved: TypeId = expand::expand(input, &world, &template_env);

The contract is one direction: the analyser must call expand on a type before passing it to the lattice if the type might contain any unresolved Element. The lattice does not invoke expansion itself; the recursion would loop, and the analyser knows when expansion is safe in a way the lattice cannot.

Why expansion is the analyser's job

Resolving an Alias requires looking the alias name up in the analyser's alias table. Resolving a Conditional requires evaluating the subject vs target subtype check, which requires the world. Resolving a Derived requires walking into the target type and applying the transformation.

All of these are traversals the analyser may want to control: cache the result, abort on a cycle, expand only some kinds (e.g. expand aliases but leave conditionals lazy), produce diagnostics on unresolved names. The lattice would have one fixed strategy; the analyser benefits from having the choice.

What `expand` does

expand(ty, world, template_env) walks the type tree and replaces every unresolved Element with its structural form, recursing into nested types. The high-level rules:

Alias { name } — resolve via world.resolve_alias(name). If the alias is itself an alias chain, follow until structural.
Reference { name, type_args, intersections } — resolve name to a class-like or a template parameter via the world's symbol table. Substitute type_args if applicable.
MemberReference { class, name } — resolve the member name on the class via world.class_constant_type (or similar member-type query).
GlobalReference { name } — resolve via the analyser's global type variable table.
Conditional { subject, target, then, otherwise } — under template_env, check subject <: target. Return then if yes, otherwise if no.
Derived(...) — apply the per-variant transformation.
Variable { id } — look up id in the analyser's inference state.

The template_env argument is the current substitution environment — bindings from template parameters to types. Used by Conditional (to substitute subject and target before the check) and by Derived variants that reference templates.

Worked example: `Alias`

/** @type UserId = positive-int */

function find(UserId $id): User { ... }

The parameter type is parsed as Alias { name: "UserId" }. Before the analyser checks find(7), it expands:

let alias_t = ...; // contains Alias { name: "UserId" }
let resolved = expand::expand(alias_t, &world, &template_env);
// resolved == int<1, ∞>  (the underlying type of UserId)

The world's resolve_alias("UserId") returns the underlying type; expansion substitutes it.

Worked example: `Conditional`

/**
 * @template T
 * @return ($T extends int ? string : bool)
 */
function classify(): mixed { ... }

The return type is Conditional { subject: T, target: int, then: string, otherwise: bool }. After the call site has bound T := int(7):

let template_env = ...; // T → int(7)
let cond_t = ...; // Conditional element wrapped in a TypeId
let resolved = expand::expand(cond_t, &world, &template_env);
// Step 1: substitute template_env into the conditional.
// subject (T) becomes int(7); target stays int.
// Step 2: check int(7) <: int → true.
// Step 3: take the `then` branch.
// resolved == string

If T were bound to bool instead, the target check would fail and the otherwise branch (bool) would be taken.

Worked example: `Derived` (KeyOf)

/** @type Shape = array{a: int, b: string} */
/** @type Keys = key-of<Shape> */

The type Keys is Derived(KeyOf(Alias("Shape"))). Expansion:

Resolve the inner Alias("Shape") → array{a: int, b: string}.
Apply KeyOf to the result: extract the keys → 'a' | 'b'.

The Derived(KeyOf) variant has a per-kind transformation. For keyed arrays, it returns the union of literal-string keys (or array-key if the array is unsealed). For lists, it returns int (or a range of valid indices). For objects, it returns the property names.

Recursion and termination

Aliases can be chains: A = B, B = C, C = int. Expansion follows the chain. Suffete does not detect cycles directly ; the world's resolve_alias is expected to return None for unknown names, which terminates the chain naturally.

If the analyser needs cycle detection, it should detect cycles in its alias table during ingestion and report a diagnostic before suffete sees the type.

Partial expansion

Sometimes the analyser wants to expand some kinds but not others ; expand aliases but defer conditionals until enough template bindings exist, for example. The crate offers per-kind expansion helpers in addition to the catch-all expand::expand:

let alias_only = expand::expand_aliases(input, &world);
let conditionals_only = expand::expand_conditionals(input, &world, &template_env);

The exact list of partial expanders is in src/expand/. The catch-all is the most common entry point.

Cost

Expansion is O(tree size) for the type, plus the per-kind costs:

Alias and Reference resolution: one world query per kind.
Conditional resolution: one refines call (lattice cost) plus the substitution.
Derived resolution: variant-specific, but bounded by the input's tree size.

The most expensive cases are Conditional chains where each branch is itself a Conditional with a non-trivial subject ; the lattice is invoked recursively. Most analyser-level types resolve in microseconds.

Idempotence

expand is idempotent: expand(expand(t)) == expand(t) (assuming no world changes between calls). The analyser can call expand without worrying about double expansion.

Detecting whether expansion is needed

The predicates chapter exposes:

use suffete::predicates::is_fully_resolved;

if !is_fully_resolved(ty) {
    ty = expand::expand(ty, &world, &template_env);
}

This is the recommended pattern for analyser code that consumes types from the type-source layer (the parser, the codebase model, the docblock interpreter) and feeds them to the lattice.

A worked example: full pipeline

use suffete::{TypeId, expand, lattice::{self, LatticeOptions, LatticeReport}};
use suffete::predicates::is_fully_resolved;

fn analyser_check<W: suffete::world::World>(
    input: TypeId,
    expected: TypeId,
    world: &W,
    template_env: &TemplateEnv,
) -> bool {
    let input    = if is_fully_resolved(input)    { input    } else { expand::expand(input, world, template_env) };
    let expected = if is_fully_resolved(expected) { expected } else { expand::expand(expected, world, template_env) };

    let opts = LatticeOptions::default();
    let mut report = LatticeReport::new();

    lattice::refines(input, expected, world, opts, &mut report)
}

The pattern: expand each side if needed, then call the lattice. Trivial wrapper, but the right interface to enforce on every analyser-side query.

See also: Unresolved elements for the kinds expansion handles; Predicates for is_fully_resolved; World for the resolution methods used during expansion; Conditional and Derived for the per-variant rules.

Serialization

The serialize module provides serde implementations for TypeId, ElementId, and the surrounding handle types. The intended use case: persisting types across analyser runs (caching), exchanging types over a wire (LSP, RPC), or producing a human-readable rendering for diagnostics.

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Diagnostic {
    pub message: String,
    pub expected: suffete::TypeId,
    pub actual:   suffete::TypeId,
}

TypeId and ElementId implement Serialize and Deserialize ; the same is true of the support handles (ElementListId, TypeListId, etc.).

Wire format

The wire format serialises the content of the handle, not the bit pattern. Two reasons:

Cross-process stability. The bit pattern of a TypeId is a process-local arena slot ; it has no meaning in another process. The content (the element kinds, the payloads, the nested types) is portable.
Compactness for human-readable output. Producing a JSON like {"kind": "Object", "name": "Foo", "args": [{"kind": "Int"}]} is more useful in a diagnostic than {"slot": 1234, "flags": 0, "meta": 0}.

The exact format is suffete-defined; consumers should not rely on the specific shape, only on round-trip safety: serialise + deserialise produces an equivalent handle (same == after re-interning).

Round-trip safety

let original: TypeId = ...;
let json = serde_json::to_string(&original).unwrap();
let recovered: TypeId = serde_json::from_str(&json).unwrap();
assert_eq!(original, recovered);

Deserialising the JSON re-interns the type into the current process's arenas. The recovered TypeId may have a different bit pattern from the original (different process, different arena state) but compares equal via the content-based Eq.

Serialising types referencing the world

Some types reference names the world owns: class names, enum names, alias names, template parameter names. Serialisation includes the names. Deserialisation interns the names through the standard mago_atom::atom! machinery; the recovered type carries the same names.

This means: a deserialised type can be passed to a World of a different analyser run, and the world's queries on those names will work as expected (assuming the codebase still has the same names).

What is not serialised

The meta byte of TypeId. The meta is consumer-defined; serialisation includes it, but suffete does not know what it means.
Internal arena slot indices. They are process-local.
Any caches or precomputed state.

Performance

Serialisation walks the type tree and emits a structured representation. The cost is O(tree size). For an analyser caching thousands of types, the serialise-side cost is the dominant one (deserialise re-interns into existing arenas, which is fast).

For very large collections, bincode or postcard are faster than serde_json ; both are supported via serde's standard mechanisms.

A worked example

use suffete::{TypeBuilder, prelude::{INT, STRING}};

let original = TypeBuilder::new().push(INT).push(STRING).build();

// Serialise to JSON.
let json = serde_json::to_string_pretty(&original).unwrap();
println!("{}", json);

// Deserialise back.
let recovered: suffete::TypeId = serde_json::from_str(&json).unwrap();
assert_eq!(original, recovered);

Using serialize for diagnostics

Diagnostics typically want a human-readable rendering of the type, not the wire format. For that, use the Display implementation:

let pretty: String = format!("{}", original.as_ref());   // "int|string"

Or the Typed trait's pretty_with_indent method for multi-line, indented output suitable for hover-style displays.

The serialize module is for storage and transport. The Display / Typed::pretty methods are for diagnostics.

See also: Handles for the underlying types being serialised; the Typed trait for diagnostic-quality rendering.

Answering "is A a subtype of B?"

The most common analyser question. Recipe with the full setup, expansion, and result-handling.

The pattern

use suffete::{TypeId, lattice::{self, LatticeOptions, LatticeReport}, predicates::is_fully_resolved, expand};
use suffete::world::World;

fn is_subtype<W: World>(
    sub:    TypeId,
    sup:    TypeId,
    world:  &W,
    template_env: &TemplateEnv,
    strict: bool,
) -> (bool, LatticeReport) {
    // 1. Expand any unresolved Elements (Alias, Conditional, Derived, ...).
    let sub = if is_fully_resolved(sub) { sub } else { expand::expand(sub, world, template_env) };
    let sup = if is_fully_resolved(sup) { sup } else { expand::expand(sup, world, template_env) };

    // 2. Configure lattice options.
    let opts = LatticeOptions::default()
        .with_php_runtime_coerce(!strict);

    // 3. Run the lattice query, collecting report side-info.
    let mut report = LatticeReport::new();
    let result = lattice::refines(sub, sup, world, opts, &mut report);

    (result, report)
}

Reading the result

The (bool, LatticeReport) pair is structured for diagnostic emission:

bool == true ; the subtype check holds (under the chosen strictness).
bool == false ; the check failed; the analyser surfaces a type error.
report.causes ; bitset of which coercion-tolerant rules fired. Surface in the diagnostic if you want to warn (e.g. "this passes, but uses PHP's int-to-float coercion").

Strictness

The strict parameter maps to PHP's declare(strict_types=1):

strict mode (strict = true): no coercion edges. int <: float is false. numeric-string <: int is false.
non-strict mode (strict = false, the default): coercion edges admitted with CoercionCauses::PHP_RUNTIME_COERCE recorded.

The analyser typically chooses based on the file's declare(strict_types) directive. Files without a directive default to non-strict.

Worked examples

Direct subtype

use suffete::{TypeBuilder, prelude::{INT, STRING}};

let int_t  = TypeBuilder::new().push(INT).build();
let int_or_str = TypeBuilder::new().push(INT).push(STRING).build();

let (ok, _report) = is_subtype(int_t, int_or_str, &world, &template_env, true);
assert!(ok);

Coercion-only subtype (non-strict)

use suffete::{TypeBuilder, prelude::{INT, FLOAT}};

let int_t = TypeBuilder::new().push(INT).build();
let float_t = TypeBuilder::new().push(FLOAT).build();

let (ok_strict, _) = is_subtype(int_t, float_t, &world, &template_env, true);
assert!(!ok_strict);  // strict: int does not refine float.

let (ok_loose, report) = is_subtype(int_t, float_t, &world, &template_env, false);
assert!(ok_loose);    // non-strict: int coerces.
assert!(report.causes.contains(suffete::lattice::CoercionCauses::PHP_RUNTIME_COERCE));

Class-hierarchy subtype

use suffete::{TypeBuilder, ElementId};
use mago_atom::atom;

// Setup (in your World): class B extends A.

let b_t = TypeBuilder::new().push(ElementId::named_object(atom("B"))).build();
let a_t = TypeBuilder::new().push(ElementId::named_object(atom("A"))).build();

let (ok, _) = is_subtype(b_t, a_t, &world, &template_env, true);
assert!(ok);

Generic subtype with variance

// World registers Iterator<T> as covariant on T.

let int_iter = TypeBuilder::new().push(
    ElementId::named_object_with_args(atom("Iterator"), &[suffete::prelude::TYPE_INT])
).build();
let mixed_iter = TypeBuilder::new().push(
    ElementId::named_object_with_args(atom("Iterator"), &[suffete::prelude::TYPE_MIXED])
).build();

let (ok, _) = is_subtype(int_iter, mixed_iter, &world, &template_env, true);
assert!(ok);  // covariance: int <: mixed

Failure with a useful report

let foo = TypeBuilder::new().push(ElementId::named_object(atom("Foo"))).build();
let bar = TypeBuilder::new().push(ElementId::named_object(atom("Bar"))).build();

let (ok, report) = is_subtype(foo, bar, &world, &template_env, true);
assert!(!ok);

// report contains structured information you can use to construct
// a diagnostic message, like which family of rules failed.

Performance

refines is the hot path. Suffete optimises for the common cases:

Reflexivity: refines(t, t) = true in one comparison.
Singleton-vs-singleton: most analyser queries are between one-Element types; the cartesian is degenerate.
Subsumption shortcut in unions: if any Element on the left refines any Element on the right by the universal axioms (top, bot, placeholder), short-circuit.

Expected cost for a typical analyser query: tens of nanoseconds. Worst-case (deep generics, multi-conjunct intersections, full fan-out coverage): microseconds.

When to expand

Expand exactly once per type, before the first lattice query. The expansion result is itself a TypeId and can be cached per (input, world-version, template_env) if the analyser sees it many times.

The is_fully_resolved check is cheap (a tree walk with a short-circuiting predicate), so the recipe above always-checks-then-expands. For analyser code that knows expansion has already happened (e.g. inside a single statement's analysis), skip the check.

When to skip the world

If the question doesn't require the world (e.g. is_subtype(int_or_string, int)), you can pass NullWorld:

use suffete::world::NullWorld;
let (ok, _) = is_subtype(int_or_string, int_t, &NullWorld, &TemplateEnv::default(), true);

The lattice queries the world only when needed. For fully-structural inputs, the world is never asked.

See also: refines for the rules; overlaps for the symmetric "share a value?" question; LatticeOptions for the strictness toggles.

Building a union from scratch

Three ways to construct a union type, depending on what the analyser is doing.

Method 1: literal construction with TypeBuilder

For when the analyser knows the elements ahead of time (a parsed PHP type expression, a hand-coded fixture):

use suffete::{TypeBuilder, prelude::{INT, STRING, NULL}};

let int_or_string_or_null = TypeBuilder::new()
    .push(INT)
    .push(STRING)
    .push(NULL)
    .build();

build is structural: it sorts and dedups but does not apply join's canonicalisation rules (no range merging, no literal collapse, no subtype absorption).

If the analyser wants those rules applied:

let canonical = TypeBuilder::new()
    .push(INT)
    .push(STRING)
    .push(NULL)
    .build_canonical();

build_canonical runs the join's canonicalisation pass: subsumption, range merging, literal collapse, etc. Use it when the analyser wants the smallest representation.

Method 2: incremental, via lattice::join

For when the analyser is producing a union from a set of computed values (e.g. the union of a switch's case values, or the join of a control-flow merge):

use suffete::{TypeId, lattice};

fn join_all<W: suffete::world::World>(types: &[TypeId], world: &W) -> TypeId {
    let mut result = suffete::prelude::TYPE_NEVER;
    let opts = lattice::LatticeOptions::default();
    let mut report = lattice::LatticeReport::new();
    for &t in types {
        result = lattice::join(result, t, world, opts, &mut report);
    }
    result
}

The fold starts at TYPE_NEVER (the join identity) and applies join pairwise. The result is the canonical union of every input.

Each join call applies the canonicalisation rules: range merging, literal collapse, etc.

Method 3: from_type for incremental modification

For when the analyser already has a union and wants to push one or two more elements:

use suffete::{TypeBuilder, prelude::NULL};

let nullable = TypeBuilder::from_type(existing_type)
    .push(NULL)
    .build();

from_type enables the origin short-circuit: if the buffer is unchanged after the mutations, build() returns the original TypeId without re-interning. Useful when the mutations might be no-ops.

Comparison of the three methods

Method	Use when	Canonicalisation	Cost
`TypeBuilder::push * / build()`	Constructing from known elements	Structural only (sort, dedup)	Sort + intern lookup
`TypeBuilder::push * / build_canonical()`	Constructing from known elements, want canonical form	Full canonicalisation	Sort + canonicalise + intern
Repeated `lattice::join`	Folding a sequence	Per-step canonicalisation	N intern + N canonicalise
`TypeBuilder::from_type / push / build()`	Adding to an existing type	Structural only	Origin check + sort + intern

For most analyser code, build (or build_canonical if you need the canonical form) is the right choice. The repeated-join pattern is for pipelines where each input is computed lazily.

Worked example: union of switch case types

function f(int $kind): string {
    return match ($kind) {
        1, 2 => 'a',
        3 => 'b',
        default => 'c',
    };
}

The analyser computes the type of each match-arm body and joins them to produce the function's return type:

let arm1 = ...; // type of 'a' = literal "a"
let arm2 = ...; // type of 'b' = literal "b"
let arm3 = ...; // type of 'c' = literal "c"

let return_t = join_all(&[arm1, arm2, arm3], &world);
// return_t == "a"|"b"|"c"  (or, after literal collapse threshold, just `string`)

If the analyser wants to preserve the literals (no collapse), use TypeBuilder::push * / build. If the analyser wants the canonical form (let the lattice decide whether to collapse), use lattice::join (which uses the lattice's literal-collapse threshold) or build_canonical.

Worked example: making a type nullable

use suffete::{TypeBuilder, prelude::NULL, predicates::contains_null};

fn make_nullable(t: TypeId) -> TypeId {
    if contains_null(t) {
        return t;  // already nullable
    }
    TypeBuilder::from_type(t).push(NULL).build()
}

The contains_null check avoids the unnecessary intern lookup for already-nullable inputs. The from_type ensures the origin short-circuit fires when (paradoxically) the type was already nullable but we hit this code anyway.

Worked example: union of class names

The analyser has a list of classes the user declared in a docblock and wants to construct the type:

use suffete::{TypeBuilder, ElementId};
use mago_atom::Atom;

fn classes_to_union(names: &[Atom]) -> TypeId {
    let mut b = TypeBuilder::new();
    for &name in names {
        b.push(ElementId::named_object(name));
    }
    b.build()
}

let t = classes_to_union(&[
    mago_atom::atom("Foo"),
    mago_atom::atom("Bar"),
    mago_atom::atom("Baz"),
]);
// t == Foo | Bar | Baz

If two of the classes are related by inheritance, build_canonical (or a separate lattice::join fold) would absorb the descendant ; build keeps them distinct.

Performance notes

TypeBuilder::push is O(1).
build is O(n log n) for the sort plus the interner cost. The interner has a fast path that detects already-sorted-and-unique input via SIMD.
build_canonical is O(n²) worst case (subsumption is pairwise), but typical inputs are small.
lattice::join is the canonicalisation cost per pair, O(N) over the fold.

For analyser hot loops constructing many unions, consider:

Reusing one TypeBuilder instance across iterations (clear and re-push).
Pre-sorting if you know the input is canonical, so the interner's fast path fires.
Choosing build over build_canonical if you don't need the canonical form ; the latter is strictly more expensive.

See also: TypeBuilder for the construction API; join for the canonicalisation rules; Predicates for the contains_null and friends used as fast pre-checks.

Narrowing a parameter type from instanceof

A canonical analyser pattern: the parameter has type T, and the body has an instanceof check. Inside the if, the type is narrowed to the intersection; in the else, the type is narrowed to the difference.

The PHP

function f(Foo|Bar|null $x): void {
    if ($x instanceof Foo) {
        // $x is Foo here
    } else {
        // $x is Bar|null here
    }
}

The recipe

use suffete::{TypeId, lattice};
use suffete::world::World;

struct Branches { in_branch: TypeId, else_branch: TypeId }

fn narrow_instanceof<W: World>(
    input: TypeId,         // the parameter's type
    target: TypeId,        // the type from the instanceof's RHS, e.g. Foo
    world: &W,
) -> Branches {
    let opts = lattice::LatticeOptions::default();
    let mut report = lattice::LatticeReport::new();

    let in_branch = lattice::narrow(input, target, world, opts, &mut report);
    let else_branch = lattice::subtract(input, target, world, opts, &mut report);

    Branches { in_branch, else_branch }
}

The narrow for the positive case typically reduces to meet (input ⊓ target). The subtract for the negative case gives back the input minus the target.

Worked example

use suffete::{TypeBuilder, ElementId, prelude::NULL};
use suffete::world::NullWorld;
use mago_atom::atom;

let foo = ElementId::named_object(atom("Foo"));
let bar = ElementId::named_object(atom("Bar"));

let input = TypeBuilder::new().push(foo).push(bar).push(NULL).build();
let target = TypeBuilder::new().push(foo).build();

let world = NullWorld;
let branches = narrow_instanceof(input, target, &world);

// branches.in_branch == Foo
// branches.else_branch == Bar | null

When `target` is itself a union

if ($x instanceof Foo || $x instanceof Bar) {
    // $x is Foo|Bar here
}

The analyser computes the union of the two instanceof targets and passes it as the target. The recipe is unchanged ; narrow and subtract handle multi-element targets.

When `instanceof` is on a generic class

function f(mixed $x): void {
    if ($x instanceof Iterator) {
        // $x is Iterator<mixed, mixed> here (the parameter's upper bounds)
    }
}

The analyser constructs the target as Iterator<mixed, mixed> (or with whatever bounds the codebase declares). The narrowing handles this through the world's parameter declaration: each parameter is filled with its upper bound, and the from_template_default flag is set so the variance check is permissive.

let target = TypeBuilder::new().push(
    ElementId::named_object_with_args(
        atom("Iterator"),
        &[suffete::prelude::TYPE_MIXED, suffete::prelude::TYPE_MIXED],
    )
).build();

let branches = narrow_instanceof(input, target, &world);
// branches.in_branch == Iterator<mixed, mixed>
// (with the from_template_default flag set on the args)

When the input has no overlap with the target

function f(string $x): void {
    if ($x instanceof Foo) {
        // unreachable; $x is `never` here
    }
}

let input = TypeBuilder::new().push(suffete::prelude::STRING).build();
let target = TypeBuilder::new().push(ElementId::named_object(atom("Foo"))).build();
let branches = narrow_instanceof(input, target, &world);
// branches.in_branch == TYPE_NEVER
// branches.else_branch == string

The analyser sees in_branch == TYPE_NEVER and can surface a diagnostic ("this branch is unreachable").

When the input fully refines the target

function f(Foo $x): void {
    if ($x instanceof Foo) {
        // always true; no narrowing needed
    } else {
        // unreachable; $x is `never` here
    }
}

let input = TypeBuilder::new().push(foo).build();
let target = TypeBuilder::new().push(foo).build();
let branches = narrow_instanceof(input, target, &world);
// branches.in_branch == Foo (unchanged)
// branches.else_branch == TYPE_NEVER

The analyser can warn ("this instanceof is always true") or use the result silently.

Negation pattern

function f(Foo|Bar|null $x): void {
    if (!($x instanceof Foo)) {
        // $x is Bar|null here
    } else {
        // $x is Foo here
    }
}

The analyser computes the same Branches and swaps which branch corresponds to which:

let branches = narrow_instanceof(input, target, &world);
// In the !instanceof branch (the if-body): branches.else_branch
// In the else branch: branches.in_branch

Combining with other narrowings

The analyser can chain narrowings as control flow proceeds:

function f(Foo|Bar|null $x): void {
    if ($x === null) { return; }     // narrow: subtract null
    if ($x instanceof Foo) { ... }   // narrow: meet with Foo
}

After the null check, the analyser narrows $x to Foo|Bar. Inside the instanceof Foo branch, narrows again to Foo. Each step is a standalone narrow or subtract call.

Performance

narrow and subtract are both lattice operations. Their cost is bounded by the input sizes; for typical analyser inputs (small unions, no fan-out), each call is sub-microsecond.

The analyser typically runs many narrowings during a single function body's analysis. Reuse the LatticeOptions and LatticeReport if possible; both are cheap to construct, but reusing avoids the Default cost in tight loops.

See also: narrow, subtract, meet for the underlying operations; predicates::is_never for detecting unreachable branches in the result.

Resolving a generic class against an instance

The analyser sees a method call on a generic class. The class is Box<T>; the method declares function get(): T; the instance is Box<int>. What is the return type of $box->get() ?

This is the canonical generic-resolution recipe.

The PHP

/**
 * @template T
 */
class Box<T> {
    /** @return T */
    public function get(): mixed { ... }
}

$box = new Box<int>();
$x = $box->get();   // type of $x is int

The recipe

use suffete::{TypeId, ElementId, ElementKind, template::substitute};
use suffete::element::payload::{ObjectInfo, GenericParameterInfo, DefiningEntity};
use suffete::interner::interner;
use suffete::world::World;
use mago_atom::Atom;

fn resolve_method_return<W: World>(
    instance_type: TypeId,        // Box<int>
    method_name: Atom,            // "get"
    declared_return: TypeId,      // T (a free GenericParameter)
    world: &W,
) -> TypeId {
    // 1. Extract the class name and arguments from the instance.
    let i = interner();
    let view = instance_type.as_ref();
    let elements = view.elements;
    if elements.len() != 1 || elements[0].kind() != ElementKind::Object {
        // not a single-class type; analyser fallback (e.g. union of resolutions)
        return declared_return;
    }

    let obj_info: &ObjectInfo = i.get_object(elements[0]);
    let class_name = obj_info.name;
    let class_entity = i.intern_defining_entity(DefiningEntity::ClassLike(class_name));
    let actual_args = obj_info.type_args
        .map(|id| i.get_type_list(id).to_vec())
        .unwrap_or_default();

    // 2. Substitute the class's template parameters with the instance's args.
    substitute(declared_return, &|info: &GenericParameterInfo| -> Option<TypeId> {
        if info.defining_entity != class_entity {
            return None;
        }
        let position = world.template_parameter_index(class_name, info.name)?;
        actual_args.get(position).copied()
    })
}

How it works

Pull the class name and actual arguments out of the instance type. (For a multi-class union, the analyser would resolve each branch and join.)
Walk the declared return type, replacing every GenericParameter Element whose defining_entity matches the class with the corresponding actual argument.
The walker handles deeply-nested uses of the parameter (e.g. array<T, T> with T := int becomes array<int, int>).

Worked example

use suffete::{TypeBuilder, ElementId};
use suffete::element::payload::{GenericParameterInfo, DefiningEntity};
use suffete::interner::interner;
use mago_atom::atom;

let i = interner();
let box_class = i.intern_defining_entity(DefiningEntity::ClassLike(atom("Box")));

// Declared field type: T
let t_param: ElementId = i.intern_generic_parameter(GenericParameterInfo {
    name: atom("T"),
    defining_entity: box_class,
    constraint: suffete::prelude::TYPE_MIXED,
    qualifier: None,
});
let declared_return = TypeBuilder::new().push(t_param).build();

// Instance: Box<int>
let instance = TypeBuilder::new().push(
    ElementId::named_object_with_args(atom("Box"), &[suffete::prelude::TYPE_INT])
).build();

// Resolve.
let resolved = resolve_method_return(instance, atom("get"), declared_return, &world);
// resolved == int

Multi-parameter case

/**
 * @template K
 * @template V
 */
class Map {
    /**
     * @param K $key
     * @return V
     */
    public function get($key) { /* ... */ }
}

The recipe walks the entire declared return type. For Map<string, int>::get, the declared return is V, which substitutes to int. For a hypothetical Map<string, int>::keys() declaring array<K>, the return would substitute to array<string>.

The substitution is per-parameter; the position lookup via template_parameter_index handles each.

Default-filled parameters

If the user wrote Box instead of Box<int>, the analyser fills T with the upper bound (mixed by default) and stamps the type with from_template_default. The recipe produces mixed as the resolved return; the variance check at the call site tolerates the default-fill (recording CoercionCauses::TEMPLATE_DEFAULT on the report).

Multi-class union

function f(Box<int>|Bag<string> $bw): mixed {
    return $bw->get();
}

The analyser resolves each branch separately and joins:

fn resolve_method_on_union<W: World>(
    instance_type: TypeId,
    method_name: Atom,
    world: &W,
) -> TypeId {
    let mut result = suffete::prelude::TYPE_NEVER;
    let opts = LatticeOptions::default();
    let mut report = LatticeReport::new();
    for &elem in instance_type.as_ref().elements {
        // Look up the method's declared return on this Element's class.
        let declared = world.method_return_type(elem, method_name);  // hypothetical
        let resolved = resolve_method_return(
            TypeBuilder::new().push(elem).build(),
            method_name,
            declared,
            world,
        );
        result = lattice::join(result, resolved, world, opts, &mut report);
    }
    result
}

The join folds the per-branch resolutions into the final return type.

Inheritance

If class IntBox extends Box<int> and the analyser sees (new IntBox())->get():

The recipe is unchanged ; the instance type is IntBox, not Box. To get the declared return for get (which lives on Box), the analyser looks up the method on the world ; the world walks the inheritance and returns Box::get's declared return.

For substitution, the parameter is T on Box (not T on IntBox). The recipe's class_entity should be Box, not IntBox. The actual argument is found via inherited_template_argument(IntBox, Box, 0) = int.

A more robust recipe handles this:

fn resolve_method_return_with_inheritance<W: World>(
    instance_type: TypeId,
    method_name: Atom,
    declared_return: TypeId,
    declaring_class: Atom,        // Box (where get is declared)
    world: &W,
) -> TypeId {
    let i = interner();
    let view = instance_type.as_ref();
    let elements = view.elements;
    if elements.len() != 1 || elements[0].kind() != ElementKind::Object {
        return declared_return;
    }

    let obj_info: &ObjectInfo = i.get_object(elements[0]);
    let instance_class = obj_info.name;
    let actual_args = obj_info.type_args
        .map(|id| i.get_type_list(id).to_vec())
        .unwrap_or_default();

    // For each parameter declared on the declaring class, resolve via inheritance.
    let arity = world.template_parameter_arity(declaring_class);
    let mut bindings: Vec<Option<TypeId>> = Vec::with_capacity(arity);
    for pos in 0..arity {
        // The declaring class's parameter at position pos.
        // Find what the instance class supplies.
        if instance_class == declaring_class {
            bindings.push(actual_args.get(pos).copied());
        } else {
            let inherited = world.inherited_template_argument(instance_class, declaring_class, pos);
            // The inherited type may itself reference the instance class's parameters;
            // substitute those with actual_args.
            bindings.push(inherited.map(|t| substitute_class_params(t, instance_class, &actual_args, world)));
        }
    }

    let declaring_class_entity = i.intern_defining_entity(DefiningEntity::ClassLike(declaring_class));

    substitute(declared_return, &|info: &GenericParameterInfo| -> Option<TypeId> {
        if info.defining_entity != declaring_class_entity {
            return None;
        }
        let position = world.template_parameter_index(declaring_class, info.name)?;
        bindings.get(position).copied().flatten()
    })
}

fn substitute_class_params<W: World>(
    ty: TypeId,
    instance_class: Atom,
    actual_args: &[TypeId],
    world: &W,
) -> TypeId {
    let i = interner();
    let instance_entity = i.intern_defining_entity(DefiningEntity::ClassLike(instance_class));
    substitute(ty, &|info: &GenericParameterInfo| -> Option<TypeId> {
        if info.defining_entity != instance_entity {
            return None;
        }
        let position = world.template_parameter_index(instance_class, info.name)?;
        actual_args.get(position).copied()
    })
}

The lattice's specialise chapter is the formal protocol behind this recipe. The recipe above is what the analyser does to get the resolved return type.

Performance

The substitution walker is O(tree size) per call. The world queries (template_parameter_index, inherited_template_argument) are typically O(1) amortised in a well-implemented analyser. Total cost per method-call resolution is sub-microsecond for typical analyser inputs.

See also: Substitute for the substitution operation; Specialise for the inheritance protocol; Templates in depth for the parameter-Element details.

Walking a type to find every class name

The analyser wants to extract every class-like name a type references — for refactoring, for symbol tracking, for diagnostics. The recipe uses inspect with a closure that collects into a RefCell.

The recipe

use std::cell::RefCell;
use suffete::{TypeId, ElementKind, inspect};
use suffete::interner::interner;
use mago_atom::Atom;

fn collect_class_names(ty: TypeId) -> Vec<Atom> {
    let names = RefCell::new(Vec::new());
    inspect::any(ty, |elem| {
        let i = interner();
        match elem.kind() {
            ElementKind::Object => {
                names.borrow_mut().push(i.get_object(elem).name);
            }
            ElementKind::Enum => {
                names.borrow_mut().push(i.get_enum(elem).name);
            }
            ElementKind::ClassLikeString => {
                use suffete::element::payload::ClassLikeStringSpecifier;
                let info = i.get_class_like_string(elem);
                if let ClassLikeStringSpecifier::Literal(name) = info.specifier {
                    names.borrow_mut().push(name);
                }
            }
            _ => {}
        }
        false  // don't short-circuit; visit every Element
    });
    names.into_inner()
}

Returning false from the predicate keeps the walker going. If you want to short-circuit on the first hit (e.g. "does this type reference any class?"), return true after a match.

How `inspect::any` enumerates

The walker is post-order and recurses into every nested-type carrier (inspect chapter has the full list). For each Element:

Recurse into nested types (object args, list element, callable params, etc.).
Call the closure on this Element.
If the closure returns true, short-circuit; otherwise continue.

A type like Box<Map<string, Foo|Bar>> | null produces this enumeration order (post-order):

string, Foo, Bar, Map, Box, null

The recipe above pushes only the class-like Elements (Foo, Bar, Map, Box), giving [Foo, Bar, Map, Box].

Worked example

use suffete::{TypeBuilder, ElementId};
use mago_atom::atom;

let inner = TypeBuilder::new()
    .push(ElementId::named_object(atom("Foo")))
    .push(ElementId::named_object(atom("Bar")))
    .build();

let map_t = TypeBuilder::new().push(
    ElementId::named_object_with_args(atom("Map"), &[suffete::prelude::TYPE_STRING, inner])
).build();

let outer = TypeBuilder::new().push(
    ElementId::named_object_with_args(atom("Box"), &[map_t])
).push(suffete::prelude::NULL).build();

let names = collect_class_names(outer);
// names == [Foo, Bar, Map, Box]  (in some order)

Deduplicating

The walker visits the same ElementId once per occurrence, not per ID. If a class appears multiple times in the tree, the closure runs multiple times. To dedupe:

use std::collections::HashSet;

fn collect_unique_class_names(ty: TypeId) -> HashSet<Atom> {
    let names = RefCell::new(HashSet::new());
    inspect::any(ty, |elem| {
        let i = interner();
        match elem.kind() {
            ElementKind::Object => { names.borrow_mut().insert(i.get_object(elem).name); }
            ElementKind::Enum   => { names.borrow_mut().insert(i.get_enum(elem).name); }
            _ => {}
        }
        false
    });
    names.into_inner()
}

HashSet::insert does the dedup; the closure runs as many times as there are occurrences, but only unique names are stored.

Reach into intersection conjuncts

The walker recurses into Object's intersections automatically. For Foo & Bar, the recipe collects both Foo and Bar. For deeply nested intersections (e.g. Foo & Bar & Baz<Qux>), the recipe collects all four.

Filtering by `class_like_kind`

If you want only class names (not interfaces, traits, enums), the recipe can filter:

fn collect_class_only<W: World>(ty: TypeId, world: &W) -> Vec<Atom> {
    let names = RefCell::new(Vec::new());
    inspect::any(ty, |elem| {
        if elem.kind() == ElementKind::Object {
            let name = interner().get_object(elem).name;
            // Filter: only keep classes, exclude interfaces/traits.
            if !world.is_interface(name) && !world.is_trait(name) {
                names.borrow_mut().push(name);
            }
        }
        false
    });
    names.into_inner()
}

The world query keeps the recipe codebase-aware.

Performance

inspect::any is O(tree size) when the closure never short-circuits. For typical analyser types (tree size of a few dozen Elements), the walk is sub-microsecond.

The closure cost is the dominant factor. The recipes above do one interner().get_object(elem) per Object Element, which is one &'static arena read per element ; sub-nanosecond.

Allocating the result vector or hashset has its own cost; for hot paths, reuse a pre-allocated buffer.

A subtle case: ClassLikeString specifier

ClassLikeString Elements may carry their class name in the Literal(name) specifier, in the OfType { constraint } (a generic constraint), or in the Generic { constraint } (a ::class lookup on a generic). The recipe above handles only the Literal case.

For OfType and Generic, the constraint is a TypeId ; the walker descends into it automatically and the closure sees any Object Elements inside the constraint.

Variant: collect class names with positions

If the analyser wants to know where in the type each class name appeared (e.g. for diagnostic spans), the closure can track context using a stack in its environment:

let names_with_positions = RefCell::new(Vec::new());
let position = RefCell::new(0u32);
inspect::any(ty, |elem| {
    *position.borrow_mut() += 1;
    if elem.kind() == ElementKind::Object {
        let name = interner().get_object(elem).name;
        names_with_positions.borrow_mut().push((name, *position.borrow()));
    }
    false
});

The position counter increments on every visit, giving each class a stable position in the walk.

See also: Inspection for the walker; Element kinds for the full list of payload-bearing kinds and their nested-type carriers; World for codebase-aware filtering.

Interning and the arenas

Every interesting value in suffete — an Element, a Type, an element list, a known-items list, a defining entity, a callable signature — is interned. This chapter covers how interning works and why suffete leans on it for both correctness and performance.

This is internals reading. The public API works without understanding it. But if you're optimising analyser-side code or debugging an arena-related issue, this chapter is the one.

What interning gives you

Two ElementIds with the same logical content are the same handle. Equality is u32 == u32. Hashing is u32 as u64. Two parts of a process that construct the same Element independently end up with the same handle and never know they raced.

Three consequences:

Cheap equality. No structural compare, no recursion, no hash chain walk. One CPU compare.
Stable identity. A TypeId constructed today is the same handle as one constructed tomorrow (same process). Useful for caching.
Compact representation. Every Element is one u32. A TypeId is one u64. A union of n Elements is one u64 plus a borrowed slice of n u32s.

What interning costs

Interning has a fixed memory cost: the dedup table. Suffete uses dashmap::DashMap for the dedup, which is concurrent-friendly but not zero-cost. The lookup is a hash of the value plus a bucket walk.

For a content-keyed value, the hash is over the content (the kind tag plus the payload). For a slice-keyed value, the hash is over the slice contents.

The intern operation:

Hash the input.
Probe the dashmap.
On a hit, return the existing slot.
On a miss, allocate the value, push it into the per-kind arena, return the new slot.

Steps 2-3 dominate: the hash is fast, the dashmap lookup is the cost.

The arena types

Two basic shapes:

Arena<T> — interns whole values of T. One slot per unique T. Used for: per-element-kind payloads, defining entities, callable signatures, known-items lists, etc.
SliceArena<T> — interns slices of T. One slot per unique slice content. The slice is leaked to 'static on first sight. Used for: the union body slice, object type-args, derived type lists, known-property lists, etc.

Both store underlying values in a boxcar::Vec (a thread-safe append-only vector). Both expose 1-based slot indices ; slot 0 is reserved as the NonZero{U32,U64} niche.

The interner singleton

There is exactly one interner per process. It contains every arena, lives behind a OnceLock for the process lifetime, and is Sync ; concurrent use from multiple threads is supported and uncontended in the common case (the dashmap shards by hash).

Per-element-kind arenas

Each payload-bearing Element kind gets its own arena. The interner exposes per-kind intern + lookup methods. Trivial-kind elements (no payload) don't need interning ; their handles are constructed directly from the kind tag and the canonical slot, and the prelude exposes them as constants.

The Type interner

The Type arena is keyed on the content slice plus the flags. Two Types with the same elements (in any order) and the same flags are the same TypeId.

Type interning does:

Sort and dedup the input slice (canonicalisation).
Intern the sorted slice into the element-list slice arena to get a stable handle.
Intern the (slice_handle, flags) pair into the Type arena to get a slot.
Construct the TypeId from (slot, flags, meta=0, reserved=0).

The sort+dedup step uses sort_unstable + dedup, with a SIMD prefilter that skips both when the input is already canonical. Most analyser-side calls hit the fast path because the join layer produces canonical output.

A singleton-Type cache is layered on top: interning a one-element slice is a hashmap lookup keyed on the single Element ID, skipping the sort + the dashmap hash entirely.

Memory growth

Arenas only grow. There is no GC, no reference counting, no compaction. The trade-off:

Pro: every handle is stable for the process lifetime; no use-after-free; no churn.
Con: a long-running analyser session that processes many distinct types accumulates them all.

In practice, the analyser's working set saturates: most types are constructed once and reused many times, and the arena converges. Mago's expected memory budget for the type arenas is in the tens of megabytes for a large monorepo.

If the analyser truly needs per-session deallocation (e.g. an LSP server processing many distinct codebases), the recommendation is to spawn a fresh process per session. Suffete does not support arena resets within a process.

Concurrency

The dashmap is the concurrency point. Two threads interning the same value race on the dashmap entry; whichever wins inserts, the other reads back the winner's slot. Both end up with the same handle.

The arenas themselves (boxcar) are append-only and lock-free for read; writes synchronise minimally. Two threads pushing different values append independently.

Performance numbers

Approximate, on a modern x86_64 desktop:

Trivial-kind handle access: zero ; it's a const.
Cache-hit intern: ~30ns.
Cache-miss intern (first time this payload is seen): ~100ns plus the cost of the boxed clone if the content is fresh.
Singleton-Type cache hit: ~10ns.
Slow-path Type intern: ~100-300ns depending on slice length.
Payload lookup by handle: ~5ns; one boxcar index plus a deref.

The lattice's hot loops are dominated by the family rules, not interning. Interning is the constant-cost overhead.

See also: The ElementId tag layout, SIMD scans, Performance philosophy.

The ElementId tag layout

ElementId is a NonZeroU32 newtype. The 32 bits are split into two fields:

bit  31 30 29 28 27 26 | 25 24 23 ... 2 1 0
     [   kind tag    ] | [   arena slot   ]
       6 bits             26 bits

Field	Bits	Width	Range
Kind tag	31..26	6	1..=63 (64 values reserved; 30+ used)
Arena slot	25..0	26	0..=2^26-1 ≈ 67 million

The kind tag is 1-based: tag 0 is reserved as the NonZeroU32 niche, so ElementId(0) is impossible and Option<ElementId> is the same size as ElementId.

Why this layout

Three reasons:

One-shot dispatch. A kind() call is a right-shift by 26 plus a cast through ElementKind. No memory access, no branch.
Compact ElementId. 32 bits per Element handle means the union body for an n-element type takes 4n bytes ; a hot cache line fits 16 Elements.
SIMD-friendly. A slice of ElementId is a slice of u32. AVX2 processes 8 lanes per 256-bit register; NEON processes 4 per 128-bit. The kind extraction is a vectorised right-shift; the equality scan is a vectorised compare. See SIMD scans.

Why 6 bits for the kind

PHP's type system has roughly 30 distinct Element kinds. 6 bits gives 63 usable values (excluding the niche), which leaves room for growth. 5 bits would have been tight; 7 would have wasted a bit.

Why 26 bits for the slot

26 bits gives 67 million slots per kind. The largest arena in any real-world analyser is the Object arena (every distinct (name, type_args, flags) tuple is one entry), which tops out in the millions on the largest codebases. 67 million is far beyond what any analyser is expected to need.

The slot space is per-kind, so two Elements with the same arena-slot index but different kinds are still distinct ElementIds.

How `kind()` works

impl ElementId {
    const KIND_BITS: u32 = 6;
    pub(super) const SLOT_BITS: u32 = u32::BITS - Self::KIND_BITS;  // 26
    const SLOT_MASK: u32 = (1u32 << Self::SLOT_BITS) - 1;            // 0x03FF_FFFF

    pub const fn kind(self) -> ElementKind {
        let tag = (self.0.get() >> Self::SLOT_BITS) as u8;
        // SAFETY: every ElementId is constructed from a valid ElementKind.
        unsafe { core::mem::transmute(tag) }
    }
}

The transmute is sound because every ElementId is constructed via ElementId::new(kind, slot), which sets the high bits to kind as u8 << 26. Tag values that don't correspond to an ElementKind discriminant cannot occur.

Why the kind tag is in the high bits

Two reasons:

Sort order matches kind order. Sorting a &[ElementId] by raw u32 value clusters Elements by kind first, then by slot within kind. The interner relies on this for fast canonical-form detection (the simd::is_sorted_strict primitive checks adjacent strict ordering).
SIMD shift access. AVX2's vpsrld (packed shift right logical, dword) takes an immediate count; right-shifting 8 lanes by 26 in one instruction extracts the kind tag from each lane simultaneously.

If the tag were in the low bits, a left-shift would be needed to compare against a ElementKind value, but the resulting compare would still work; the high-bit choice is a small win for the SIMD path.

The `NonZero` niche

NonZeroU32 requires the value to be non-zero. ElementId(0) is therefore impossible, which lets Option<ElementId> be 32 bits (the niche stores None as the all-zero pattern).

The kind tag is 1-based: ElementKind::Null is discriminant 1, not 0. Since the tag occupies the high 6 bits, the all-zero u32 would correspond to "kind 0, slot 0", which is never a valid Element ; the niche is naturally available.

TypeId is similar but bigger

TypeId is NonZeroU64, with this layout:

bit 63 ... 32 | 31 ... 16 | 15 ... 8 | 7 ... 0
[ slot: 32 ]  | [flags:16] | [meta:8] | [reserved: 8]

slot: 32 bits ; index into the type-content arena.
flags: 16 bits ; FlowFlags bitset.
meta: 8 bits ; consumer-defined.
reserved: 8 bits ; always zero.

Flags and meta ride on the handle (they don't change the slot), so toggling a flag is bit-twiddling, not a re-intern.

A subtle case: `TypeId` equality

Two TypeIds compare equal iff all 64 bits match. That includes flags and meta. To compare just content: t1.content_eq(&t2) ignores flags and meta. The full == is for "are these the exact same handle for use in caches keyed on the full handle".

See also: Interning and the arenas, SIMD scans.

SIMD scans

The element::simd module provides SIMD-accelerated scans over &[ElementId] slices. Six primitives, all of them dispatch through scalar fallback / AVX2 (x86_64) / NEON (aarch64), with hand-tuned inline assembly on the AVX2 path.

This is the lowest level of the crate. The lattice and predicate code consume these primitives; the user-facing API is unaware of them.

The primitives

Function	Returns	Used by
`contains(slice, needle)`	`bool` ; needle anywhere in slice	join's canonicalize, predicates
`position_of(slice, needle)`	`Option<usize>` ; first index of needle	TypeBuilder::remove, TypeBuilder::replace
`any_of_kind(slice, kind)`	`bool` ; some Element has this kind	predicates::contains_X, lattice prefilters
`all_of_kind(slice, kind)`	`bool` ; every Element has this kind (false on empty)	predicates::is_X
`count_of_kind(slice, kind)`	`usize` ; count of Elements with this kind	join's literal-collapse threshold
`is_sorted_strict(slice)`	`bool` ; slice is strictly increasing under unsigned u32 order	interner's canonicality fast path

The dispatch:

x86_64 + AVX2 (runtime-detected via std::is_x86_feature_detected): 8 lanes (256 bits) per iteration. Hand-rolled inline assembly using FFmpeg's pointer-end trickery (one register holds a negative byte offset that doubles as both the load displacement and the loop counter).
aarch64 + NEON (baseline ISA, no runtime check): 4 lanes (128 bits) per iteration. NEON intrinsics.
Other architectures or short slices: tight scalar loop.

The thresholds are 8 lanes for AVX2 and 4 lanes for NEON. Below that, scalar wins because the SIMD setup outweighs the parallel work.

Why ElementId-as-u32 is a perfect fit

ElementId is a NonZeroU32 (layout chapter). A slice of ElementId is contiguous 32-bit lanes. Equality scans (contains, position_of) and kind scans (any_of_kind, all_of_kind, count_of_kind) reduce to a per-chunk:

Load 8 (AVX2) or 4 (NEON) lanes into a SIMD register.
(For kind scans) Right-shift by 26 to extract the kind tag.
Compare-equal against a broadcasted needle (or kind value).
Reduce to a scalar via movemask (AVX2) or maxv/minv (NEON).
Branch on the reduction.

The per-chunk cost is constant; the per-iteration overhead is one load, two-to-three SIMD ops, one branch.

The FFmpeg pointer-end trick

The AVX2 paths use a register-saving technique borrowed from FFmpeg. Instead of:

mov  rcx, 0
.loop:
  vmovdqu  ymm1, [rsi + rcx*4]
  ; ... compare ...
  inc  rcx
  cmp  rcx, r8        ; r8 = chunk count
  jb   .loop

The trick is:

mov  rdi, [end of slice]   ; pointer to end-of-chunked-region
mov  rcx, -bytes           ; negative byte offset
.loop:
  vmovdqu  ymm1, [rdi + rcx]
  ; ... compare ...
  add  rcx, 32             ; byte stride = 32 (8 lanes × 4)
  jl   .loop

The negative offset doubles as the loop counter and the addressing-mode displacement. The loop tail is add + jl (two instructions, one micro-op fusion) instead of inc + cmp + jb (three instructions).

The savings are small per iteration but real, especially on long slices.

NEON: intrinsics, not assembly

NEON lacks a per-lane movemask instruction; the scalar reduction is done with vmaxvq_u32 (max across lanes) or vminvq_u32 (min across lanes) plus a per-element-position bit-set trick. The intrinsics version is fast enough that hand-rolled assembly doesn't help.

NEON is baseline on AArch64 (every aarch64 CPU has it). No runtime detection needed; the SIMD path runs whenever the slice is long enough.

Where the primitives are called from

The SIMD primitives are called by hot paths the lattice traverses:

lattice::overlaps and lattice::refines use simd::any_of_kind as a prefilter for the Negated, Intersected, Mixed, Object, etc. families ; "if no Element has this kind, skip the family rule entirely".
lattice::join's canonicalisation uses simd::contains to detect well-known dominators (MIXED, NEVER, BOOL, RESOURCE, etc.) before applying the rule.
predicates::is_int, is_string, etc. use simd::all_of_kind as their core.
predicates::contains_int, contains_string, etc. use simd::any_of_kind.
intern_type's slow path uses simd::is_sorted_strict to skip the sort + dedup when the input is already canonical.
TypeBuilder::remove and TypeBuilder::replace use simd::position_of.

When the SIMD threshold isn't met

Most analyser-side unions are small (1-5 Elements). The threshold gates ensure that the SIMD code only runs when there's enough work to pay back the setup. On short slices, the scalar fallback runs ; LLVM autovectorises what it can.

Why hand-rolled assembly

The autovectoriser generates correct SIMD code, but it doesn't know:

That the loop bound is the chunk count (not the byte count).
That the broadcast can be done once outside the loop.
That vpcmpeqd + vpmovmskb + test + jnz is a tighter early-exit than the equivalent generated code.
That the FFmpeg pointer-end trick saves one micro-op per iteration.

For the AVX2 paths, hand-rolled assembly is consistently 10-30% faster than the autovectorised scalar on long slices. For the NEON paths, intrinsics-with-careful-loop are within 5% of hand-rolled assembly, and the maintenance is much lower.

Safety

Every SIMD function is marked unsafe fn. The public entry points are safe and gate on the threshold + the runtime feature detection (for AVX2). Inside the SIMD function:

Unaligned loads (vmovdqu, vld1q_u32) — both architectures support these without alignment.
Bounds: the function is called only when the slice is at least THRESHOLD lanes; the chunk count is len / lanes_per_chunk, capped to fit.
Tail handling: each function has a scalar tail loop for the leftover lanes after the chunked region.

The unsafe blocks are local to each function and have SAFETY comments documenting the invariants.

Performance numbers

Approximate, on a modern x86_64 desktop, for a slice of 64 Elements:

simd::contains (hit early): ~5ns.
simd::contains (miss to end): ~20ns.
simd::any_of_kind (hit early): ~6ns.
simd::any_of_kind (miss to end): ~25ns.
simd::is_sorted_strict (already sorted, 64 elements): ~30ns.
Scalar equivalent (64 elements): ~3-5x slower across the board.

For typical analyser unions (5-10 Elements), the scalar fallback runs and the cost is on the order of the per-Element work itself ; nanoseconds per call.

See also: The ElementId tag layout, Interning and the arenas, Performance philosophy.

Performance philosophy

Suffete is a hot path. An analyser checking a large codebase calls refines, overlaps, meet, join, subtract, narrow and the predicates millions of times. A constant-factor regression in any of those shows up as a percentage of total analysis time.

This chapter is the philosophy. Specific optimisations are in the interner, element-id layout, and SIMD chapters.

The principles

1. Performance is a feature, not an afterthought

Performance regressions stop landing in suffete the same way correctness regressions stop. Every PR runs the codspeed benchmark suite; a regression beyond a small threshold blocks the merge. The benchmarks cover the lattice operations on a representative mix of inputs (small unions, large unions, deep generics, intersections, narrowed mixed).

This is not "ship and optimise later". The cost model is part of the design.

2. Compact representation beats clever rules

Most of suffete's speed comes from:

ElementId is a u32. Equality is one compare.
TypeId is a u64. Equality is one compare.
Slices of ElementId are dense u32 arrays. SIMD-friendly.

The lattice rules themselves are not particularly clever ; they are the textbook subtype rules. The speed comes from the data being amenable to fast comparison and SIMD scans, not from algorithmic tricks in the rules.

3. Intern aggressively, free nothing

The interner never frees. Every ElementId and TypeId is 'static. The trade-off is committed: stable handles in exchange for accumulated memory.

In practice, the analyser's working set saturates. After ingesting a codebase, the arena is "warmed" and most subsequent queries hit existing entries. A long-running session converges; it does not grow without bound except as new types appear.

For sessions that need true per-context isolation (LSP servers across distinct codebases), the recommendation is process-per-session rather than arena-reset.

4. Pure functions, no global mutable state

Every operation is a pure function of its inputs (TypeIds, &World, LatticeOptions). Same inputs → same output. No internal state.

This buys:

Concurrency. The lattice can be called from any thread without coordination.
Memoisation. Callers can cache results keyed on the input handles, without worrying about invalidation.
Testability. Property tests work because the operations are deterministic. The algebraic-law battery checks identities that only hold for pure functions.

The single piece of mutable state that exists — the interner — is concurrent-safe and append-only. Reads see a consistent snapshot.

5. Hot paths are tuned, cold paths are clear

The hot paths (the lattice operations, the SIMD primitives, the interner's lookup) are optimised aggressively: hand-rolled assembly, careful branch ordering, prefilter gates, inline annotations. The cold paths (the construction APIs, the diagnostic helpers, the serialize layer) prioritise clarity over speed.

The split is informed by codspeed. A function that doesn't show up in the benchmarks gets readable code; a function that does gets the optimisation effort.

6. Codspeed is the source of truth

We do not benchmark locally. Local microbenchmarks vary by CPU, OS, background load, and disk cache. The numbers don't transfer.

Codspeed runs every PR on a controlled VM and reports relative deltas to the base commit. A "5% regression" means 5% on Codspeed's hardware, which is not your hardware, but is the same hardware as the previous run. The signal is reproducible.

Locally we ship hunches and let codspeed vote. If a change is hopefully an improvement, we push and look at the benchmark output. If codspeed flags it as a regression, we revert; if it's a win, we keep.

7. SIMD is for the things SIMD is good at

SIMD wins when the data is naturally u32 or u64 lanes, the operation is per-lane independent (compare-equal, shift-extract, etc.), and the slice is long enough to amortise the setup. SIMD does not win when the data is heterogeneous (each Element triggers a different per-kind branch), the work per Element is heavyweight (a recursive refines call cannot be vectorised), or the slice is short.

Suffete uses SIMD for the slice scans and as prefilters in the lattice (does this slice contain any Negated Element? gate the family rule on the answer). The lattice's per-pair work is not SIMD; it can't be.

8. Allocation is a tax

Every allocation is a roundtrip to the allocator. The lattice's hot path is allocation-free: handles are passed by value, the report is mutated in-place, the workspace lives on the stack.

Callers should follow the same: reuse LatticeReport instances across queries, reuse TypeBuilder instances when constructing many types, prefer single-element prelude constants over building one-element types from scratch.

9. Branches are a tax too

Modern CPUs predict branches well, but predictable branches are a constant cost; mispredicted branches are 10-20× worse. Suffete's hot loops minimise both:

Family dispatch in the lattice is a match on (a.kind(), b.kind()), which the compiler turns into a jump table for the common case.
The universal axioms (top, bot, placeholder) are checked first; once they fire, the rest of the dispatch is dead code.
The SIMD scans use early-exit: on a match, branch out of the loop; on a miss, branch back to the loop start. The loop-back is well-predicted.

10. The lattice is not the only thing that matters

Suffete is one piece of an analyser's pipeline. The analyser's parser, the docblock interpreter, the symbol table, the codebase ingestion, the diagnostic emitter — all of these have their own performance budgets. Suffete's job is to be one constant-factor cost in that pipeline, not to dominate it.

If suffete becomes the bottleneck of a real analyser, that's a bug. The expected cost of suffete in a representative analysis is measured in low single-digit percent of total analyser time.

What doesn't matter

A few things that seem like they should matter, but don't:

The exact number of cycles per refines call. It varies by case. The codspeed delta is what matters.
Avoiding small allocations in cold paths. Vec::new is cheap. Prematurely optimising it makes the code worse and saves nothing.
Inlining everything. #[inline] helps when LLVM doesn't see the benefit; it hurts when overused (code bloat, instruction cache pressure). Suffete uses #[inline] for the small-and-hot functions and lets LLVM decide for the rest.
Lock-free data structures everywhere. The interner uses dashmap because it's a concurrent hashmap, not because every read needs to be lock-free. The lattice's hot path doesn't take locks at all.

When in doubt

Ship it. Look at codspeed. Revert if it regresses; keep it if it doesn't. Don't theorise about performance ; measure.

See also: Interning and the arenas, The ElementId tag layout, SIMD scans.

Element kinds

Every Element is one of the following kinds. The kind is the high 6 bits of the ElementId ; see the layout chapter. Trivial-kind Elements have no payload; payload-bearing kinds carry a &'static SomeInfo reference into the matching arena.

This is reference material, organised by family.

Trivial kinds (no payload)

Kind	Prelude constant	Denotes
`Null`	`NULL`	The value `null`.
`Never`	`NEVER`	The empty type ($\bot$).
`Void`	`VOID`	The PHP `void` (absence of return value).
`Placeholder`	`PLACEHOLDER`	Inference-time hole.
`Bool`	`BOOL`	`true` or `false`.
`True`	`TRUE`	The value `true`.
`False`	`FALSE`	The value `false`.
`Scalar`	`SCALAR`	`bool \| int \| float \| string`.
`Numeric`	`NUMERIC`	`int \| float \| numeric-string`.
`ArrayKey`	`ARRAY_KEY`	`int \| string`.
`ObjectAny`	`OBJECT_ANY`	Any object, no class commitment.
`Mixed`	`MIXED` (vanilla)	Universe top ($\top$), with optional axes via `MixedInfo`.

Mixed is technically payload-bearing (the MixedInfo carries the four axes), but the vanilla form (all axes default) is exposed as a constant and behaves trivially.

Scalar payload kinds

Kind	Payload	Denotes
`Int`	`IntInfo`	Integer (Unspecified, UnspecifiedLiteral, Literal(n), Range(IntRange)).
`Float`	`FloatInfo`	Float (Unspecified, Literal, NonZero).
`String`	`StringInfo`	String with literal slot + refinement flags.
`ClassLikeString`	`ClassLikeStringInfo`	A string naming a class-like (kind + specifier).

See scalars and class-like strings.

Object family

Kind	Payload	Denotes
`Object`	`ObjectInfo`	Named class (with optional type args, intersections, modality flags).
`Enum`	`EnumInfo`	Enum (with optional case).
`ObjectShape`	`ObjectShapeInfo`	Structural object type (with known properties, sealed/unsealed).
`HasMethod`	`HasMethodInfo`	Anything declaring a method named `m`.
`HasProperty`	`HasPropertyInfo`	Anything declaring a property named `p`.

See objects.

Collection family

Kind	Payload	Denotes
`Array`	`KeyedArrayInfo`	Keyed array (generic, sealed shape, or empty).
`List`	`ListInfo`	Int-keyed list (generic or sealed).
`Iterable`	`IterableInfo`	`iterable<K, V>`.

See arrays and iterables and callables.

Resource

Kind	Payload	Denotes
`Resource`	`ResourceInfo`	PHP resource (open / closed, named kind).

See resources.

Callable

Kind	Payload	Denotes
`Callable`	`CallableInfo`	Bare, signature, or closure form.

See iterables and callables.

Wrappers

Kind	Payload	Denotes
`Negated`	`NegatedInfo`	Set complement: `!T`.
`Intersected`	`IntersectedInfo`	Head + conjunct list: `H & C1 & ... & Cn`.

See wrappers.

Generics

Kind	Payload	Denotes
`GenericParameter`	`GenericParameterInfo`	Free template parameter (`T`).

See templates.

Unresolved

Kind	Payload	Denotes
`Alias`	`AliasInfo`	Type alias (resolved via `World::resolve_alias`).
`Reference`	`SymbolReference`	Name reference (class, alias, parameter, etc.).
`MemberReference`	`MemberReference`	Class-member type (`Foo::T`).
`GlobalReference`	`GlobalReference`	Global type variable.
`Conditional`	`ConditionalInfo`	`T extends U ? X : Y`.
`Derived`	`DerivedInfo`	`KeyOf<T>`, `ValueOf<T>`, `IndexAccess<T, K>`, etc.
`Variable`	`VariableInfo`	Analyser-introduced inference placeholder.

See unresolved elements and expand.

How to enumerate

use suffete::ElementKind;

// Every variant in declaration order.
for k in ElementKind::iter() {
    println!("{:?}", k);
}

The total count of variants is the size of core::mem::variant_count::<ElementKind>(). The maximum tag value is one less than the count.

Constructing each kind

For trivial kinds, use the prelude constants. For payload-bearing kinds, use the ElementId constructors when available, or interner().intern_* directly:

use suffete::{ElementId, prelude::INT};
use suffete::interner::interner;
use suffete::element::payload::*;
use mago_atom::atom;

// Trivial.
let _ = suffete::prelude::NULL;
let _ = INT;

// Convenience constructors.
let _ = ElementId::int_literal(7);
let _ = ElementId::string_literal("hello");
let _ = ElementId::named_object(atom("Foo"));
let _ = ElementId::named_object_with_args(atom("Box"), &[suffete::prelude::TYPE_INT]);
let _ = ElementId::enum_case(atom("Status"), atom("Active"));

// Direct interner.
let info = ResourceInfo { state: ResourceState::Open, kind: Some(atom("curl")) };
let _ = interner().intern_resource(info);

The full list of constructors is in src/element/id.rs; the per-kind interner methods are generated by the element_arena_methods! macro in src/interner/store.rs.

See also: Elements: the indivisible types for the conceptual overview; The ElementId tag layout for the bit layout; Prelude constants for the full prelude.

Lattice options and reports

The lattice operations take two extra parameters beyond the types and the world: LatticeOptions and a &mut LatticeReport. This chapter documents both.

`LatticeOptions`

The configuration knobs for a single lattice query. All fields default to "strict reading"; the analyser sets them based on the file's strictness mode and any per-call overrides.

use suffete::lattice::LatticeOptions;

let opts = LatticeOptions::default();
let opts = opts.with_php_runtime_coerce(false);  // strict mode
let opts = opts.with_ignore_null(true);          // ignore null in unions
let opts = opts.with_ignore_false(true);         // ignore false in unions

The full set of options:

Field	Default	Effect
`php_runtime_coerce`	`true`	Admit PHP runtime coercion edges (`int → float`, `numeric-string → int/float`). Records `CoercionCauses::PHP_RUNTIME_COERCE` on the report when fired. Set to `false` for strict-types mode.
`ignore_null`	`false`	When checking `refines(τ, σ)`, drop `null` from `τ` before the per-Element check. Used by analyser code that has separately verified non-nullability.
`ignore_false`	`false`	Same for `false`. Used in nullable-via-false patterns.
`merge_list_element_types`	`true`	In join, merge sealed-list element types where possible.
`merge_keyed_array_params`	`true`	In join, merge keyed-array key/value parameters where possible.
`int_literal_collapse_threshold`	(small)	The number of distinct int literals at which join collapses to `int` or a range.
`string_literal_collapse_threshold`	(small)	Same for string literals.

The with_* methods return a new LatticeOptions ; the type is Copy, so chaining is cheap.

`LatticeReport`

A buffer the lattice writes into during a query. It carries:

Coercion causes ; a bitset of which special rules fired, suitable for the analyser to surface in a diagnostic.

use suffete::lattice::{LatticeOptions, LatticeReport, CoercionCauses};

let mut report = LatticeReport::new();
// ... call lattice operations, passing &mut report ...

if report.causes.contains(CoercionCauses::PHP_RUNTIME_COERCE) {
    // The check passed but used a runtime-coercion edge.
    // Surface a warning if the analyser's policy requires.
}

if report.causes.contains(CoercionCauses::TEMPLATE_DEFAULT) {
    // The check passed but used a default-filled template parameter.
}

`CoercionCauses`

A bitflags type:

Cause	Meaning
`PHP_RUNTIME_COERCE`	A PHP-runtime coercion edge was used (e.g. `int → float`).
`TEMPLATE_DEFAULT`	A default-filled template parameter was tolerated.
`IGNORE_NULL`	The query ran with `null` ignored.
`IGNORE_FALSE`	The query ran with `false` ignored.

The bits are non-zero when the corresponding rule contributed to the answer. The analyser reads the bitset after the query and decides what to do.

Reusing reports

LatticeReport::new() is cheap (just a zeroed bitset). For analyser hot loops, reuse one instance:

let mut report = LatticeReport::new();
for query in queries {
    report.reset();   // clear the bitset
    let _ = lattice::refines(query.a, query.b, &world, opts, &mut report);
    // ... handle the result and the report ...
}

reset clears the bitset in O(1).

`FlowFlags`

A 16-bit bitset that rides on every TypeId. Stored in the flags field of the handle's u64 representation.

use suffete::FlowFlags;

let f = FlowFlags::EMPTY;
let f = f.with_from_template_default(true);
let f = f.with_isset_from_loop(true);
let f = f.with_explicitly_nullable(true);

The full set of flags:

Flag	Meaning
`from_template_default`	This type-arg was filled with the parameter's upper bound rather than the user's value. The variance check tolerates it (recording `CoercionCauses::TEMPLATE_DEFAULT`).
`isset_from_loop`	This value flowed through a loop body. Used by the analyser's flow-typing to track variables introduced inside a loop.
`explicitly_nullable`	The type was constructed as nullable via `?T` syntax (rather than as a separate `T \| null` union). Carried for diagnostic precision.

The flags do not affect lattice operations directly ; they are metadata the analyser reads out via TypeId::flags(). The exception is from_template_default, which the lattice's variance check consults.

A worked example

use suffete::{TypeBuilder, prelude::{INT, FLOAT}};
use suffete::{lattice::{self, LatticeOptions, LatticeReport, CoercionCauses}, world::NullWorld};

let int_t = TypeBuilder::new().push(INT).build();
let float_t = TypeBuilder::new().push(FLOAT).build();

let world = NullWorld;
let mut report = LatticeReport::new();

// Strict mode: int does not refine float.
let opts_strict = LatticeOptions::default().with_php_runtime_coerce(false);
assert!(!lattice::refines(int_t, float_t, &world, opts_strict, &mut report));

// Non-strict mode: int coerces.
report.reset();
let opts_loose = LatticeOptions::default();  // php_runtime_coerce = true
assert!(lattice::refines(int_t, float_t, &world, opts_loose, &mut report));
assert!(report.causes.contains(CoercionCauses::PHP_RUNTIME_COERCE));

The same int <: float query, two different opts, two different answers. The report tells the analyser why the loose answer is loose.

See also: refines, meet, join, subtract, narrow, overlaps for the operations that consume these.

Prelude constants

The suffete::prelude module exposes well-known constants. Importing it gives you the common-case Elements and Types without going through the interner.

use suffete::prelude::*;

Elements (`ElementId`)

Landmarks

Constant	Element	Notes
`NEVER`	`Never`	$\bot$.
`MIXED`	`Mixed` (vanilla)	$\top$.
`NULL`	`Null`	The value `null`.
`VOID`	`Void`	The PHP `void`.
`PLACEHOLDER`	`Placeholder`	Inference hole.

Booleans

Constant	Element
`TRUE`	`True`
`FALSE`	`False`
`BOOL`	`Bool`

Scalars (unrefined)

Constant	Element
`INT`	`Int` (unspecified)
`FLOAT`	`Float` (unspecified)
`STRING`	`String` (unspecified, no flags)
`NUMERIC`	`Numeric` (true union)
`SCALAR`	`Scalar` (true union)
`ARRAY_KEY`	`ArrayKey` (true union)

Constant	Element
`NUMERIC_STRING`	`String` (unspecified, with `is_numeric=true`)
`NON_EMPTY_STRING`	`String` (unspecified, with `is_non_empty=true`)
`EMPTY_STRING`	`String` (literal `""`)
`INT_ZERO`	`Int` (literal `0`)

Object family

Constant	Element
`OBJECT_ANY`	`ObjectAny`

Iterable / callable

Constant	Element
`ITERABLE_MIXED_MIXED`	`Iterable<mixed, mixed>`
`CALLABLE`	`Callable` (bare form, no signature)

Resources

Constant	Element
`RESOURCE`	`Resource` (unrefined)
`OPEN_RESOURCE`	`Resource` (state = Open, no kind)
`CLOSED_RESOURCE`	`Resource` (state = Closed, no kind)

Arrays

Constant	Element
`EMPTY_ARRAY`	`array{}` (the empty sealed shape)

Types (`TypeId`)

Each TYPE_* constant is the singleton type wrapping the corresponding Element. Use these when you need a TypeId for a single Element ; it saves the construction call.

Constant	Type	Equivalent
`TYPE_NEVER`	`never`	`TypeBuilder::new().push(NEVER).build()`
`TYPE_MIXED`	`mixed`	`TypeBuilder::new().push(MIXED).build()`
`TYPE_NULL`	`null`	...
`TYPE_VOID`	`void`	...
`TYPE_TRUE`	`true`	...
`TYPE_FALSE`	`false`	...
`TYPE_BOOL`	`bool`	...
`TYPE_INT`	`int`	...
`TYPE_FLOAT`	`float`	...
`TYPE_STRING`	`string`	...
`TYPE_NUMERIC`	`numeric`	...
`TYPE_SCALAR`	`scalar`	...
`TYPE_ARRAY_KEY`	`array-key`	...
`TYPE_NUMERIC_STRING`	`numeric-string`	...
`TYPE_NON_EMPTY_STRING`	`non-empty-string`	...
`TYPE_OBJECT_ANY`	`object`	...
`TYPE_RESOURCE`	`resource`	...
`TYPE_OPEN_RESOURCE`	`open-resource`	...
`TYPE_CLOSED_RESOURCE`	`closed-resource`	...
`TYPE_EMPTY_ARRAY`	`array{}`	...

The TYPE_* constants are computed at boot time and exposed as consts. Comparing t == TYPE_INT is a single u64 compare ; faster than constructing a TypeBuilder and calling build.

When to use the prelude

For the trivial cases:

use suffete::prelude::*;

// Use the Element constant when constructing a union:
let t = TypeBuilder::new().push(INT).push(STRING).build();

// Use the Type constant when you need a TypeId directly:
let int_only: TypeId = TYPE_INT;

// Use the Type constant in operations:
let result = lattice::refines(some_t, TYPE_MIXED, &world, opts, &mut report);

For non-trivial Elements (named objects, sealed shapes, callable signatures), use the ElementId constructors or the interner directly. The prelude only covers the well-known singletons.

Naming conventions

UPPER_SNAKE_CASE for ElementId constants.
TYPE_UPPER_SNAKE_CASE for TypeId constants.
The element name matches the PHP-side name, lowercased and snake-cased: INT, STRING, OBJECT_ANY, EMPTY_ARRAY, NUMERIC_STRING.

Adding to the prelude

The prelude is the place for frequently-needed constants. New entries should:

Have a clearly-defined PHP-side meaning (so the name is unambiguous).
Be a singleton (one canonical Element / Type per name).
Get used often enough that an analyser shouldn't have to construct it inline.

Refinement variants that can be expressed as flag combinations on the basic kinds (e.g. truthy-string) are typically not in the prelude ; they are constructed via the interner with a StringInfo carrying the flags.

See also: Constructing types: TypeBuilder and prelude for the broader construction API; Element kinds for the full kind list.

Contributing

The book lives in book/ of the suffete repository. It's an mdbook with mathjax and mermaid enabled. Contributions are welcome — chapter improvements, fixes for incorrect descriptions, new cookbook recipes, missing diagrams.

Building locally

Install mdbook and the mermaid preprocessor:

cargo install mdbook
cargo install mdbook-mermaid

Then, from the repo root:

cd book
mdbook-mermaid install .   # one-time, copies mermaid assets into theme/
mdbook serve --open        # builds and watches; opens in browser

The book rebuilds on file changes; the browser auto-reloads.

Style

Match the existing chapters:

Dense but readable. No filler. Short paragraphs. Real examples in every chapter.
Math via mathjax. $\tau \mathrel{<:} \sigma$ , $\tau \sqcup \sigma$ , etc.
Diagrams via mermaid when they clarify (not as decoration).
Code samples are rust,ignore ; the book is not a doctest harness.
PHP samples are php ; PHP is not compiled by mdbook either, but the syntax highlighting is helpful.
"Element" not "atom" in code-related text. The atom note is in the introduction and glossary.
No emojis anywhere ; book or commit messages.
No em-dashes anywhere.
Cross-link to other chapters when a concept comes up that has its own treatment.
Every chapter ends with a "see also" footer.

Naming chapter files

Chapter files live in book/src/<part>/<name>.md. The file name is kebab-case. The SUMMARY.md references each chapter by its full path.

When you add a chapter, register it in SUMMARY.md ; otherwise it doesn't render.

Verifying

Before submitting:

cd book
mdbook build       # checks links, mermaid, mathjax
mdbook test        # smoke-tests code samples (rust,ignore is skipped, but the parser runs)

Open book/book/index.html in your browser and click through every page to check that math and diagrams render. CI does the same on every PR.

Where to file issues

Issues that affect the book go to the suffete repository with a book label. PRs touching book/ trigger the book CI workflow.

The Suffete Book