Building a union from scratch
Three ways to construct a union type, depending on what the analyser is doing.
Method 1: literal construction with TypeBuilder
For when the analyser knows the elements ahead of time (a parsed PHP type expression, a hand-coded fixture):
use suffete::{TypeBuilder, prelude::{INT, STRING, NULL}};
let int_or_string_or_null = TypeBuilder::new()
.push(INT)
.push(STRING)
.push(NULL)
.build();
build is structural: it sorts and dedups but does not apply join's canonicalisation rules (no range merging, no literal collapse, no subtype absorption).
If the analyser wants those rules applied:
let canonical = TypeBuilder::new()
.push(INT)
.push(STRING)
.push(NULL)
.build_canonical();
build_canonical runs the join's canonicalisation pass: subsumption, range merging, literal collapse, etc. Use it when the analyser wants the smallest representation.
Method 2: incremental, via join::compute
For when the analyser is producing a union from a set of computed values (e.g. the union of a switch's case values, or the join of a control-flow merge):
use suffete::{TypeId, join};
fn join_all(types: &[TypeId]) -> TypeId {
let mut elements = Vec::new();
for &t in types {
elements.extend_from_slice(t.as_ref().elements);
}
TypeId::union(&join::compute(&elements))
}
join::compute takes the combined element multiset and returns the canonical element list. Wrapping it in TypeId::union interns the result. The result is the canonical union of every input.
The join::compute pass applies the canonicalisation rules: range merging, literal collapse, etc.
Method 3: from_type for incremental modification
For when the analyser already has a union and wants to push one or two more elements:
use suffete::{TypeBuilder, prelude::NULL};
let nullable = TypeBuilder::from_type(existing_type)
.push(NULL)
.build();
from_type enables the origin short-circuit: if the buffer is unchanged after the mutations, build() returns the original TypeId without re-interning. Useful when the mutations might be no-ops.
Comparison of the three methods
| Method | Use when | Canonicalisation | Cost |
|---|---|---|---|
TypeBuilder::push * / build() | Constructing from known elements | Structural only (sort, dedup) | Sort + intern lookup |
TypeBuilder::push * / build_canonical() | Constructing from known elements, want canonical form | Full canonicalisation | Sort + canonicalise + intern |
join::compute over collected elements | Folding a sequence | Full canonicalisation | 1 intern + 1 canonicalise |
TypeBuilder::from_type / push / build() | Adding to an existing type | Structural only | Origin check + sort + intern |
For most analyser code, build (or build_canonical if you need the canonical form) is the right choice. The join::compute pattern is for pipelines where each input is computed lazily.
Worked example: union of switch case types
function f(int $kind): string {
return match ($kind) {
1, 2 => 'a',
3 => 'b',
default => 'c',
};
}
The analyser computes the type of each match-arm body and joins them to produce the function's return type:
let arm1 = ...; // type of 'a' = literal "a"
let arm2 = ...; // type of 'b' = literal "b"
let arm3 = ...; // type of 'c' = literal "c"
let return_t = join_all(&[arm1, arm2, arm3]);
// return_t == "a"|"b"|"c" (or, after literal collapse threshold, just `string`)
If the analyser wants to preserve the literals (no collapse), use TypeBuilder::push * / build. If the analyser wants the canonical form (let the lattice decide whether to collapse), use join::compute (which uses the lattice's literal-collapse threshold) or build_canonical.
Worked example: making a type nullable
use suffete::{TypeBuilder, prelude::NULL, predicates::contains_null};
fn make_nullable(t: TypeId) -> TypeId {
if contains_null(t) {
return t; // already nullable
}
TypeBuilder::from_type(t).push(NULL).build()
}
The contains_null check avoids the unnecessary intern lookup for already-nullable inputs. The from_type ensures the origin short-circuit fires when (paradoxically) the type was already nullable but we hit this code anyway.
Worked example: union of class names
The analyser has a list of classes the user declared in a docblock and wants to construct the type:
use suffete::{TypeBuilder, ElementId};
use mago_word::Word;
fn classes_to_union(names: &[Word]) -> TypeId {
let mut b = TypeBuilder::new();
for &name in names {
b.push(ElementId::object_named(name.as_bytes()));
}
b.build()
}
let t = classes_to_union(&[
mago_word::word(b"Foo"),
mago_word::word(b"Bar"),
mago_word::word(b"Baz"),
]);
// t == Foo | Bar | Baz
If two of the classes are related by inheritance, build_canonical (or a separate join::compute pass) would absorb the descendant ; build keeps them distinct.
Performance notes
TypeBuilder::pushis O(1).buildis O(n log n) for the sort plus the interner cost. The interner has a fast path that detects already-sorted-and-unique input via SIMD.build_canonicalis O(n²) worst case (subsumption is pairwise), but typical inputs are small.join::computeis the canonicalisation cost over the collected element multiset, O(N).
For analyser hot loops constructing many unions, consider:
- Reusing one
TypeBuilderinstance across iterations (clear and re-push). - Pre-sorting if you know the input is canonical, so the interner's fast path fires.
- Choosing
buildoverbuild_canonicalif you don't need the canonical form ; the latter is strictly more expensive.
See also: TypeBuilder for the construction API; join for the canonicalisation rules; Predicates for the
contains_nulland friends used as fast pre-checks.