Inference: standin and infer
When the user calls a generic function or method without supplying explicit type arguments — id($x) rather than id<int>($x) — the analyser must infer the type arguments from the call-site arguments. Two primitives do the work:
- standin — a single inference round: walk a parameter type against an argument type, and accumulate the bounds the argument imposes on each free template parameter.
- infer — apply a fully-determined template environment to a type.
standin: one round of inference
standin is one round — one parameter, one argument, one walk. The analyser typically runs standin once per call-site argument, against the matching parameter type, before reading the accumulated environment.
The standin environment
The environment collects two kinds of bound for each template parameter:
- Lower bounds ($\sigma \mathrel{<:} T$): types the argument provided that the parameter must accept.
- Upper bounds ($T \mathrel{<:} \sigma$): types the parameter must produce that the argument is forced to expose.
The "upper" / "lower" terminology comes from the lattice direction:
graph BT
Lower["argument-side type<br/>(lower bound on T)"] --> T["T (the parameter)"]
T --> Upper["constraint type<br/>(upper bound on T)"]
A simple call id<T> invoked with id(7) accumulates a lower bound int <: T (the argument forces T to be at least int).
How standin walks
Like substitution, standin is a structural recursion. It walks the parameter type and the argument type together, in lockstep, and at each leaf — every template-parameter Element on the parameter side — it records a bound.
The recursion respects the same nesting substitution does: object type-args, list element types, callable signatures, intersections, conditionals, and so on.
When the parameter and argument structures don't align (a parameter expecting array<K, V> matched against an argument of type int), the walker silently fails to record bounds — there is no way to derive a bound from a structural mismatch. The analyser sees an empty (or partial) environment and reports a type error in its own layer.
infer: applying a determined environment
After all standin rounds complete, the analyser has an environment containing every bound. To produce the inferred type arguments, it:
- For each parameter
Twith one or more lower bounds, joins them:T := lo_1 ⊔ lo_2 ⊔ … ⊔ lo_n. - Verifies the inferred
Tagainst any upper bounds: eachT <: hi_imust hold. - Verifies the inferred
Tagainst the parameter's declared upper bound (the constraint). - Substitutes the inferred values back into the function's return type.
infer is the convenience that walks a type and substitutes every template-parameter Element using the joined lower bounds in the environment. Parameters with no lower bound are filled with their declared upper bound (or mixed).
Worked example: identity function
/**
* @template T
* @param T $x
* @return T
*/
function id($x) { return $x; }
id(7); // T inferred to int(7) (or generalised to int).
id("hello"); // T inferred to literal "hello" (or generalised to string).
id($foo); // where $foo: Foo. T inferred to Foo.
For id(7):
- The parameter type is
T; the argument type isint(7). standinwalks them together; the parameter side is theTElement, so the environment recordsT → lower bound = int(7).infersubstitutes the function's return type (T) with the joined lower bound, producingint(7).
Worked example: collection
/**
* @template T
* @param array<T> $xs
* @return T
*/
function head(array $xs) { /* ... */ }
head([1, 2, 3]); // T → int<1,3>
head(['a', 'b']); // T → 'a'|'b' (or generalised to string)
head([new Foo(), new Bar()]); // T → Foo|Bar
For head([1, 2, 3]):
- The parameter type is
array<T>; the argument type isarray<int, int<1,3>>(the literal array's inferred type). standinrecurses into the array's value parameter: parameter side isT, argument side isint<1, 3>.- The environment records
T → lower bound = int<1, 3>. infersubstitutes the return typeTwithint<1, 3>.
Worked example: multiple arguments
/**
* @template T
* @param T $a
* @param T $b
* @return array{0: T, 1: T}
*/
function pair($a, $b): array { /* ... */ }
pair(1, "hello"); // T → int(1)|"hello" (generalised to int|string)
The analyser runs standin once per argument:
- First argument: parameter
T, argumentint(1). Environment:T → [int(1)]. - Second argument: parameter
T, argument"hello". Environment:T → [int(1), "hello"].
infer joins the lower bounds, producing int(1) | "hello".
Variance and standin
Variance affects which side of the lattice direction standin records the bound on:
- Covariant position: argument is a lower bound on
T. - Contravariant position: argument is an upper bound on
T. - Invariant position: argument is both a lower and upper bound (must be exact match).
flowchart LR
Cov["param: Iterator<T> (T covariant)<br/>arg: Iterator<int>"] --> CovBound["T ≥ int (lower bound)"]
Contra["param: Sink<T> (T contravariant)<br/>arg: Sink<int>"] --> ContraBound["T ≤ int (upper bound)"]
Inv["param: Box<T> (T invariant)<br/>arg: Box<int>"] --> InvBound["T = int (both)"]
The standin walker queries the world for the variance of each parameter when descending into a parameterised type.
Why two operations
Conceptually, inference is one operation: "given parameter and arguments, return the inferred environment". The split into standin (per-argument) and infer (final synthesis) lets the analyser:
- Run multiple standin rounds across multiple arguments.
- Inject extra bounds from other sources (e.g. a hand-written
@template T of inton the call site). - Verify intermediate states (e.g. abort inference early if a contradiction is found).
- Apply the result to multiple types (the return type, the throws clause, derived types).
A subtle case: variables in arguments
Suppose the analyser is inferring inside a generic function body:
/**
* @template T
* @param T $x
* @return T
*/
function id($x) { return $x; }
/**
* @template U
* @param U $y
* @return U
*/
function wrap($y) { return id($y); }
For wrap's body call id($y):
- The argument is
$yof typeU(a template-parameter Element). - The parameter is
id::T(a different template-parameter Element). standinrecords:id::T → lower bound = U.
The lower bound is itself a template-parameter Element (since U is still free in wrap's body). The inferred return type for id($y) is U. When wrap is later invoked with a concrete argument, the chain resolves.
See also: substitute for the substitution operation
inferuses internally; specialise for inheritance-binding resolution; variance for how variance is consulted during walking.