regalloc2

Module checker

source
Expand description

Checker: verifies that spills/reloads/moves retain equivalent dataflow to original, VReg-based code.

The basic idea is that we track symbolic values as they flow through spills and reloads. The symbolic values represent particular virtual registers in the original function body presented to the register allocator. Any instruction in the original function body (i.e., not added by the allocator) conceptually generates a symbolic value “Vn” when storing to (or modifying) a virtual register.

A symbolic value is logically a set of virtual registers, representing all virtual registers equal to the value in the given storage slot at a given program point. This representation (as opposed to tracking just one virtual register) is necessary because the regalloc may implement moves in the source program (via move instructions or blockparam assignments on edges) in “intelligent” ways, taking advantage of values that are already in the right place, so we need to know all names for a value.

These symbolic values are precise but partial: in other words, if a physical register is described as containing a virtual register at a program point, it must actually contain the value of this register (modulo any analysis bugs); but it may describe fewer virtual registers even in cases where one could statically prove that it contains a certain register, because the analysis is not perfectly path-sensitive or value-sensitive. However, all assignments produced by our register allocator should be analyzed fully precisely. (This last point is important and bears repeating: we only need to verify the programs that we produce, not arbitrary programs.)

Operand constraints (fixed register, register, any) are also checked at each operand.

§Formal Definition

The analysis lattice consists of the elements of 𝒫(V), the powerset (set of all subsets) of V (the set of all virtual registers). The ⊤ (top) value in the lattice is V itself, and the ⊥ (bottom) value in the lattice is ∅ (the empty set). The lattice ordering relation is the subset relation: S ≤ U iff S ⊆ U. These definitions imply that the lattice meet-function (greatest lower bound) is set-intersection.

(For efficiency, we represent ⊤ not by actually listing out all virtual registers, but by representing a special “top” value, but the semantics are the same.)

The dataflow analysis state at each program point (each point before or after an instruction) is:

  • map of: Allocation -> lattice value

And the transfer functions for instructions are (where A is the above map from allocated physical registers to lattice values):

  • Edit::Move inserted by RA: [ alloc_d := alloc_s ]

    A’ = A[alloc_d → A[alloc_s]]

  • statement in pre-regalloc function [ V_i := op V_j, V_k, … ] with allocated form [ A_i := op A_j, A_k, … ]

    A’ = { A_k → A[A_k] \ { V_i } for k ≠ i } ∪ { A_i -> { V_i } }

    In other words, a statement, even after allocation, generates a symbol that corresponds to its original virtual-register def. Simultaneously, that same virtual register symbol is removed from all other allocs: they no longer carry the current value.

  • Parallel moves or blockparam-assignments in original program [ V_d1 := V_s1, V_d2 := V_s2, … ]

    A’ = { A_k → subst(A[A_k]) for all k } where subst(S) removes symbols for overwritten virtual registers (V_d1 .. V_dn) and then adds V_di whenever V_si appeared prior to the removals.

To check correctness, we first find the dataflow fixpoint with the above lattice and transfer/meet functions. Then, at each op, we examine the dataflow solution at the preceding program point, and check that the allocation for each op arg (input/use) contains the symbol corresponding to the original virtual register specified for this arg.

Structs§

Enums§

  • A single error detected by the regalloc checker.