ADR 0023 - CorePure Expression Surface

Defines the closed Nix-like CorePure expression language, pipe sugar, string interpolation, and data-last stdlib rules.


On this page
  1. Status
  2. Context
  3. Decision
  4. Duplicate Names
  5. Pipe Sugar
  6. Data-Last Stdlib
  7. String Interpolation
  8. Layer Boundary
  9. Worked example
  10. Alternatives considered
  11. Consequences
  12. Positive
  13. Negative
  14. Obligations
  15. Related

ADR 0023 - CorePure Expression Surface

Status

Proposed - this ADR defines the initial CorePure authoring language for both elaboration-time reduction and delayed runtime output execution.

Context

Pure output equations need an expression language that is legible enough for real authoring but small enough to preserve deterministic replay, budget accounting, and proof obligations. Reusing a host language such as Haskell, JavaScript, or Nix would buy expressiveness but would also import IO, large libraries, evaluation complexity, or implementation-defined behavior.

The desired surface is Nix-like because Nix gives Cortex useful authoring conventions: attrsets, non-recursive let ... in, lambdas written as x: ..., and ergonomic string templates. CorePure is not Nix. It is a closed expression language with a typed evaluator and deterministic failure modes.

Decision

CorePure should use a closed Nix-like expression surface. The same AST and evaluator are used by the elaborator from ADR 0021 and by delayed runtime output equations from ADR 0050.

The initial language includes:

  • literals: numbers, strings, booleans, and null;
  • records / attrsets;
  • lists;
  • field access, such as item.score;
  • indexed access, such as items[0];
  • non-recursive let ... in;
  • if ... then ... else ...;
  • lambdas, written as x: ...;
  • curried application and partial application;
  • arithmetic, comparison, and boolean operators.

The initial language does not include tuple syntax. Pair-like values are represented as records.

Evaluation failures are typed and deterministic. The required failure set includes missing fields, type mismatches, divide by zero, bad indexes, arity mismatches, and budget exhaustion.

CorePure explicitly disallows:

  • recursion and fix;
  • imports, modules, filesystem access, or package access;
  • IO, time, random values, model calls, tool calls, memory queries, durable-state reads, or host callbacks;
  • unbounded fold;
  • sort with user comparators;
  • non-typed exceptions.

fold can be reconsidered only after budget accounting is solid enough to state its cost model.

Duplicate Names

CorePure rejects duplicate names at parse/elaboration time across every authoring surface that introduces names. No such surface uses silent right-biased or left-biased override. The only place in CorePure where one record’s keys may override another’s is the explicit record merge operator //, which is unaffected by this rule.

The duplicate-name rule applies to:

  • record literals;
  • let bindings within one scope;
  • lambda parameters within one lambda;
  • output equations within one node.

Within a record literal, two field declarations conflict when their paths are equal or when one path is a strict prefix of the other. The following are parse errors:

{ a = 1 ; a = 2 ; }                        -- equal paths
{ a.b = 1 ; a.b = 2 ; }                    -- equal paths
{ a = 1 ; a.b = 2 ; }                      -- `a` is a strict prefix of `a.b`
{ a.b = 1 ; a = { b = 2 ; } ; }            -- `a` is a strict prefix of `a.b`

The following is permitted because neither path is a prefix of the other; the distinct leaves combine into one nested record:

{ a.b = 1 ; a.c = 2 ; }                    -- equivalent to { a = { b = 1 ; c = 2 ; } ; }

The runtime evaluator never observes duplicate names within one scope, because the parser and elaborator reject them upstream. Shadowing across distinct nested scopes is unaffected: an inner let binding with the same name as an outer one shadows the outer per ADR 0023’s non-recursive scoping rules.

The decision follows Nix’s attrset semantics. Nix rejects { a = 1; a = 2; } and let x = 1; x = 2; in x at parse time regardless of recursive scope, and Nix’s // operator performs explicit right-biased override on already-constructed attrsets. CorePure adopts the same split for the same reason: a silent in-literal override would let one of two visibly distinct authoring intents win without telling the reader, while explicit // makes the override visible at the call site.

Authors who want override behavior write defaults // { field = newValue ; } rather than declaring field twice in one record literal.

Pipe Sugar

|> is the canonical function-composition operator inside CorePure expressions. It is left-associative and low precedence: above binding / assignment forms and below arithmetic, comparison, and boolean operators.

The parser desugars:

lhs |> rhs

to:

rhs lhs

Pipes do not add a CorePure AST node and do not change evaluator semantics. For example:

xs |> filter pred |> map f |> sum

desugars to:

sum (map f (filter pred xs))

Data-Last Stdlib

Stdlib functions intended for sequence processing must take their primary data argument last. This is a CorePure stdlib invariant, not a recommendation.

The initial closed stdlib is:

map       : (T -> U) -> List T -> List U
filter    : (T -> Bool) -> List T -> List T
zip       : List U -> List T -> List { fst: T; snd: U }
zipWith   : (T -> U -> V) -> List U -> List T -> List V
length    : List T -> Number
sum       : List Number -> Number
all       : (T -> Bool) -> List T -> Bool
any       : (T -> Bool) -> List T -> Bool
min       : Number -> Number -> Number
max       : Number -> Number -> Number
abs       : Number -> Number
clamp     : Number -> Number -> Number -> Number
concat    : List String -> String
toString  : Scalar -> String
joinWith  : String -> List String -> String

Scalar is the closed set String | Number | Bool. toString is identity on strings and deterministic on numbers and booleans.

For zip and zipWith, the piped list is the rightmost operand:

scores |> zipWith (score: weight: score * weight) weights

zip returns records because tuples are not part of the first CorePure surface. For scores |> zip weights, each result element has fst = score and snd = weight.

clamp is ordered as clamp lo hi value, preserving data-last use:

score |> clamp 0 1

Line joining is written with joinWith "\n" rather than a separate joinLines primitive. That keeps the initial closed stdlib smaller while preserving the common prompt/report use case.

Future stdlib additions must preserve data-last ordering. User-defined functions should follow the same convention when intended for pipe use, but the evaluator does not enforce this for user code.

String Interpolation

CorePure strings support Nix-style interpolation:

let name = "Julius" ;
let greeting = "Hello, ${name}!" ;

The parser desugars interpolation into ordinary string concatenation:

"foo${expr}bar"

becomes equivalent to:

concat ["foo", toString expr, "bar"]

Interpolation is parser sugar. It does not add a CorePure AST node and it has no distinct evaluator semantics.

Interpolation auto-stringifies only scalar values:

  • String, Number, and Bool interpolate directly through toString;
  • records, lists, null, and lambdas are type errors unless explicitly converted by a later serialization function.

This gives prompt and report authors useful ergonomics without committing Cortex to a generic Show mechanism for structured values.

Double-quoted strings support only these escapes in the first slice:

  • \n, \t, and \r for newline, tab, and carriage return;
  • \\ for a literal backslash;
  • \" for a literal double quote;
  • \${ for a literal ${ without starting interpolation.

Indented strings use Nix-style ''...'' syntax. The parser strips the common leading whitespace from all non-blank lines and ignores the first line if it is whitespace-only after the opening delimiter. This allows prompt templates to align with surrounding Wire source without carrying that indentation into the value.

Physical newlines inside an indented string are preserved after indentation stripping. A final newline before the closing delimiter is preserved if it remains after the delimiter line is removed.

Indented strings support only these escapes in the first slice:

  • ''${ for a literal ${ without starting interpolation;
  • ''' for a literal '';
  • ''\n, ''\t, and ''\r for newline, tab, and carriage return.

All other backslashes and single quotes inside indented strings are literal characters.

Layer Boundary

Interpolation is a CorePure feature, not a topology feature. A string such as "${node.output}" does not read a circuit port at elaboration time. Inside an input-dependent output equation, input ports are bound to runtime values, so interpolation works normally over those concrete values.

Worked example

let scoreThreshold = 0.7 ;

node classify
  <- evidence: EvidenceSet ;
  -> accepted: AcceptedSet = accepted ;
  -> rejected: RejectedSet = rejected ;
  -> summary: Report = ''
    Classification complete.
    Accepted: ${length accepted} items
    Rejected: ${length rejected} items
    Threshold: ${scoreThreshold}
  '' ;
  where let
    items = evidence.items ;
    accepted = items |> filter (x: x.score >= scoreThreshold) ;
    rejected = items |> filter (x: x.score < scoreThreshold) ;
  in
  { accepted = accepted ; rejected = rejected ; } ;

The summary output is the primary reason interpolation belongs in the first slice: typed values can build text outputs without unreadable manual concat and toString calls.

Alternatives considered

  • Execute Haskell directly. Rejected because even “pure Haskell” imports a large language, partiality hazards, and a host compiler boundary. Allowing IO would turn Wire source into arbitrary host authority.
  • Embed Nix itself. Rejected because Nix is dynamically typed, has its own import and path semantics, and would make Cortex depend on Nix language behavior rather than a small typed evaluator.
  • Require manual string concatenation. Rejected because text outputs and prompt fragments are a core use case. Without interpolation, real examples become noisy immediately.
  • Stringify every value. Rejected because records and lists need explicit serialization rules. Accidental structural stringification would become observable semantics.
  • Silently right-merge or left-merge duplicate field declarations in record literals. Rejected because it lets one of two visibly distinct authoring intents win without informing the reader, diverges from Nix’s attrset semantics, and diverges from CorePure’s existing duplicate-name discipline for let bindings, lambda parameters, and output equations. Override remains available through the explicit // operator.

Consequences

Positive

  • CorePure is expressive enough for scoring, filtering, summaries, and prompt/report fragments.
  • Pipes improve readability without changing the evaluator or proof surface.
  • Data-last stdlib ordering makes pipeline style predictable.
  • Interpolation is deterministic parser sugar over ordinary string primitives.

Negative

  • The parser must handle Nix-like strings and indentation normalization.
  • The evaluator needs first-class lambdas and partial application.
  • The stdlib must remain deliberately small; useful functions will need admission discipline.

Obligations

  • Desugar pipes and interpolation before CorePure AST construction.
  • Add evaluator tests for every typed failure mode.
  • Add budget tests for list operations and string concatenation.
  • Add parser tests for double-quoted strings, indented strings, interpolation, and escapes.
  • Keep toString constrained to scalar values until explicit structured serialization is designed.
  • Reject duplicate field declarations in record literals at parse time, where two declarations conflict when their paths are equal or when one path is a strict prefix of the other. This matches the duplicate-name discipline already enforced for let bindings, lambda parameters, and output equations.