ADR 0047: Return Type Arrow Token Disambiguation

Status

Implemented (2026-03-06) — TokenKind::Arrow added in PR #1190 (BT-1141)

Context

Beamtalk uses -> as the return type separator in method signatures (introduced in ADR 0025):

balance -> Integer => self.balance
deposit: amount: Integer -> Integer => ...

The -> characters are currently tokenised as BinarySelector("->") — the same token kind used for any other binary operator. The parser distinguishes the return-type use from a binary message send via a lookahead function (is_return_type_then_fat_arrow) that checks whether -> Identifier => follows the current position.

This lookahead fails in at least two cases discovered during BT-1003. This ADR addresses the first:

Ambiguity 1 — -> as method selector

The Object.-> method (Association creation) cannot be annotated with its return type:

// Intended: binary method `->`, arg `value`, return type `Association`
-> value -> Association => @primitive "->"

// Parser sees: binary method `->` with TWO args — `value` and `Association`
// Codegen produces class_->/3 (self + value + Association); `Self` is unbound.

The root cause: the selector token -> and the return-type separator token -> are indistinguishable. The lookahead cannot determine which -> opens the return type.

Ambiguity 2 — class as method name vs. class-side modifier (out of scope)

A second ambiguity exists: the Class.class method (returns the metaclass) cannot be annotated because class is both a valid method name and the keyword introducing class-side method definitions. This is a deeper structural issue — Beamtalk's class modifier is a syntactic convenience that has no equivalent in Smalltalk (where (MyClass class) >> #myMethod is a plain message send, not modifier syntax). Addressing it properly requires reconsidering the class-side method definition syntax; that decision is the subject of ADR 0048.

Current workaround

Object.-> is left unannotated (return_type: None in generated_builtins.rs), which reduces return-type coverage for chain resolution in the REPL (ADR 0045 Phase 1).

Constraint

The -> syntax for return types is well-established in the codebase (~590 annotated methods across stdlib), well-precedented in comparable languages (Gleam, Rust, Swift), and is the right choice semantically. The fix must not change user-visible syntax.

Decision

Give -> its own TokenKind::Arrow variant in the lexer, separate from BinarySelector. In return-type position -> is structural punctuation in a method signature — no operands, no precedence — while in expression position (x -> y) it remains a binary message send. Classifying it as BinarySelector conflates these two roles; a dedicated token lets the parser resolve the role from position alone. The parser is then updated to handle Arrow in all positions where -> currently appears — both as a return-type separator and as a binary method selector — using the dedicated token to disambiguate.

Lexer change

The lexer emits TokenKind::Arrow whenever it sees the two-character sequence ->, instead of wrapping it in BinarySelector("->"). Arrow is never a BinarySelector.
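The lexer rule can be sketched in a few lines of Rust. This is a minimal illustration, not the code from lexer.rs: the TokenKind enum is cut down to the variants this ADR discusses, and the lex function, the is_operator_char helper, and the operator character set are all assumptions made for the example.

```rust
/// Cut-down token kinds; the real enum in token.rs has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Identifier(String),
    BinarySelector(String),
    Arrow,    // `->`
    FatArrow, // `=>`
}

/// Hypothetical operator alphabet, chosen only for this sketch.
fn is_operator_char(c: char) -> bool {
    "+-*/<>=~,".contains(c)
}

/// Tokenises identifiers and operator runs; anything else is skipped.
pub fn lex(src: &str) -> Vec<TokenKind> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            tokens.push(TokenKind::Identifier(chars[start..i].iter().collect()));
        } else if is_operator_char(c) {
            let start = i;
            while i < chars.len() && is_operator_char(chars[i]) {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            // The key rule: `->` gets its own kind and is never a BinarySelector.
            let kind = if text == "->" {
                TokenKind::Arrow
            } else if text == "=>" {
                TokenKind::FatArrow
            } else {
                TokenKind::BinarySelector(text)
            };
            tokens.push(kind);
        } else {
            // Whitespace and characters outside this sketch's alphabet are skipped.
            i += 1;
        }
    }
    tokens
}
```

Under these assumptions, lexing `-> value -> Association =>` yields two Arrow tokens, two identifiers, and a FatArrow, with no BinarySelector("->") anywhere in the stream.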

Parser change — return type (no change to user syntax)

parse_optional_return_type() matches TokenKind::Arrow instead of BinarySelector("->"). Behaviour is identical; only the internal token kind changes.
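A rough sketch of what the return-type check might look like once it matches on Arrow. The Parser struct, its fields, and the position helper are hypothetical stand-ins; only the function name parse_optional_return_type and the TokenKind variant names come from this ADR.

```rust
/// Cut-down token kinds; the real enum in token.rs has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Identifier(String),
    BinarySelector(String),
    Arrow,
    FatArrow,
}

/// Hypothetical cursor over a token slice, for illustration only.
pub struct Parser<'a> {
    tokens: &'a [TokenKind],
    pos: usize,
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [TokenKind]) -> Self {
        Parser { tokens, pos: 0 }
    }

    pub fn position(&self) -> usize {
        self.pos
    }

    /// Consumes `Arrow TypeName` when present and returns the type name.
    /// Leaves the cursor untouched when no annotation follows.
    pub fn parse_optional_return_type(&mut self) -> Option<String> {
        match (self.tokens.get(self.pos), self.tokens.get(self.pos + 1)) {
            // Previously this arm would have tested BinarySelector(s) with s == "->".
            (Some(TokenKind::Arrow), Some(TokenKind::Identifier(name))) => {
                let name = name.clone();
                self.pos += 2;
                Some(name)
            }
            _ => None,
        }
    }
}
```

The match no longer needs a string comparison or a guard, which is the only internal difference the sketch is meant to show.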

Parser change — binary method selector with Arrow

parse_method_selector() is extended to recognise TokenKind::Arrow as a valid binary method name. When the current token is Arrow the method selector is "->". After consuming Arrow and the argument name, parse_optional_return_type() is called as normal. The two consecutive Arrow tokens are now unambiguous:

Arrow  argName  Arrow  TypeName  FatArrow
  ↑ selector      ↑ return type

This correctly handles:

// Arrow selector, Arrow return-type separator — now unambiguous
-> value -> Association => @primitive "->"
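The selector-then-return-type sequence above can be sketched as a single function. This is an illustrative reduction under stated assumptions, not the code in declarations.rs: BinarySignature and parse_binary_signature are hypothetical names, and the real parser splits this work between parse_method_selector() and parse_optional_return_type().

```rust
/// Cut-down token kinds; the real enum in token.rs has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Identifier(String),
    BinarySelector(String),
    Arrow,
    FatArrow,
}

/// Hypothetical result type for a parsed binary method signature.
#[derive(Debug, PartialEq)]
pub struct BinarySignature {
    pub selector: String,
    pub param: String,
    pub return_type: Option<String>,
}

/// Parses `selector param [Arrow TypeName] FatArrow` from the front of `tokens`.
pub fn parse_binary_signature(tokens: &[TokenKind]) -> Option<BinarySignature> {
    let mut it = tokens.iter().peekable();

    // Selector position: Arrow is accepted as the binary method name "->".
    let selector = match it.next()? {
        TokenKind::Arrow => "->".to_string(),
        TokenKind::BinarySelector(s) => s.clone(),
        _ => return None,
    };

    // A binary method takes exactly one parameter.
    let param = match it.next()? {
        TokenKind::Identifier(name) => name.clone(),
        _ => return None,
    };

    // After the parameter, an Arrow can only open the return type.
    let return_type = if matches!(it.peek(), Some(TokenKind::Arrow)) {
        it.next();
        match it.next()? {
            TokenKind::Identifier(name) => Some(name.clone()),
            _ => return None,
        }
    } else {
        None
    };

    // The signature must end with `=>`.
    match it.next()? {
        TokenKind::FatArrow => Some(BinarySignature { selector, param, return_type }),
        _ => None,
    }
}
```

Feeding it the token stream for `-> value -> Association =>` gives selector "->", parameter "value", and return type Association; an unannotated `+ other =>` parses with no return type.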

Result

After this change, Object.-> can carry a return type annotation:

-> value -> Association => @primitive "->"      // Object — Association creation

Class.class remains unannotated pending ADR 0048. No user-visible syntax change. All existing annotations continue to work identically.

Prior Art

Gleam (BEAM language): Uses -> as the return type separator in function signatures: fn double(x: Int) -> Int. Gleam does not have user-defined binary operators, so -> cannot be a function name — no ambiguity possible.

Rust: Uses -> for function return types: fn abs(x: i32) -> i32. Operator overloading goes through the std::ops traits, which cover a fixed set of operators that does not include ->, so no ambiguity arises.

Newspeak: Uses angle-bracket type annotations <Type> in method signatures. No separator token that could double as a method selector.

Pharo/Squeak: No return type annotations in the core language. Type annotations are layered via pragmas (<return: #Integer>) or external tools — avoids the grammar problem entirely at the cost of out-of-band syntax. Notably, Pharo has no class modifier keyword: class-side methods are defined by sending class to the class object to navigate to its metaclass ((MyClass class) >> #myMethod), then defining a method on that metaclass via >>. class has one meaning throughout — a unary message returning the receiver's metaclass. The Ambiguity 2 collision is absent because Smalltalk has no modifier syntax at all.

Swift: Uses -> for return types. Operator overloading exists but -> is not overloadable — reserved entirely for the type position.

The common pattern in the languages surveyed that use -> is to keep it out of the set of user-definable operators, so it never competes with a function or method name. Beamtalk's Smalltalk heritage of arbitrary binary selectors is the unique complicating factor; the dedicated-token approach resolves it without abandoning either feature.

User Impact

Newcomer: No visible change. The -> syntax they learned from examples continues to work. The error messages they would previously receive from accidentally writing -> value -> Type are now gone — the code just works.

Smalltalk developer: The -> binary message (Association creation) is a familiar idiom. Keeping it working while also allowing a return type annotation is the correct outcome. No regression to the Smalltalk mental model.

Erlang/BEAM developer: Unaffected. The generated Core Erlang is identical; the fix is purely at the parse layer. Erlang -spec annotations generated from return types become more complete (previously-unannotatable methods now carry specs).

Tooling developer (LSP/IDE): Positive. The LSP can now provide return type information for Object.->. Class.class annotation is pending ADR 0048. The dedicated Arrow token also simplifies syntax highlighting rules (one token kind for type arrows, one for binary selectors).

Production operator: No runtime change. Pure compile-time fix.

Steelman Analysis

Option B: Change separator to ^ (return keyword)

Option D: Parser-only fix (combined)

Option C: Minimal workarounds per method

Tension points

Newcomers and operators both lean toward Option C (minimal change, minimal risk). Language designers and LSP developers prefer the dedicated Arrow token approach (correct fix, complete coverage, accurate representation). The tiebreaker: the Arrow token approach is contained to ~50 lines in the parser and lexer, with no observable behaviour change for any user.

Alternatives Considered

Alternative B: Change return type separator to ^

Replace -> with ^ for return type annotations throughout. See Steelman Analysis. Rejected: visual confusion with early-return ^ in method bodies; migration cost.

Alternative C: Leave specific methods unannotated

Accept Object.-> as permanently unannotatable. Document the limitation. Remove -> from Object as part of BT-1017 to eliminate ambiguity 1. Rejected: incomplete fix; leaves the grammar with a known, non-obvious limitation. Class.class is addressed structurally by ADR 0048, not by acceptance.

Alternative D: Parser-only fix (count arrows in context)

Solve ambiguity 1 without a lexer change: in parse_method_selector, when the parser sees BinarySelector("->") it knows this is the selector. After consuming the parameter name, a second BinarySelector("->") must be the return type separator (binary methods take exactly one parameter). This is deterministic.

Rejected: solves ambiguity 1 but not ambiguity 2 (class keyword). More fundamentally, the return-type -> is not a binary operator — it is structural punctuation with no operands or precedence. Keeping it as BinarySelector is a misclassification that all downstream consumers must compensate for individually. See Steelman Analysis.

Alternative E: Parenthesise return types to disambiguate

-> value (-> Association) => @primitive "->"

Rejected: syntactic noise; inconsistent with all existing annotations; leaks the parser's internal ambiguity into user-visible syntax.

Consequences

Positive

Negative

Neutral

Implementation

Phase 1 — Lexer (token.rs, lexer.rs):

Phase 2 — Parser declarations (declarations.rs):

Phase 3 — Parser expressions (expressions.rs):

Phase 4 — Stdlib annotations:

Implementation scope: The codebase has ~25 references to BinarySelector across 5 files (token.rs, lexer.rs, declarations.rs, expressions.rs, mod.rs). Of these, 6 sites specifically match BinarySelector(s) if s == "->" and must be converted to match Arrow; ~8 generic BinarySelector(_) sites must add Arrow alongside. Comma-handling sites (BinarySelector(s) if s == ",") and the negative-number site (BinarySelector(op) if op == "-") are unaffected. The compiler enforces exhaustive matching, so any match on TokenKind that has no wildcard arm and lacks an Arrow arm is flagged at compile time; sites that rely on a wildcard arm or a guard are not caught this way and need manual review.
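The three kinds of match site described above might look roughly like this after conversion. The function names are hypothetical and exist only for the example; the patterns mirror the ones quoted in this section.

```rust
/// Cut-down token kinds; the real enum in token.rs has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Identifier(String),
    BinarySelector(String),
    Arrow,
    FatArrow,
}

/// Converted site. Before: matches!(kind, TokenKind::BinarySelector(s) if s == "->").
pub fn is_return_type_separator(kind: &TokenKind) -> bool {
    matches!(kind, TokenKind::Arrow)
}

/// Generic site: any binary operator in expression position.
/// Arrow is added alongside BinarySelector so `x -> y` stays a binary send.
/// There is no wildcard arm, so a future variant forces this match to be revisited.
pub fn binary_operator_text(kind: &TokenKind) -> Option<&str> {
    match kind {
        TokenKind::BinarySelector(s) => Some(s.as_str()),
        TokenKind::Arrow => Some("->"),
        TokenKind::Identifier(_) | TokenKind::FatArrow => None,
    }
}

/// Unaffected site: comma handling still matches on BinarySelector.
pub fn is_comma(kind: &TokenKind) -> bool {
    matches!(kind, TokenKind::BinarySelector(s) if s == ",")
}
```

The second function is the case the exhaustiveness check helps with: removing its Arrow arm would be a compile error, whereas the guarded patterns in the other two compile either way.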

Phases 1–3 are a single atomic change and must ship together; an Arrow token emitted by the lexer without updated parser handling is a regression, not a partial deliverable. Phase 4 (stdlib annotations) can follow in a subsequent commit.

Affected components: Lexer (lexer.rs, token.rs), parser (declarations.rs, expressions.rs), stdlib sources, generated_builtins.rs. No codegen, runtime, or REPL changes needed.

References