ADR 0034: Stdlib Self-Hosting — Moving Logic from Erlang to Beamtalk

Status

Implemented (2026-02-24)

Context

ADR 0007 established the compilable stdlib with @primitive/@intrinsic injection, enabling Beamtalk source files (.bt) to declare the user-facing API while delegating to Erlang runtime modules for implementation. This was the right bootstrap strategy: get the system working first, then progressively self-host.

Today the stdlib has 67 .bt files (~5,140 lines) backed by 57 Erlang modules (~12,700 lines) in the runtime. The split is:

| Category | Example | Lines |
| --- | --- | --- |
| Already pure Beamtalk | True.bt, Boolean.bt, Number.bt, error subclasses | ~430 |
| Thin @primitive stubs over Erlang | Collection.bt, Dictionary.bt, TestCase.bt | ~4,700 |
| Erlang-only (no .bt file at all) | Future, FileHandle | ~500 |
| Runtime infrastructure (must stay Erlang) | dispatch, actor, bootstrap, class registry | ~8,000 |

Three problems motivate this ADR:

1. Algorithmic logic trapped in Erlang. Three groups of operations are pure functional logic with no dependency on Erlang BIFs: collection operations such as collect:, select:, reject:, detect:, and inject:into: in beamtalk_collection_ops.erl (108 lines); list operations such as zip:, groupBy:, and partition: in beamtalk_list_ops.erl (~150 lines of algorithmic code); and tuple unwrapping in beamtalk_tuple_ops.erl. All of these could be expressed as Beamtalk methods, following the Smalltalk tradition where abstract collection operations are defined in terms of do:.

2. Missing .bt files for user-visible classes. Future (309 lines, returned by every async message send) and FileHandle (created by File open:do:) have no Beamtalk source. Users interact with these objects but they are invisible to the compiler, type checker, LSP, and documentation system.

3. Test assertions are opaque. assert:, deny:, assert:equals: in beamtalk_test_case.erl are pure logic (check a condition, raise an error) but live entirely in Erlang. Users writing tests cannot see or extend the assertion logic.

Constraints

Decision

We adopt a phased self-hosting strategy that progressively moves stdlib logic from Erlang to Beamtalk, following the Smalltalk/Pharo pattern of building higher-order operations on a minimal primitive surface. Each phase is gated on benchmarks confirming acceptable performance before proceeding.

Phase 1: Add Missing .bt Files (Future, FileHandle)

Add Future.bt and FileHandle.bt as @primitive-stub classes, making them visible to the compiler, type checker, LSP, and documentation.

/// A future represents an asynchronous computation result.
///
/// Every message send to an actor returns a Future. Use `await` to
/// block until the result is available, or `whenResolved:` to register
/// a callback.
///
/// ## Examples
///
/// ```beamtalk
/// future := counter increment.
/// future await.
/// // => 1
/// ```
sealed Object subclass: Future
  /// Block until resolved (30s default timeout).
  await => @primitive "await"

  /// Block until resolved or timeout (in milliseconds).
  await: timeout: Integer => @primitive "await:"

  /// Block until resolved with no timeout.
  awaitForever => @primitive "awaitForever"

  /// Register a callback for when the future resolves.
  whenResolved: block: Block => @primitive "whenResolved:"

  /// Register a callback for when the future is rejected.
  whenRejected: block: Block => @primitive "whenRejected:"

  printString -> String => @primitive "printString"

Note: This is the minimal @primitive stub for the existing Future API. The full Future class design — combinators (then:, all:, race:), auto-await semantics, and Duration integration — is a separate ADR scoped by BT-507.

/// A handle to an open file, available within a `File open:do:` block.
///
/// ## Examples
///
/// ```beamtalk
/// File open: 'example.txt' do: [:handle |
///   handle lines.
///   // => #("line 1", "line 2")
/// ].
/// ```
sealed Object subclass: FileHandle
  /// Return all lines as a List of Strings.
  lines -> List => @primitive "lines"

  printString -> String => @primitive "printString"

Phase 2: Add addFirst: and Self-Host Collection Protocol

Prerequisite: addFirst: — O(1) List Cons

Before self-hosting collection operations, add addFirst: to List.bt as a new @primitive that compiles to O(1) list cons:

// List.bt — new method
/// Prepend item to the front of the list. O(1).
///
/// ## Examples
///
/// ```beamtalk
/// #(2, 3) addFirst: 1.
/// // => #(1, 2, 3)
/// ```
addFirst: item -> List => @primitive "addFirst:"

This compiles to Core Erlang [Item|Self], a single cons cell allocation. Combined with reversed (which compiles to lists:reverse/1), this enables the classic functional accumulator pattern: prepend during iteration, reverse at the end. Gleam implements map/filter the same way in pure Gleam, and the pair fills the role that OrderedCollection>>addLast: plays when Pharo accumulates results.

The codegen addition in primitives/list.rs is trivial:

"addFirst:" => {
    let p0 = params.first().map_or("_Item", String::as_str);
    Some(docvec!["[", p0.to_string(), "|Self]"])
}
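The accumulator pattern that addFirst: enables can be sketched outside Beamtalk. The Rust below is an illustration only, not the Beamtalk codegen: the List type and the add_first, reversed, collect, and to_vec functions are hypothetical names, and Rc-shared cells stand in for BEAM cons cells.

```rust
use std::rc::Rc;

/// Immutable singly linked list, standing in for a BEAM list (assumption).
enum List<T> {
    Nil,
    Cons(T, Rc<List<T>>),
}

/// Prepend `item`: allocates one cell and shares the existing tail. O(1).
fn add_first<T>(list: &Rc<List<T>>, item: T) -> Rc<List<T>> {
    Rc::new(List::Cons(item, Rc::clone(list)))
}

/// Reverse in one pass by prepending each element onto a fresh list. O(n).
fn reversed<T: Clone>(list: &Rc<List<T>>) -> Rc<List<T>> {
    let mut out = Rc::new(List::Nil);
    let mut cur = list;
    while let List::Cons(head, tail) = &**cur {
        out = add_first(&out, head.clone());
        cur = tail;
    }
    out
}

/// `collect:` in accumulator style: prepend while iterating, reverse once.
fn collect<T, U: Clone>(items: &[T], block: impl Fn(&T) -> U) -> Rc<List<U>> {
    let mut acc = Rc::new(List::Nil);
    for each in items {
        acc = add_first(&acc, block(each));
    }
    reversed(&acc)
}

/// Helper for inspecting results in order.
fn to_vec<T: Clone>(list: &Rc<List<T>>) -> Vec<T> {
    let mut out = Vec::new();
    let mut cur = list;
    while let List::Cons(head, tail) = &**cur {
        out.push(head.clone());
        cur = tail;
    }
    out
}
```

Each iteration allocates one cell and the final reverse walks the list once, so the whole operation is O(n). Appending at the tail of an immutable list would instead copy the list on every step.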

Self-host Collection protocol

Move the abstract Collection operations from beamtalk_collection_ops.erl into pure Beamtalk on Collection.bt. With addFirst: available, do: is the true primitive boundary — all higher-order operations compose on it, exactly as in Pharo.

Primitive surface — two methods per concrete collection:

| Method | Stays primitive | Reason |
| --- | --- | --- |
| do: | Yes: subclassResponsibility, overridden per concrete class | Each concrete collection implements iteration via its backing BIF (lists:foreach, maps:foreach, etc.) |
| size | Yes: subclassResponsibility, overridden per concrete class | Backed by erlang:length, maps:size, etc. |

inject:into: becomes pure Beamtalk on Collection, built on do: alone:

/// Fold collection from left with an accumulator.
inject: initial into: block: Block =>
  acc := initial.
  self do: [:each | acc := block value: acc value: each].
  acc

Everything above do: and size becomes pure Beamtalk.

Before (current — every method delegates to Erlang):

// Collection.bt today
collect: block: Block => @primitive "collect:"
select: block: Block => @primitive "select:"
reject: block: Block => @primitive "reject:"
inject: initial into: block: Block => @primitive "inject:into:"

After (pure Beamtalk on the abstract class):

// Collection.bt — self-hosted
// Accumulator-based operations (addFirst: + reversed for O(n) total)

/// Fold collection from left with an accumulator.
inject: initial into: block: Block =>
  acc := initial.
  self do: [:each | acc := block value: acc value: each].
  acc

/// Return a new list with block applied to each element.
collect: block: Block -> List =>
  (self inject: #() into: [:acc :each | acc addFirst: (block value: each)]) reversed

/// Return elements for which block returns true.
select: block: Block -> List =>
  (self inject: #() into: [:acc :each |
    (block value: each) ifTrue: [acc addFirst: each] ifFalse: [acc]
  ]) reversed

/// Return elements for which block returns false.
reject: block: Block -> List =>
  self select: [:each | (block value: each) not]

// Early-termination operations (do: with non-local return)

/// Return the first element for which block returns true, or nil.
detect: block: Block =>
  self detect: block ifNone: [nil]

/// Return the first element for which block returns true, or evaluate
/// noneBlock if no element matches.
detect: block: Block ifNone: noneBlock: Block =>
  self do: [:each | (block value: each) ifTrue: [^ each]].
  noneBlock value

/// Return true if block returns true for any element.
anySatisfy: block: Block -> Boolean =>
  self do: [:each | (block value: each) ifTrue: [^ true]].
  false

/// Return true if block returns true for all elements.
allSatisfy: block: Block -> Boolean =>
  self do: [:each | (block value: each) ifFalse: [^ false]].
  true

/// Return true if collection includes anObject.
includes: anObject -> Boolean =>
  self anySatisfy: [:each | each =:= anObject]

Performance characteristics:

Concrete collections keep their optimized overrides. List.bt, Set.bt, and Dictionary.bt all retain their @primitive overrides for collect:, select:, reject:, detect:, inject:into:, includes:, anySatisfy:, allSatisfy: — these are backed by BIF fast-paths (lists:map, lists:filter, lists:foldl, lists:any, etc.). The pure-BT versions on Collection serve as the default for user-defined collections that subclass Collection without overriding these methods.
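As a rough model of this layering, the sketch below expresses the abstract protocol as a Rust trait with two required methods and default implementations for the rest. It is an analogy under stated assumptions, not Beamtalk's implementation: the Collection trait, the Interval type, and all method names are hypothetical; ControlFlow::Break stands in for Beamtalk's non-local return (^), which Rust closures lack; and Vec::push replaces addFirst: plus reversed because appending to a Vec is already amortized O(1).

```rust
use std::ops::ControlFlow;

/// Abstract collection: `do_each` and `size` are the only required methods.
trait Collection {
    type Item: Clone;

    /// Primitive iteration (`do:`). The block returns `Break` to stop early.
    fn do_each(&self, block: &mut dyn FnMut(&Self::Item) -> ControlFlow<()>);

    /// Primitive element count (`size`).
    fn size(&self) -> usize;

    /// `inject:into:` built only on `do_each`.
    fn inject_into<A>(&self, initial: A, mut block: impl FnMut(A, &Self::Item) -> A) -> A {
        let mut acc = Some(initial);
        self.do_each(&mut |each| {
            let current = acc.take().expect("accumulator is always present");
            acc = Some(block(current, each));
            ControlFlow::Continue(())
        });
        acc.expect("accumulator is always present")
    }

    /// `collect:` always returns a `Vec`, mirroring the "no species" decision.
    fn collect<U>(&self, mut block: impl FnMut(&Self::Item) -> U) -> Vec<U> {
        self.inject_into(Vec::with_capacity(self.size()), |mut acc, each| {
            acc.push(block(each));
            acc
        })
    }

    /// `select:` keeps the elements for which the block answers true.
    fn select(&self, mut block: impl FnMut(&Self::Item) -> bool) -> Vec<Self::Item> {
        self.inject_into(Vec::new(), |mut acc, each| {
            if block(each) {
                acc.push(each.clone());
            }
            acc
        })
    }

    /// `detect:` stops at the first match; `Break` plays the role of `^ each`.
    fn detect(&self, mut block: impl FnMut(&Self::Item) -> bool) -> Option<Self::Item> {
        let mut found = None;
        self.do_each(&mut |each| {
            if block(each) {
                found = Some(each.clone());
                ControlFlow::Break(())
            } else {
                ControlFlow::Continue(())
            }
        });
        found
    }
}

/// Hypothetical user-defined collection: the integers `from..=to`.
struct Interval {
    from: i64,
    to: i64,
}

impl Collection for Interval {
    type Item = i64;

    fn do_each(&self, block: &mut dyn FnMut(&i64) -> ControlFlow<()>) {
        for n in self.from..=self.to {
            if block(&n).is_break() {
                return;
            }
        }
    }

    fn size(&self) -> usize {
        (self.to - self.from + 1).max(0) as usize
    }
}
```

Interval supplies only the two primitives and inherits everything else. A concrete type that needs a faster path could override collect or select in its impl block, which corresponds to the @primitive overrides retained on List, Set, and Dictionary.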

Phase 3: Self-Host Algorithmic List/Tuple/Dictionary Operations

Move pure-logic operations from Erlang *_ops.erl modules into Beamtalk:

List.bt — algorithmic methods become pure BT:

/// Return the index of the first occurrence of anObject, or -1.
indexOf: anObject -> Integer =>
  index := 0.
  self do: [:each |
    (each =:= anObject) ifTrue: [^ index].
    index := index + 1
  ].
  -1

/// Iterate with both element and index.
eachWithIndex: block: Block -> Nil =>
  index := 0.
  self do: [:each |
    block value: each value: index.
    index := index + 1
  ]

Tuple.bt — unwrap operations become pure BT:

/// Unwrap an {ok, Value} tuple. Raises on {error, _} or non-ok/error tuples.
unwrap =>
  self isOk ifTrue: [self at: 2] ifFalse: [
    self isError
      ifTrue: [self error: "Called unwrap on error tuple: {self}"]
      ifFalse: [self error: "unwrap requires {ok, Value} or {error, Reason} tuple"]
  ]

/// Unwrap an {ok, Value} tuple, or return default.
unwrapOr: default =>
  self isOk ifTrue: [self at: 2] ifFalse: [default]

/// Unwrap an {ok, Value} tuple, or return the result of evaluating block.
unwrapOrElse: block: Block =>
  self isOk ifTrue: [self at: 2] ifFalse: [block value]

Note: The Erlang unwrap has three clauses: {ok, Value}, {error, Reason}, and any other tuple. The BT version preserves all three cases using isOk and isError checks, raising distinct errors for error tuples and for tuples that are neither ok nor error. When handling {error, Reason}, implementations MUST preserve the original error details (for example, include the Reason or rethrow the original error) so that debugging fidelity is not lost compared to the Erlang implementation.
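The three-clause behavior and the requirement to keep the Reason can be illustrated with a small model. This Rust sketch is hypothetical throughout: the Tuple enum, its String payloads, and the exact message for the error case are assumptions chosen for illustration, not the runtime's representation.

```rust
/// Stand-in for an Erlang tuple as seen by Tuple.bt (assumed shape).
#[derive(Debug, Clone, PartialEq)]
enum Tuple {
    Ok(String),         // {ok, Value}
    Error(String),      // {error, Reason}
    Other(Vec<String>), // any other tuple
}

impl Tuple {
    /// `unwrap`: one arm per clause; the error arm keeps the reason visible.
    fn unwrap(&self) -> Result<String, String> {
        match self {
            Tuple::Ok(value) => Ok(value.clone()),
            Tuple::Error(reason) => Err(format!(
                "Called unwrap on error tuple: {{error, {reason}}}"
            )),
            Tuple::Other(_) => {
                Err("unwrap requires {ok, Value} or {error, Reason} tuple".to_string())
            }
        }
    }

    /// `unwrapOr:`: the value on ok, otherwise the supplied default.
    fn unwrap_or(&self, default: String) -> String {
        match self {
            Tuple::Ok(value) => value.clone(),
            _ => default,
        }
    }
}
```

Because the reason is embedded in the message, a caller who sees the failure can still tell which underlying error occurred, and the third arm produces a different message so the two failure modes remain distinguishable.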

Dictionary.bt — iteration-based methods become pure BT:

/// Evaluate block for each key-value pair, passing an Association.
do: block: Block -> Nil => @primitive "do:"

/// Evaluate block with key and value as separate arguments.
keysAndValuesDo: block: Block -> Nil =>
  self do: [:assoc | block value: assoc key value: assoc value]
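The same adapter shape can be shown in Rust as a minimal sketch. The Dictionary and Association types here are hypothetical stand-ins (a BTreeMap with String keys and i64 values is assumed purely to keep the example small); only the layering matters: one primitive iteration method, with the two-argument form written on top of it.

```rust
use std::collections::BTreeMap;

/// Key-value pair handed to the `do:` block (assumed shape).
struct Association<'a> {
    key: &'a str,
    value: i64,
}

/// Hypothetical dictionary backed by an ordered map.
struct Dictionary {
    entries: BTreeMap<String, i64>,
}

impl Dictionary {
    /// Primitive iteration: calls the block once per entry with an Association.
    fn do_each(&self, mut block: impl FnMut(Association<'_>)) {
        for (key, value) in &self.entries {
            block(Association { key: key.as_str(), value: *value });
        }
    }

    /// `keysAndValuesDo:` unpacks each Association into two block arguments.
    fn keys_and_values_do(&self, mut block: impl FnMut(&str, i64)) {
        self.do_each(|assoc| block(assoc.key, assoc.value));
    }
}
```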

Phase 4: Self-Host Test Assertions

Move assertion logic from beamtalk_test_case.erl into TestCase.bt:

/// Assert that condition is true.
assert: condition: Object -> Nil =>
  condition ifFalse: [
    self fail: "Assertion failed: expected true, got {condition}"
  ]

/// Assert that actual equals expected.
assert: actual equals: expected -> Nil =>
  (actual =:= expected) ifFalse: [
    self fail: "Expected {expected}, got {actual}"
  ]

/// Assert that condition is false.
deny: condition: Object -> Nil =>
  condition ifTrue: [
    self fail: "Denial failed: expected false, got {condition}"
  ]

should:raise:, fail:, runAll, run:, and test lifecycle (setUp/tearDown orchestration, test discovery) remain as @primitive — they require process spawning, try/catch infrastructure, and class reflection that the runtime provides.

Bootstrapping trust: Self-hosted assertions depend on ifFalse:, ifTrue:, string interpolation, and fail: all working correctly. Since fail: remains @primitive (Erlang-backed), the error-raising path is stable. The ifTrue:/ifFalse: methods on True.bt/False.bt are already pure Beamtalk and have been exercised since the earliest stdlib work. String interpolation (ADR 0023) is compiler-generated and tested independently. The risk of a silent assertion failure is low, but as a safeguard: existing Erlang-backed bootstrap tests (stdlib/bootstrap-test/*.btscript) continue to validate core primitives independently.
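The split between a primitive failure path and assertions written on top of it can be sketched as follows. This Rust is illustrative only: the function names are hypothetical, and returning Result stands in for Beamtalk raising an error through the Erlang-backed fail:.

```rust
use std::fmt::Debug;

/// Stand-in for the primitive `fail:`; in the ADR this stays runtime-backed.
fn fail(message: String) -> Result<(), String> {
    Err(message)
}

/// `assert:` written purely in terms of a condition check plus `fail`.
fn assert_true(condition: bool) -> Result<(), String> {
    if condition {
        Ok(())
    } else {
        fail(format!("Assertion failed: expected true, got {condition}"))
    }
}

/// `assert:equals:` compares the values and reports both on mismatch.
fn assert_equals<T: PartialEq + Debug>(actual: &T, expected: &T) -> Result<(), String> {
    if actual == expected {
        Ok(())
    } else {
        fail(format!("Expected {expected:?}, got {actual:?}"))
    }
}

/// `deny:` is the mirror image of `assert:`.
fn deny(condition: bool) -> Result<(), String> {
    if condition {
        fail(format!("Denial failed: expected false, got {condition}"))
    } else {
        Ok(())
    }
}
```

Only fail touches the error-raising machinery, so extending the assertion vocabulary means adding ordinary functions of this shape while the trusted path stays unchanged.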

What Stays in Erlang

The following categories must remain as Erlang:

| Category | Modules | Reason |
| --- | --- | --- |
| Object system infrastructure | beamtalk_actor, beamtalk_object_class, beamtalk_dispatch, beamtalk_primitive, beamtalk_message_dispatch, beamtalk_class_dispatch | Beamtalk code runs on these (bootstrap paradox) |
| Class registry & instantiation | beamtalk_class_registry, beamtalk_class_instantiation, beamtalk_dynamic_object | Required before any .bt module loads |
| Bootstrap stubs | beamtalk_bootstrap, beamtalk_behaviour_bt, beamtalk_class_bt | Needed before stdlib compiles |
| Future state machine | beamtalk_future (internal process loop) | Raw BEAM process, not a gen_server; process-level message receive |
| Exception handling | beamtalk_exception_handler | Called by compiler-generated try/catch |
| BIF wrappers | beamtalk_string_ops (grapheme), beamtalk_regex_ops, beamtalk_file_ops, beamtalk_json_ops, beamtalk_system_ops, beamtalk_random_ops, beamtalk_character_ops | Fundamentally wrap Erlang/OTP modules; all renamed to the _ops suffix to signal they are primitive implementation modules |
| Concrete collection primitives | do:, size per concrete class; all BIF-backed overrides on List/Set/Dictionary | Back the primitive surface; BIF fast-paths for concrete types |
| OTP infrastructure | beamtalk_runtime_app, beamtalk_runtime_sup, beamtalk_stdlib | Supervision, module loading |

Naming convention: All Erlang modules that exist solely to wrap primitives or BIFs for use by .bt files must use the _ops suffix (e.g., beamtalk_list_ops, beamtalk_regex_ops). This makes the boundary between pure-BT logic and primitive Erlang implementation visually clear in file listings, stack traces, and code review. Modules without _ops are infrastructure (actor loop, bootstrap, OTP app/supervisor) that Beamtalk code does not call directly.

Prior Art

Pharo/Squeak Smalltalk

The collection protocol is almost entirely pure Smalltalk layered on do:. The abstract Collection defines collect:, select:, reject:, detect:, inject:into: all in terms of do:. The species pattern (species returns the appropriate result class; copyEmpty uses it) ensures collection-returning operations produce the correct type. What stays in the VM: object allocation, dispatch, and Array>>do: as a primitive. Everything above is pure Smalltalk.

What we adopt: do: as the primitive boundary; all higher-order operations as pure Beamtalk on the abstract class. addFirst: + reversed replaces Pharo's OrderedCollection>>addLast: for O(n) accumulation.

What we adapt: We don't implement species — abstract Collection methods return List unconditionally. Concrete classes override with optimized primitives when needed.

Newspeak

Takes self-hosting further than Pharo: the VM provides only allocation, dispatch, and I/O. Compiler, IDE, testing framework, regex — all written in Newspeak. Platform modules implement a fixed primitive interface; all shared logic is pure Newspeak.

What we adopt: The aspiration of minimizing the Erlang surface to a fixed primitive interface.

Gleam

Uses @external(erlang, "module", "function") only for length, reverse, flatten. All higher-order operations (map, filter, fold, find, any, all, partition) are pure tail-recursive Gleam with a final lists:reverse. The dual-target design (Erlang + JavaScript) forces minimal @external use.

What we adopt: The principle that @external/@primitive should be the minimum for correctness, not performance. Pure-language implementations are preferred.

What we match: With addFirst: (O(1) cons) + reversed, we use exactly the same pattern as Gleam: prepend during iteration, reverse at the end.

Elixir

Enum is entirely Elixir. It delegates via the Enumerable protocol, which for List ultimately calls :lists.foldl. The compiler stays in Erlang; an elixir_bootstrap.erl provides stubs for def/defmodule/@ so Kernel.ex can load.

What we adopt: The bootstrap-stub pattern (already used: beamtalk_behaviour_bt.erl, beamtalk_class_bt.erl).

Ruby (YJIT)

Ruby 3.4 moved Array#each, Array#map, Array#select from C to pure Ruby (under YJIT). The reason: C↔Ruby boundary prevents inlining. Pure Ruby lets YJIT inline the entire call including the block.

What we observe: The primitive boundary is not fixed — as compilation quality improves, more can move to the language. Today's @primitive overrides on List can be reconsidered if the BEAM JIT improves.

User Impact

Newcomer: No change to the API. Collection operations work exactly as before. The benefit is indirect: Future.bt and FileHandle.bt become visible in docs, LSP completion, and class reflection.

Smalltalk developer: The stdlib structure now matches Pharo: abstract Collection defines higher-order operations in terms of do:/inject:into:, concrete classes override where performance demands. This is the canonical Smalltalk pattern and will feel immediately familiar.

Erlang/BEAM developer: Concrete collection operations (List collect:, List select:) still compile to lists:map, lists:filter — the same BIFs they'd use in Erlang. The abstract fallbacks use message sends, which is slightly slower but only triggers for user-defined collections.

Production operator: No change to runtime behavior for existing code. The dispatch path is identical for primitive-backed methods. Pure-BT methods on the abstract class add one message-send layer vs direct BIF call, but only for non-List/Set/Dictionary collections.

Tooling developer: Future.bt and FileHandle.bt enable LSP completion, go-to-definition, type inference, and documentation generation for two previously invisible classes.

Steelman Analysis

Alternative: Keep Everything in Erlang (Status Quo)

Alternative: Aggressive Self-Hosting (Move Everything Possible)

Tension Points

Alternatives Considered

Alternative A: Full Self-Hosting Including Concrete Collections

Move List collect:, List select:, etc. from @primitive to pure Beamtalk, eliminating the Erlang beamtalk_list_ops.erl module entirely.

Rejected because:

Alternative B: Only Add Missing .bt Files, No Self-Hosting

Add Future.bt and FileHandle.bt but don't move any logic from Erlang to Beamtalk.

Rejected because:

Consequences

Positive

Negative

Neutral

Implementation

Phase 1: Missing .bt Files (Small)

Phase 2: addFirst: + Collection Self-Hosting (Medium)

Phase 3: Algorithmic Operations (Medium)

Phase 4: Test Assertions (Small)

Migration Path

No user-facing migration needed. All method signatures and behavior remain identical. The change is internal: implementation moves from Erlang to Beamtalk. Concrete collection performance is unaffected (BIF overrides retained).

Testing strategy:

References

Implementation Tracking

Epic: BT-812
Status: Planned

| Issue | Phase | Title | Size | Deps |
| --- | --- | --- | --- | --- |
| BT-813 | 1 | Add Future.bt and FileHandle.bt as @primitive stubs | S | |
| BT-814 | 2a | Add addFirst: O(1) list cons primitive to List.bt | S | |
| BT-815 | 2b | Self-host abstract Collection protocol in pure Beamtalk | M | BT-814 |
| BT-816 | 3 | Self-host List algorithmic operations (indexOf:, eachWithIndex:) | S | BT-815 |
| BT-817 | 3 | Self-host Tuple unwrap operations (unwrap, unwrapOr:, unwrapOrElse:) | S | BT-815 |
| BT-818 | 3 | Self-host Dictionary keysAndValuesDo: and Association formatting | S | BT-815 |
| BT-819 | 4 | Self-host TestCase assertions (assert:, deny:, assert:equals:) | S | |