ADR 0080: Migrate Supervisor Lifecycle to Result

Status

Accepted (2026-04-16). Phase 0a probe (BT-1994) empirically selected Option 2 (hook extension) over Option 3 (synchronous helper process); see §Phase 0a Probe Outcome below and §Phase 1 for the updated prescription.

Context

ADR 0079 established Result(Actor, Error) as the return shape for registry-boundary operations on Actor: spawnAs:, spawnWith:as:, registerAs:, and named: all return Result(Self, Error). The rationale — that boundary operations legitimately carry multiple outcomes the caller needs to distinguish, and that tagged-tuple/Result returns are the BEAM-idiomatic answer at startup/registration boundaries — applies equally to supervisor lifecycle methods that currently raise.

Three groups of Supervisor and DynamicSupervisor lifecycle methods currently raise beamtalk_error on failure:

| Method | Current signature | Underlying OTP call | Failure modes |
|---|---|---|---|
| Supervisor>>supervise (class) | -> Supervisor | supervisor:start_link({local, Name}, ...) | {error, {already_started, Pid}}, {error, Reason} |
| DynamicSupervisor>>supervise (class) | -> Self | same | same |
| DynamicSupervisor>>startChild | -> C | supervisor:start_child(Sup, []) | {error, Reason} (init crash, resource exhaustion) |
| DynamicSupervisor>>startChild: | -> C | supervisor:start_child(Sup, [Args]) | same |
| Supervisor>>terminate: | -> Nil | supervisor:terminate_child(Sup, ChildId) | {error, not_found}, {error, Reason} |
| DynamicSupervisor>>terminateChild: | -> Nil | supervisor:terminate_child(Sup, ChildPid) | {error, not_found}, {error, Reason} |

The current implementation is internally consistent — it either returns a tagged tuple on success or calls error(beamtalk_exception_handler:ensure_wrapped(Error)) — but it is inconsistent with ADR 0079's precedent and with the way ADR 0060 tells users to think about expected failure. A caller writing modern Beamtalk against the Actor API has to mix two error-handling idioms once a supervisor enters the picture:

// Actor API — Result-shaped (ADR 0079)
(Counter spawnAs: #counter)
  ifOk:    [:c | c increment]
  ifError: [:e | Logger warn: "could not register"]

// Supervisor API — raise-shaped (current)
sup := [WebApp supervise] on: Error do: [:e | ...]   // must bracket or let crash
sup terminate: StaleChild                             // raises if child already gone

ADR 0079's "Future Work" section explicitly committed to closing this gap with its own ADR, noting the {already_started, Pid} case in supervisor:start_link's return and the {error, not_found} case in terminate_child as genuine multi-outcome returns that raise-style flattens.

Current state

supervise is already idempotent. beamtalk_supervisor:startLink/1 pattern-matches {error, {already_started, Pid}} and returns the existing supervisor tuple — so callers of supervise today cannot distinguish "I just started it" from "it was already running." Only genuine start_link failures (Reason other than already_started) raise. This is good behavior; the migration preserves it.

startChild does not encounter already_started in practice. DynamicSupervisor uses simple_one_for_one, where children are anonymous and unnamed. {error, already_present} and {error, {already_started, Pid}} are returned only by supervisor:start_child/2 on non-simple supervisors with declared child-spec ids — a surface Beamtalk does not currently expose (see Future Work).

terminate: / terminateChild: is where not_found is the load-bearing case. Callers legitimately want to treat "child already gone" as success (idempotent cleanup) while letting real errors propagate. With raise-style, they must on: Error do: [:_e | nil] and risk swallowing unrelated failures.

Constraints

Decision

Migrate six methods from raise-on-failure to Result-returning. Adopt the idempotent-startup convention — "already running" is an Ok outcome — across all supervisor lifecycle methods, current and future.

API

Static supervisor (stdlib/src/Supervisor.bt):

class supervise -> Result(Supervisor, Error) =>
  (Erlang beamtalk_supervisor) startLink: self

terminate: aClass -> Result(Nil, Error) =>
  (Erlang beamtalk_supervisor) terminateChild: self class: aClass

Dynamic supervisor (stdlib/src/DynamicSupervisor.bt):

class supervise -> Result(Self, Error) =>
  (Erlang beamtalk_supervisor) startLink: self

startChild -> Result(C, Error) =>
  (Erlang beamtalk_supervisor) startChild: self

startChild: args -> Result(C, Error) =>
  (Erlang beamtalk_supervisor) startChild: self with: args

terminateChild: child :: C -> Result(Nil, Error) =>
  (Erlang beamtalk_supervisor) terminateChild: self child: child

Unchanged (deliberately):

| Method | Return | Rationale |
|---|---|---|
| current, which: | Supervisor \| nil / Actor \| nil | Lookup; nil-on-miss matches OTP whereis semantics (ADR 0079) |
| children, count | Array / Integer | Inspection of already-valid supervisor handle |
| stop, kill | Nil | Teardown of own resource; let-it-crash applies (ADR 0079 teardown rule) |

Idempotent-startup convention

Across every current and future supervisor lifecycle method, an operation is Ok when the caller's target end state is already in effect, regardless of whether this call or a prior one established it.

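The rule is "does the target state hold now?", not "did this call change the input?" This matters for the distinction between {already_started, Pid} and {error, already_present} in OTP's start_child/2: with {already_started, Pid} the child is running, so the target state holds and the outcome is Ok; with already_present the spec is registered but the process is dead, which is a genuinely different state and maps to Error (see Future Work, item 1).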

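Error is reserved for outcomes the caller cannot trivially ignore: a start that genuinely failed, a stale supervisor handle, or a terminate that failed for a reason other than not_found (see Error variants).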

This convention is the load-bearing reason the migration is forward-compatible: it binds the meaning of Ok on supervisor methods to a precise property (target-state-holds) so future additions (see Future Work) can classify their outcomes consistently without reshaping existing methods.
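As a rough illustration of the convention at the runtime boundary, the Erlang sketch below maps raw OTP returns onto Ok/Error. The module name, the function names, and the {error, {Reason, Detail}} tuples are assumptions made for this example; the real runtime produces structured beamtalk_error values rather than bare tuples.

```erlang
-module(idempotent_convention).
-export([classify_start/1, classify_terminate/1]).

%% Sketch only: names and error tuples are illustrative, not the
%% beamtalk_supervisor implementation.

%% supervise: Ok whenever the supervisor is running after the call,
%% whether this call or an earlier one started it.
classify_start({ok, Pid}) ->
    {ok, Pid};
classify_start({error, {already_started, Pid}}) ->
    {ok, Pid};
classify_start({error, Reason}) ->
    {error, {supervisor_start_failed, Reason}}.

%% terminate: / terminateChild: Ok whenever the child is gone after
%% the call, including when it was already gone (not_found).
classify_terminate(ok) ->
    {ok, nil};
classify_terminate({error, not_found}) ->
    {ok, nil};
classify_terminate({error, Reason}) ->
    {error, {terminate_failed, Reason}}.
```

Both functions ask only whether the target state holds after the call, which is why already_started and not_found land on the Ok side.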

Error variants

Errors are structured beamtalk_error values with a reason Symbol. The error shape matches ADR 0079's Actor errors so user code using (beamtalk_error <reason>) destructuring works uniformly across Actor and Supervisor boundaries.

| Method | Error reason | When |
|---|---|---|
| supervise | #supervisor_start_failed | start_link returned {error, Reason} with Reason ≠ already_started |
| supervise | #stale_handle | Supervisor class gen_server unreachable (defensive) |
| startChild / startChild: | #child_start_failed | supervisor:start_child/2 returned {error, Reason}, typically a child init/1 crash |
| startChild / startChild: | #stale_handle | Supervisor pid is dead |
| terminate: / terminateChild: | #stale_handle | Supervisor pid is dead |
| terminate: / terminateChild: | #terminate_failed | supervisor:terminate_child/2 returned {error, Reason} with Reason ≠ not_found |

not_found from terminate_child is not an error — per the idempotent convention, it becomes Ok(nil).

Each error carries the originating class and selector (matching existing beamtalk_error:new/4 shape from ADR 0015) plus a human-readable message describing the underlying Reason for logging and REPL display.
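For the #stale_handle rows, the underlying OTP signal is an exit rather than a tuple: a gen_server call to a dead pid exits with {noproc, ...}. The sketch below shows one way such an exit could be turned into a tagged error. The module and the guard/1 function are hypothetical stand-ins for this illustration; they are not the runtime's with_live_supervisor function, and the bare {error, stale_handle} tuple stands in for a structured beamtalk_error.

```erlang
-module(stale_handle_sketch).
-export([guard/1]).

%% Sketch only: guard/1 is a hypothetical helper, not Beamtalk runtime code.
%% Runs Fun (a zero-arity fun that calls into a supervisor) and converts
%% the noproc exit raised for a dead supervisor pid into a tagged error.
guard(Fun) when is_function(Fun, 0) ->
    try
        Fun()
    catch
        exit:{noproc, _CallInfo} ->
            {error, stale_handle}
    end.
```

A call site would wrap the OTP call, for example stale_handle_sketch:guard(fun() -> supervisor:terminate_child(Sup, Child) end). Other exits are deliberately left to propagate.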

REPL session

> app := (WebApp supervise) unwrap
 => #Supervisor<WebApp,_>

// supervise is idempotent — both calls Ok, both return the running supervisor
> (WebApp supervise) isOk
 => true

> pool := (WorkerPool supervise) unwrap
 => #DynamicSupervisor<WorkerPool,_>

> (pool startChild) ifOk: [:w | w ping] ifError: [:e | Logger warn: e message]
 => #ok   (side-effect: w ping evaluated)

// Misconfigured child — init raises
> (BrokenPool startChild)
 => Result error: (beamtalk_error child_start_failed)

// Terminate is idempotent — already-gone is Ok
> (app terminate: Counter) isOk
 => true
> (app terminate: Counter) isOk   // second call: already gone
 => true

// Stale handle
> pool stop
 => #ok
> (pool startChild)
 => Result error: (beamtalk_error stale_handle)

Prior Art

Erlang / OTP. supervisor:start_link/3, supervisor:start_child/2, and supervisor:terminate_child/2 all return tagged tuples: ok or {ok, ...} on success and {error, Reason} on failure.

Erlang does not raise; it is entirely tagged-tuple-shaped. This ADR adopts Erlang's shape and promotes it from raw tuples to Beamtalk's Result object per ADR 0076's FFI coercion policy.
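The tagged-tuple behaviour can be observed directly against OTP. The sketch below is an illustration only: the module name otp_shapes_demo and its idle worker are assumptions for this example and are not part of Beamtalk's runtime. It starts a registered simple_one_for_one supervisor, starts it a second time, and terminates the same child twice.

```erlang
-module(otp_shapes_demo).
-behaviour(supervisor).
-export([run/0, start_worker/0]).
-export([init/1]).

%% simple_one_for_one supervisor with a single anonymous worker template.
init([]) ->
    SupFlags = #{strategy => simple_one_for_one},
    Worker = #{id => worker,
               start => {?MODULE, start_worker, []},
               restart => temporary},
    {ok, {SupFlags, [Worker]}}.

%% A minimal linked worker that idles until it is shut down.
start_worker() ->
    {ok, spawn_link(fun() -> receive stop -> ok end end)}.

run() ->
    {ok, Sup} = supervisor:start_link({local, ?MODULE}, ?MODULE, []),
    %% A second start under the same registered name is reported as a tuple.
    {error, {already_started, Sup}} =
        supervisor:start_link({local, ?MODULE}, ?MODULE, []),
    {ok, Child} = supervisor:start_child(Sup, []),
    ok = supervisor:terminate_child(Sup, Child),
    %% The child is no longer known to the supervisor.
    {error, not_found} = supervisor:terminate_child(Sup, Child),
    ok = gen_server:stop(Sup),
    ok.
```

The two {error, ...} matches are the multi-outcome returns that the Result migration surfaces as Ok under the idempotent-startup convention.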

Elixir. DynamicSupervisor.start_child/2 mirrors Erlang's return shape: {:ok, pid} | {:ok, pid, info} | {:error, reason}. Supervisor.start_link/2,3 likewise. Elixir users pattern-match on these in case blocks or chain with with. Our Result plays the with role via andThen: / ifOk:ifError: (ADR 0060).

Gleam. gleam_otp's supervisor API returns Result(Pid, StartError) where StartError is a sum type enumerating the OTP cases (Already_started(Pid), Init_timeout, Init_failed(Reason), etc.). Gleam's richer sum type is appealing but requires ADTs at the type level — Beamtalk's beamtalk_error with a Symbol reason covers the same ground at a coarser granularity. Gleam explicitly treats Already_started as a StartError variant (not Ok), diverging from our idempotent-startup convention. Gleam's choice is consistent with "literal transcription of OTP return shapes"; ours is consistent with "the ergonomic case users overwhelmingly want at supervisor-start boundaries."

Akka (Scala). ActorSystem.actorOf historically raised InvalidActorNameException on duplicate names. Modern Akka Typed (ActorSystem.systemActorOf) returns Future[ActorRef[T]] which completes with the new ref or fails with a well-typed exception. The Future + typed exception pattern is Akka's equivalent of a Result; our design lands similarly with synchronous Result values (BEAM doesn't need futures for same-node calls).

Pharo / Squeak. No direct analogue — Smalltalk process model lacks OTP-style supervision. The closest pattern is SystemNotification / Error signal-resume, which is exception-shaped. Smalltalk-pure thinking would argue for keeping raise; the BEAM-native argument for Result wins here on consistency with the surrounding Actor API.

Beamtalk precedent. ADR 0079 (Actor spawnAs:, named:) is the direct template. ADR 0076 (ok/error → Result at FFI boundary) establishes the conversion policy. ADR 0060 (Result type) provides the combinator surface.

User Impact

Newcomer (Python/JS background). Startup flows read consistently — every boundary operation returns something the caller branches on. REPL display shows Result error: (beamtalk_error child_start_failed) with a clear class/selector/message shape. Forgotten error handling surfaces as "you are calling increment on a Result, not a Counter" at the type checker or first send, not a silent crash. Learning cost: one extra method call (unwrap or ifOk:ifError:) in tutorial code, offset by a consistent error model across the language.

Smalltalk developer. Raise-shaped supervisor APIs will feel more natural on first contact — Smalltalk's error model is exception-shaped. The ADR 0060 framing (Result for expected failure categories; exceptions for programming errors) reframes this: supervisor-start failure is an expected category (config error, port in use) for which Result is the right tool. The migration brings supervisor APIs in line with the Actor API users already work with.

Erlang/Elixir developer. The mapping to OTP return shapes is direct and familiar. (sup supervise) ifOk: ... ifError: ... reads as a Beamtalk-flavoured case supervisor:start_link(...) of .... The idempotent-startup convention is Beamtalk-specific; it is explicitly documented and consistent with the Beamtalk idiom that supervise means "make sure this is running" (not "start this fresh"). Erlang veterans who specifically need the {already_started, Pid} distinction — a niche case in production, absent from Beamtalk's current API surface — get it back with the future Supervisor startChild: method (Future Work).

Production operator. beamtalk_error values with symbolic reasons are greppable in logs and aggregatable in metrics (error counters keyed by #supervisor_start_failed vs #child_start_failed vs #stale_handle). The with_live_supervisor layer continues to catch stale-handle OTP exits and translate them to structured errors — no new failure-mode surface. Release upgrades during the transition window carry the usual BEAM concern: if a running node has old user code calling supervise under the raise contract and new stdlib returns a Result, the user code's shape expectation is broken. The migration is a breaking change requiring a coordinated restart, not a pure hot-reload.

Tooling developer (LSP, IDE). Typed return signatures (Result(Supervisor, Error), Result(C, Error)) enable precise completions after .ifOk:ifError:, map:, andThen:, mapError:, unwrap. Type narrowing: in (pool startChild) ifOk: [:c | c.method] ifError: [...], c narrows to the DynamicSupervisor's C type parameter, so LSP offers the right method set. Diagnostics can now warn on unchecked Result returns from supervisor methods (the same lint applies as for Actor spawnAs:).

Steelman Analysis

Option B: Distinguish already_started via Error variant

Why not chosen: In Beamtalk's current surface, supervise is the only lifecycle method that can hit {already_started, Pid} today, and no current caller, whether in-tree or (per the issue description) in beamtalk-exdura, depends on the distinction. That is "no existing demand," not "overwhelmingly prefers idempotent." The stronger argument is structural: forcing callers to handle Error(#already_running, sup) as a success case breaks the happy-path short-circuit at every use site (app := (WebApp supervise) unwrap blows up on the second call) and undermines the ergonomics of supervise as "make sure this is running." The distinction is recoverable via a wasFreshlyStarted accessor on the returned supervisor (Future Work) if post-migration usage shows demand. If such demand materialises and the accessor is insufficient, Option B remains viable as an additive superviseFresh: variant without disturbing the primary method.

Option C: Carry wasFresh flag in Ok via a wrapper type

Why not chosen: The wrapper introduces asymmetry with other lifecycle methods. startChild has no equivalent flag (always fresh); terminate: has no equivalent (either gone or failed). Either the wrapper applies only to supervise — in which case the API has an unprincipled one-off — or it applies uniformly, which over-specifies the other methods. More importantly, when a future Supervisor startChild: arrives (Future Work item 1) it hits the same decision: does it return Result(ChildStart, Error) matching supervise, or Result(C, Error) matching DynamicSupervisor startChild? Either answer is API churn. Option A (plain Result, idempotent Ok) avoids the question entirely.

Option D: Partial migration — Result for terminate:/terminateChild: only, keep supervise/startChild raising

Why not chosen: This is the most defensible rejected option: the bulk of today's pain is terminate:/terminateChild:, where not_found is genuinely load-bearing, and supervise is already idempotent at the runtime layer. Three arguments tip the decision to full migration. First, leaving supervise/startChild raising means a user writing a mixed flow (supervise → startChild → terminate:) handles two error idioms per call site, and the cost compounds with every future supervisor method we add. Second, when Future Work item 1 lands (Supervisor startChild: for non-simple supervisors, which does hit {already_started, Pid}), we would migrate that one too, which means a second migration with its own external coordination. Third, unwrap preserves the raise-at-boot option for D's advocates at one extra keyword per call site, so the ergonomic cost of full migration is minor. Partial migration remains a defensible choice; the decision is a call, not a forced answer.

Option Z: Status quo — keep all methods raising

Why not chosen: ADR 0079 already committed the direction — Actor spawnAs: returns Result — and the Supervisor API is the other half of the startup boundary. Leaving supervisors exception-shaped while actors are Result-shaped creates the two-idiom problem described in Context. The "let it crash at boot" argument is preserved via unwrap: (WebApp supervise) unwrap crashes loudly on failure, satisfying the operator/veteran concern while keeping the Result contract for callers who want to handle the error explicitly. The migration is a net win for consistency and costs one keyword at boot-time call sites.

Tension Points

Alternatives Considered

Alternative B: Distinguish {already_started, Pid} via Error variant

supervise returns Ok(sup) only when the call starts the supervisor fresh; returns Error(#already_running, sup) when the supervisor was already registered; returns Error(Reason) for genuine start_link failures.

class supervise -> Result(Self, Error) =>
  // Ok when this call started it
  // Error #already_running (with existing sup) when already registered
  // Error #supervisor_start_failed for real failures

Rejected because: Breaks the happy-path ergonomics for the overwhelmingly-common case (idempotent "start or get"). Every caller of supervise in examples/otp-tree, stdlib/test, and e2e fixtures would need to treat Error(#already_running) as success, effectively collapsing it back to Ok at every use site. See Steelman Analysis for the full argument.

Alternative C: Wrapper type carrying wasFreshlyStarted

Introduce a SupervisorStart value-object returned in the Ok variant:

class supervise -> Result(SupervisorStart, Error)
  // SupervisorStart carries .supervisor and .wasFreshlyStarted

Rejected because: Creates asymmetry with startChild and terminate:, and would force a second decision (preserve the wrapper? match the flat shape?) when future supervisor startChild: arrives. See Steelman Analysis.

Alternative D: Partial migration — only terminate:/terminateChild:

Migrate only the methods where not_found is the load-bearing multi-outcome case. Leave supervise and startChild raising.

Rejected because: Splits the supervisor API across two error idioms, defeating the consistency motivation. unwrap provides the raise-at-boot semantics partial-migration advocates want, so the cost of full migration is low.

Alternative E: Migrate to a bespoke SupervisorError sum type (Gleam style)

Replace Result(X, Error) with Result(X, SupervisorError) where SupervisorError is a sealed enum (#stale_handle, #already_started(Pid), #init_failed(Reason), etc.).

Rejected because: Beamtalk has no ADTs at the type level; the closest mechanism is class-per-variant, which is heavyweight for the value carried here. beamtalk_error with a Symbol reason and an optional data field (ADR 0015) covers the same ground with the machinery we already have. If a richer type emerges — pattern matching on SupervisorError variants with exhaustiveness checking — that is its own ADR, not a detour in this one.

Alternative F: Dual API during a deprecation window (supervise! alongside supervise)

Introduce a supervise! raise-form alongside a new supervise Result-form for one release cycle, letting external projects (beamtalk-exdura, beamtalk-symphony) migrate on their own schedule. Remove supervise! in the release after that.

Rejected because: The benefit (decoupling external project timelines) is lower than the cost of maintaining two parallel APIs and teaching both in documentation during the window. The unwrap escape hatch on the new Result-form already achieves most of what supervise! would provide (one keyword for raise-at-failure); adding a second named method for the same effect is pure duplication. If coordinating the external project transitions proves harder than expected, reconsider — this remains a cheap fallback.

Alternative Z: Status quo — keep raise

Retain the current behavior; document the raise cases on each method.

Rejected because: Inconsistent with ADR 0079's Actor API and with ADR 0060's Result convention for expected failures. See Steelman Analysis.

Consequences

Positive

Negative

Neutral

Implementation

Phase 0: Probes — validate two load-bearing runtime assumptions before touching stdlib

Two implementation questions must be resolved by probe commits before Phase 1 is scoped. Either answer is viable for the decision; we need the answer to pick the Phase 1 approach.

Probe 0a — FFI coercion vs the beamtalk_supervisor_new post-dispatch hook (BT-1542).

Today, beamtalk_supervisor:startLink/1 returns a bare 4-tuple {beamtalk_supervisor_new, ClassName, Module, Pid} on fresh start. The hook in beamtalk_class_dispatch.erl:113-120 inspects the class gen_server's {ok, Result} reply, sees the _new tag, rewrites to the standard {beamtalk_supervisor, ...} tag, and synchronously runs run_initialize/1 in the caller's process (a documented deadlock-avoidance guarantee from ADR 0059 / BT-1285).

Migrating startLink/1 to {ok, SupTuple} | {error, BtError} causes ADR 0076's FFI coercion in beamtalk_erlang_proxy:direct_call/3 to produce a Result tagged map for the class method body's return value. The class gen_server replies {ok, <Result tagged map>}. The existing hook's pattern match on {beamtalk_supervisor_new, ...} does NOT match a Result tagged map — so initialize: silently stops running and the returned tag is never rewritten. This is a correctness regression, not a cosmetic one.

Three candidate resolutions, each with real tradeoffs:

  1. Keep startLink/1 raising; wrap at the stdlib layer with Result tryDo:. No runtime change to startLink/1; supervise body becomes Result tryDo: [(Erlang beamtalk_supervisor) startLink: self]. This alone does NOT preserve the existing _new tag rewrite / initialize: hook. The class method's return value is the wrapped Result, which is what beamtalk_class_dispatch sees in the {ok, Result} gen_server reply — the hook's pattern match is against the outer Result map, not the inner {beamtalk_supervisor_new, ...} tuple. So option (1) is useful as a baseline for error-shape experimentation only: it changes the Error side to an Exception-wrapped value (carrying ensure_wrapped's class/selector/message) without solving the initialize regression. To preserve initialize:, option (1) must be paired with runtime work from option (2) or (3) — at which point it collapses into a variant of one of those.
  2. Teach the class_dispatch hook to inspect the Result tagged map. The hook pattern-matches a second case for Result ok: {beamtalk_supervisor_new, ...}, rewrites the inner tag, runs run_initialize/1 in the caller's process, and rewraps as Result ok: {beamtalk_supervisor, ...} before returning. Preserves all existing semantics (initialize runs in caller's process, error shape is structured Result error:). Concern: couples beamtalk_class_dispatch to Result's internal representation; leaks supervisor-specific logic into a generic dispatch layer.
  3. Run initialize: synchronously from startLink/1 via a short-lived helper process. Spawns a transient process to run run_initialize/1 before returning; caller waits for it. startLink/1 then returns {ok, {beamtalk_supervisor, ...}} (already normalized) and FFI coercion does the rest. Concern: changes the documented "runs in caller's process" guarantee (ADR 0059 / BT-1285) — downstream implications on which: and recursive-send deadlock avoidance are unknown; probably needs its own mini-ADR.

Probe 0a compares the runtime-bearing approaches (options (2) and (3)) empirically on the Supervisor>>supervise / startLink/1 path — the only path that triggers the beamtalk_supervisor_new post-dispatch hook. The probe commit implements each option and verifies, on a fresh start, that (a) the _new tag is rewritten to {beamtalk_supervisor, ...}, (b) class initialize: runs exactly once, and (c) the e2e supervisor test suite stays green. The chosen option is the one that achieves the above with the least collateral code. Option (1) can be evaluated separately for Error-side ergonomics on instance methods (startChild, terminateChild:) where no post-dispatch hook is in play.

Phase 0a Probe Outcome (BT-1994, 2026-04-16)

Result: Option (2) wins. Option (3) is disqualified by a deadlock that surfaces whenever initialize: sends a method-dispatching message to the just-started supervisor.

Both options were implemented on separate branches (BT-1994-probe-option-2, BT-1994-probe-option-3) with identical test coverage: EUnit for runtime-level behaviour, BUnit for stdlib wiring, and the e2e supervisor suite for end-to-end correctness. Both probes used a stdlib shim — class supervise => ((Erlang beamtalk_supervisor) startLink: self) unwrap — so the user-facing signature stayed Supervisor / Self, keeping existing callers working without Phase 1 stdlib migration.

| Property | Option 2 (hook extension) | Option 3 (helper process) |
|---|---|---|
| EUnit (beamtalk_supervisor_tests) | 69/69 ✅ | 72/72 ✅ |
| EUnit (beamtalk_class_dispatch_tests) | 80/80 ✅ | 80/80 (hook removed, so rewrap test subsumed) |
| BUnit (full suite, 199 files) | 1860/1860 ✅ | n/a (blocked by e2e deadlock, not rerun) |
| E2E (1118 cases) | 1118/1118 ✅ | 67 failures; supervisor_initialize.btscript deadlocks |
| initialize: runs in caller's process | Yes (ADR 0059 / BT-1285 invariant preserved) | No (helper process) |
| Collateral code (runtime) | +60 LOC (hook matches two shapes) | +120 LOC (helper, monitor, timeout, teardown) |
| Semantic risk | Low: hook change is additive, idempotent, and pattern-local | High: helper trades a known invariant for a known-concern one |

Option 3 failure mode (empirical). When initialize: on E2EInitSupervisor calls sup which: E2EInitChildA, and when initialize: on E2EInitDynSupervisor calls sup startChild, the e2e harness hit exit:{timeout, {gen_server, call, [ClassPid, {method, startChild}]}}. Trace:

  1. REPL sends E2EInitDynSupervisor supervise → class_send_dispatch issues gen_server:call(ClassPid, {class_method_call, supervise, []}).
  2. The class gen_server runs the method body, which invokes startLink/1 inside the class gen_server's handle_call.
  3. Option 3's startLink/1 spawns a helper process and waits for it before returning — still blocking the class gen_server's handle_call.
  4. The helper runs initialize:, whose body sends sup startChild.
  5. beamtalk_message_dispatch:send({beamtalk_supervisor, ...}, startChild, []) → beamtalk_dispatch:lookup → beamtalk_object_class:has_method(ClassPid, startChild) → gen_server:call(ClassPid, {method, startChild}).
  6. ClassPid is the class gen_server from step (2), which is still blocked waiting for startLink/1 (step 3) to return. Deadlock.

This is precisely the concern the ADR raised in Phase 0a ("downstream implications on which: and recursive-send deadlock avoidance are unknown; probably needs its own mini-ADR"). The probe confirms the concern is not hypothetical: it is triggered by the existing e2e fixtures. A redesign that fully decouples the helper from the class gen_server's handle_call would require making supervise asynchronous — a far larger API change than Option 2 — or routing method resolution through an out-of-band channel that bypasses ClassPid entirely (untested, adds a second concurrency protocol). Neither is justified when Option 2 succeeds cleanly.
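The shape of this deadlock can be reproduced with a plain gen_server, independent of Beamtalk. In the sketch below, everything is a stand-in chosen for illustration: the module deadlock_demo, the supervise and has_method requests, and the 500 ms timeout are assumptions, not runtime code. The server blocks inside handle_call waiting for a helper, and the helper calls back into the same server.

```erlang
-module(deadlock_demo).
-behaviour(gen_server).
-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2]).

init([]) ->
    {ok, #{}}.

%% Stand-in for the class method body under Option 3: spawn a helper and
%% block inside handle_call until the helper reports back.
handle_call(supervise, _From, State) ->
    Server = self(),
    Ref = make_ref(),
    spawn(fun() ->
        %% Stand-in for initialize: sending a message whose method lookup
        %% needs a call to the same, currently blocked, server.
        Reply = try gen_server:call(Server, has_method, 500)
                catch exit:{timeout, _} -> timeout
                end,
        Server ! {Ref, Reply}
    end),
    receive
        {Ref, Outcome} -> {reply, Outcome, State}
    end;
handle_call(has_method, _From, State) ->
    {reply, true, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Returns the helper's outcome: timeout, because has_method cannot be
%% served while the server is still inside the supervise call.
run() ->
    {ok, Pid} = gen_server:start(?MODULE, [], []),
    Result = gen_server:call(Pid, supervise, 5000),
    ok = gen_server:stop(Pid),
    Result.
```

The short helper timeout only exists so the demo terminates; with the default call timeout the e2e harness observes the same blockage as a gen_server call timeout.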

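Option 2 behaviour (empirical). The hook at beamtalk_class_dispatch:class_send_dispatch/3 now matches two shapes on the class gen_server's {ok, Result} reply: the bare {beamtalk_supervisor_new, ...} tuple, and a Result tagged map whose ok value is that tuple.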

Both branches rewrite the inner tag to {beamtalk_supervisor, ...}, call beamtalk_supervisor:run_initialize/1 in the caller's process (preserving the ADR 0059 / BT-1285 guarantee), and return the normalised shape (bare tuple or re-wrapped Result map, matching what they received). Idempotent {beamtalk_supervisor, ...} returns and {error, ...} branches flow through unchanged — no initialize re-run.

The dual-shape match means Phase 1's migration to supervise -> Result(Supervisor, Error) will not require another hook change: both the pre-Phase-1 shim (unwrap in stdlib) and the post-Phase-1 typed signature (no unwrap, callers handle the Result) produce shapes the hook already handles.
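A minimal sketch of a dual-shape match follows. It is not the code in beamtalk_class_dispatch: the module name, the post_dispatch/2 function, and in particular the #{result => ok, value => ...} map are hypothetical, since this ADR does not spell out the Result tagged map's internal keys. The RunInit fun stands in for beamtalk_supervisor:run_initialize/1 and is called in the caller's process.

```erlang
-module(hook_sketch).
-export([post_dispatch/2]).

%% Sketch only. ASSUMPTION: a Result is modelled as
%% #{result => ok | error, value => term()}; the real representation
%% may differ.

%% Shape 1: bare tuple from a fresh start.
post_dispatch({beamtalk_supervisor_new, Class, Mod, Pid}, RunInit) ->
    Sup = {beamtalk_supervisor, Class, Mod, Pid},
    RunInit(Sup),
    Sup;
%% Shape 2: Result ok wrapping the fresh-start tuple; rewrap after rewrite.
post_dispatch(#{result := ok,
                value := {beamtalk_supervisor_new, Class, Mod, Pid}} = Result,
              RunInit) ->
    Sup = {beamtalk_supervisor, Class, Mod, Pid},
    RunInit(Sup),
    Result#{value := Sup};
%% Already-normalised supervisors and errors pass through; no init re-run.
post_dispatch(Other, _RunInit) ->
    Other.
```

The final clause is what keeps idempotent returns and error branches from triggering a second initialize run.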

Probe 0b — Result(C, Error) type substitution for DynamicSupervisor(C).

ADR 0079 / BT-1986 verified that Result(Self, Error) works in generic position. ADR 0080 proposes Result(C, Error) where C is the DynamicSupervisor's parametric type argument (ADR 0068). BT-1992 ("Thread receiver type arguments through Self substitution") should cover this, but C is not Self; it is a class-level type parameter. Probe 0b writes a minimal stdlib method class foo -> Result(C, Error) on DynamicSupervisor and a concrete subclass test, verifying the typechecker narrows C correctly at call sites. If it does not, a targeted typechecker extension is in scope (matching ADR 0079's scope for Self).

Probe 0b outcome (BT-1995): substitution works unchanged — no typechecker extension required. A probe method probeC -> Result(C, Error) on DynamicSupervisor(C) with a concrete subclass extending DynamicSupervisor(Counter) narrows to Result(Counter, Error) at the call site under the existing substitution machinery. Class-level type parameters flow through build_inherited_substitution_map (ADR 0068 Phase 1b, BT-1577) — the same path BT-1986 / BT-1992 extended for Self, applied independently here — which resolves C → Counter from the subclass's superclass_type_args. The probe method was deleted after validation; the permanent regression record is the unit test test_class_type_param_in_generic_return_narrows_to_concrete in crates/beamtalk-core/src/semantic_analysis/type_checker/tests.rs. Phase 2 stdlib migration can consume Result(C, Error) signatures directly.

Both probes are small (<1 day each) and their outcomes determine the exact shape of Phase 1.

Phase 1: Runtime — return Result-shaped values where direct coercion is safe

Phase 2: Stdlib signature migration

Phase 3: e2e, examples, and docs

Phase 4: External project migration

Critical path

Phase 1 → Phase 2 (blocks stdlib signatures on runtime returning Result) → Phase 3 (e2e exercises the full pipeline). Phase 4 can begin once stdlib is published but does not block the ADR's epic.

Affected components (summary)

Estimated total effort: M (4–6 small-medium Linear issues, plus coordinated external migration).

Migration Path

Mechanical migration. Each call site falls into one of three patterns.

Pattern 1: Boot-style "crash on failure"

// Before
app := WebApp supervise

// After
app := (WebApp supervise) unwrap

Use unwrap when failure should propagate as an exception (application boot, test setup where failure aborts the run, REPL quick exploration).

Pattern 2: Recoverable start

// Before
app := [WebApp supervise] on: Error do: [:e | Logger error: e. nil]
app isNil ifFalse: [app count]

// After
(WebApp supervise)
  ifOk:    [:app | app count]
  ifError: [:e | Logger error: e]

Or, for chained start-and-configure:

((WebApp supervise)
  andThen: [:app | Result ok: (app which: HealthCheck)])
  ifError: [:e | Logger warn: e message]

Pattern 3: Idempotent terminate

// Before — had to swallow Error because not_found raises
[app terminate: Counter] on: Error do: [:_e | nil]

// After — idempotent naturally: Ok(nil) whether fresh terminate or already gone
app terminate: Counter
// real failures still surface as Result error: (beamtalk_error terminate_failed)
(app terminate: Counter) ifError: [:e | Logger warn: e message]

Common migration gotchas

Future Work

This ADR deliberately narrows scope to the Result migration. Several OTP capabilities the current Beamtalk supervisor surface does not cover are candidate follow-up ADRs. Each is listed with the Result variants it will need, to validate that the idempotent-startup convention remains sufficient.

  1. Supervisor startChild: spec for non-simple_one_for_one supervisors — allow runtime addition of heterogeneous children to a one_for_one / one_for_all / rest_for_one tree. This is the case where {error, already_present} and {error, {already_started, Pid}} from OTP become load-bearing. Expected shape: Result(Actor, Error) with Ok(child) on fresh start AND on {already_started, Pid} (per the convention), and Error(#child_already_registered) for the distinct already_present case (spec registered, process dead — genuinely different from running). Establishes the need for restartChild: and deleteChild: (items 2–3).
  2. Supervisor restartChild: class — wraps supervisor:restart_child/2. Restart a child whose spec is registered but whose process has terminated. Shape: Result(Actor, Error).
  3. Supervisor deleteChild: class — wraps supervisor:delete_child/2. Remove a child's spec from the supervisor (only applicable when the child has terminated and will not be auto-restarted). Shape: Result(Nil, Error) with Ok(nil) when deleted or already absent.
  4. Anonymous supervisor start: supervisor:start_link/2 without {local, Name}. Enables per-tenant / per-request supervision trees where the same Supervisor subclass is instantiated multiple times. Requires a companion ADR on how auto-registration-by-class-name interacts with explicit anonymous starts. Shape: Result(Self, Error).
  5. Supervisor getChildspec: class — wraps supervisor:get_childspec/2. Inspection of a single child's spec. Low priority; niche.

None of these require changes to the methods migrated in this ADR — the idempotent-startup convention is the load-bearing guarantee that makes them additive.
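To check that the convention stretches to item 1, the sketch below classifies the start_child/2 returns of a non-simple supervisor the way that item describes. This is speculative by construction: no such runtime function exists yet, and the module name, function name, and error tuples are assumptions for illustration.

```erlang
-module(future_start_child).
-export([classify_start_child/1]).

%% Sketch only: a possible mapping for a future Supervisor startChild:.

%% Target state "child is running" holds: Ok.
classify_start_child({ok, Child}) ->
    {ok, Child};
classify_start_child({ok, Child, _Info}) ->
    {ok, Child};
classify_start_child({error, {already_started, Child}}) ->
    {ok, Child};
%% Spec registered but process dead: a different state, so Error.
classify_start_child({error, already_present}) ->
    {error, child_already_registered};
classify_start_child({error, Reason}) ->
    {error, {child_start_failed, Reason}}.
```

The already_present clause is the only place where "already there" does not mean Ok, because the child is registered but not running.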

Separately:

Implementation Tracking

Epic: BT-1993, Migrate Supervisor Lifecycle to Result (ADR 0080)

Status: Complete. All phases landed across BT-1994..BT-2001.

| # | Issue | Phase | Summary |
|---|---|---|---|
| 1 | BT-1994 | 0 | Probe FFI coercion vs beamtalk_supervisor_new post-dispatch hook; selected Option 2 (hook extension) |
| 2 | BT-1995 | 0 | Probe Result(C, Error) typechecker substitution for DynamicSupervisor(C); substitution works unchanged |
| 3 | BT-1996 | 1 | Runtime: migrate startLink/1 to Result-shaped return |
| 4 | BT-1997 | 1 | Runtime: migrate startChild/1,2 and with_live_supervisor/3 to Result-shaped returns |
| 5 | BT-1998 | 1 | Runtime: migrate terminateChild/2 (both arities) with idempotent not_found on static path |
| 6 | BT-1999 | 2 | Stdlib: update Supervisor.bt / DynamicSupervisor.bt return types and migrate BUnit tests |
| 7 | BT-2000 | 3 | E2E + examples: migrate supervisor btscripts and examples/otp-tree to Result-shaped API |
| 8 | BT-2001 | 3 | Docs: language-features supervision chapter + CHANGELOG + this Implementation Tracking section |

Critical path: BT-1994 → BT-1996 → BT-1997 → BT-1998 → BT-1999 → BT-2000 → BT-2001. Parallelizable: BT-1995 alongside BT-1994 (independent probes). Runtime migrations BT-1996..BT-1998 landed sequentially but touch disjoint functions in beamtalk_supervisor.erl so later reordering would also have been safe.

Divergence from proposed Phase 2 signature: the proposal above (§Phase 1, §Phase 2) specifies Supervisor class>>supervise -> Result(Supervisor, Error) for the static path. Phase 2 shipped with -> Result(Self, Error) instead, unifying the static and dynamic paths and matching the ADR 0079 precedent for Actor registry operations (spawnAs:, named:). This means AppSupervisor supervise type-narrows to Result(AppSupervisor, Error) at the call site via the standard Self → concrete-subclass substitution. See stdlib/src/Supervisor.bt:90 for the shipped signature.

References