ADR 0073: Package Distribution and Discovery

Status

Accepted (2026-03-31)

Context

Beamtalk's package infrastructure has matured through several ADRs:

Today, Beamtalk packages can be authored, built, and consumed via path and git dependencies. The question is: what comes next for package distribution, and when?

Additionally, the tooling layer (MCP, LSP) needs a way to discover API metadata (classes, methods, documentation) from installed packages — not just from stdlib's bundled corpus.json. As packages are extracted from stdlib (starting with HTTP, per ADR 0072), MCP loses visibility into their APIs unless a discovery mechanism exists.

What Needs Deciding

  1. Distribution strategy — how packages move from monorepo path deps to standalone distributable packages, and the migration path toward a registry
  2. Config schema — how users declare registry sources when the time comes
  3. API metadata for tooling — how MCP/LSP discover class and method information from installed packages
  4. CLI surface — packaging and eventual publish commands

Constraints

Decision

1. Three-Phase Distribution Strategy

Package distribution evolves through three phases, each triggered by actual need rather than speculative infrastructure:

Phase 1: Git Repositories with Tagged Releases (Now)

Packages extracted from the monorepo become standalone git repositories (e.g., packages/http → jamesc/beamtalk-http). Consumers use git deps, which ADR 0070 already supports:

[dependencies]
http = { git = "https://github.com/jamesc/beamtalk-http", tag = "v0.1.0" }

GitHub Releases provide a landing page for each version with release notes. The git tag is what matters for resolution — the lockfile pins the exact commit SHA.

This requires zero new infrastructure. Git deps, lockfiles, and implicit fetch-on-build all work today. The only work is extracting packages/http into its own repo and tagging releases.

What Phase 1 gives us:

What Phase 1 does not give us:

What Phase 1 costs:

Phase 2: Hex-Compatible Registry (When External Users Arrive)

When the ecosystem has external contributors or enough packages that version constraint solving matters, add a hex-compatible static registry. This is a set of static files (protobuf index + tarballs) served from any HTTP endpoint — S3, GitHub Pages, or any static host.

Tarball format — hex-compatible, following the hex tarball specification:

{name}-{version}.tar
  VERSION              # "3" (hex tarball format version)
  metadata.config      # Erlang term file: package metadata
  CHECKSUM             # SHA-256 of contents
  contents.tar.gz
    beamtalk.toml                    # Package manifest
    src/*.bt                         # Beamtalk source files
    native/*.erl                     # Erlang FFI source files (if any)
    native/include/*.hrl             # Erlang headers (if any)

Beamtalk-specific files (.bt sources) live inside contents.tar.gz, which the hex tarball spec treats as opaque. Gleam uses the same approach to ship .gleam source in hex tarballs.
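As a rough illustration of the layout above, the following sketch decides whether a project-relative path belongs in contents.tar.gz. The function name belongs_in_contents is hypothetical, and the sketch reads the patterns literally (one directory level, no recursion), which is an assumption rather than a statement about the eventual tooling.

```rust
/// Hypothetical helper: returns true when a project-relative path
/// (using '/' separators) matches one of the entries listed for
/// contents.tar.gz: beamtalk.toml, src/*.bt, native/*.erl,
/// native/include/*.hrl.
/// Assumption: patterns are matched literally, without recursion.
pub fn belongs_in_contents(rel_path: &str) -> bool {
    if rel_path == "beamtalk.toml" {
        return true;
    }
    // Split into (directory, file name); paths without a directory
    // other than beamtalk.toml are never packaged.
    let (dir, file) = match rel_path.rsplit_once('/') {
        Some(parts) => parts,
        None => return false,
    };
    match dir {
        "src" => file.ends_with(".bt"),
        "native" => file.ends_with(".erl"),
        "native/include" => file.ends_with(".hrl"),
        _ => false,
    }
}
```

Build output such as _build/ never matches, which is consistent with shipping source only.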

The metadata.config follows the hex tarball specification:

{<<"name">>, <<"http">>}.
{<<"version">>, <<"0.1.0">>}.
{<<"description">>, <<"HTTP client and server for Beamtalk">>}.
{<<"app">>, <<"beamtalk_http">>}.
{<<"build_tools">>, [<<"beamtalk">>]}.
{<<"requirements">>, #{
    <<"gun">> => #{
        <<"requirement">> => <<"~> 2.1">>,
        <<"optional">> => false,
        <<"app">> => <<"gun">>
    }
}}.
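A minimal sketch of how a Rust tool might render the terms shown above from in-memory metadata. The struct and function names (PackageMetadata, HexRequirement, render_metadata_config) are hypothetical, and only the fields from the example are covered; the real tarball tooling is deferred to Phase 2 and may look quite different.

```rust
/// Hypothetical in-memory form of one entry in the requirements map.
pub struct HexRequirement {
    pub name: String,
    pub requirement: String,
    pub optional: bool,
    pub app: String,
}

/// Hypothetical subset of package metadata, limited to the example fields.
pub struct PackageMetadata {
    pub name: String,
    pub version: String,
    pub description: String,
    pub app: String,
    pub requirements: Vec<HexRequirement>,
}

/// Wrap a string as an Erlang binary literal, escaping backslashes and quotes.
fn erl_binary(s: &str) -> String {
    let escaped = s.replace('\\', "\\\\").replace('"', "\\\"");
    format!("<<\"{}\">>", escaped)
}

/// Render the metadata as a sequence of top-level Erlang terms.
pub fn render_metadata_config(meta: &PackageMetadata) -> String {
    let mut out = String::new();
    for (key, value) in [
        ("name", &meta.name),
        ("version", &meta.version),
        ("description", &meta.description),
        ("app", &meta.app),
    ] {
        out.push_str(&format!("{{{}, {}}}.\n", erl_binary(key), erl_binary(value)));
    }
    // build_tools is fixed to ["beamtalk"], as in the example above.
    out.push_str(&format!(
        "{{{}, [{}]}}.\n",
        erl_binary("build_tools"),
        erl_binary("beamtalk")
    ));
    let mut entries = Vec::new();
    for r in &meta.requirements {
        let mut e = format!("    {} => #{{\n", erl_binary(&r.name));
        e.push_str(&format!("        {} => {},\n", erl_binary("requirement"), erl_binary(&r.requirement)));
        // Rust's bool prints as true/false, which matches the Erlang atoms.
        e.push_str(&format!("        {} => {},\n", erl_binary("optional"), r.optional));
        e.push_str(&format!("        {} => {}\n", erl_binary("app"), erl_binary(&r.app)));
        e.push_str("    }");
        entries.push(e);
    }
    out.push_str(&format!(
        "{{{}, #{{\n{}\n}}}}.\n",
        erl_binary("requirements"),
        entries.join(",\n")
    ));
    out
}
```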

Compiled .beam files are not included — consumers compile from source (BEAM convention).

Registry structure:

registry/
  public_key                 # Signing verification key
  names                      # protobuf: package name list
  versions                   # protobuf: version metadata
  packages/<name>            # protobuf: per-package release info
  tarballs/<name>-<version>.tar

Consumer config — registry-based dependencies use a version string:

[dependencies]
http = "~> 0.1"

[repos.beamtalk]
url = "https://hex.beamtalk.dev"
public_key_path = "keys/beamtalk-hex.pub"

Immutability policy: Once a version is published, its tarball must not be modified or replaced. To address a bad release, use the retirement mechanism — never replace a published tarball.

Package retirement: beamtalk package retire http 0.1.0 --reason security --message "Use 0.1.1" marks a version as retired without removing the tarball. Retired versions are excluded from resolution but remain downloadable for existing lockfiles.
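The retirement rule can be sketched as a filter over a package's releases. This is only an illustration under stated assumptions: the RegistryRelease type and resolvable_versions function are hypothetical, and "remain downloadable for existing lockfiles" is interpreted here as "a retired version is still allowed when the lockfile already pins it".

```rust
/// Hypothetical view of one release in the registry index.
#[derive(Debug, Clone, PartialEq)]
pub struct RegistryRelease {
    pub version: String,
    pub retired: bool,
}

/// Versions the resolver may consider for one package.
/// Retired releases are skipped, except for the version (if any)
/// that the existing lockfile already pins.
pub fn resolvable_versions<'a>(
    releases: &'a [RegistryRelease],
    locked: Option<&str>,
) -> Vec<&'a str> {
    releases
        .iter()
        .filter(|r| !r.retired || locked == Some(r.version.as_str()))
        .map(|r| r.version.as_str())
        .collect()
}
```

A fresh resolution never selects the retired 0.1.0, while a project whose lockfile already names 0.1.0 keeps building until it updates.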

Tarball and registry tooling — pure Rust. Following Gleam's approach (which builds hex tarballs in ~200 lines of Rust in compiler-cli/src/publish.rs):

beamtalk package build --tarball
# => Created _build/http-0.1.0.tar

beamtalk package registry build registry/ --name=beamtalk --private-key=key.pem

Version constraint solving — PubGrub (Phase 2+). Registry deps will require a constraint solver. Beamtalk will use the PubGrub algorithm via the pubgrub Rust crate — the same algorithm used by Gleam (Rust), Dart/pub, and Elixir/Mix (via hex_solver). Gleam's compiler-core/src/dependency.rs is the reference implementation. Implementation details deferred to the Phase 2 implementation issue.
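For readers unfamiliar with the "~>" operator used in registry requirements, here is a small sketch of its matching rule as used by Hex: "~> 0.1" accepts >= 0.1.0 and < 1.0.0, while "~> 2.1.3" accepts >= 2.1.3 and < 2.2.0. This is not PubGrub and not the planned resolver; the function names are hypothetical and pre-release versions are deliberately ignored.

```rust
/// Parse "MAJOR.MINOR" or "MAJOR.MINOR.PATCH" into numeric components.
/// Returns None for anything else (including pre-release suffixes).
fn parse_version_parts(s: &str) -> Option<Vec<u64>> {
    let parts: Option<Vec<u64>> = s.trim().split('.').map(|p| p.parse().ok()).collect();
    match parts {
        Some(v) if v.len() == 2 || v.len() == 3 => Some(v),
        _ => None,
    }
}

/// Check a full MAJOR.MINOR.PATCH version against a "~>" requirement.
/// Returns None when either input cannot be parsed by this sketch.
pub fn matches_pessimistic(requirement: &str, version: &str) -> Option<bool> {
    let req = requirement.trim().strip_prefix("~>")?;
    let lower = parse_version_parts(req)?;
    let v = parse_version_parts(version)?;
    if v.len() != 3 {
        return None;
    }
    // Two components: the next major version is the exclusive upper bound.
    // Three components: the next minor version is the exclusive upper bound.
    let (lo, hi) = if lower.len() == 2 {
        ([lower[0], lower[1], 0], [lower[0] + 1, 0, 0])
    } else {
        ([lower[0], lower[1], lower[2]], [lower[0], lower[1] + 1, 0])
    };
    let v = [v[0], v[1], v[2]];
    // Arrays compare lexicographically, which is the ordering needed here.
    Some(v >= lo && v < hi)
}
```

A solver such as PubGrub works on top of range checks like this one, adding backtracking and conflict explanation across the whole dependency graph.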

Native dep version conflicts. [native.dependencies] are resolved by rebar3, which uses depth-first "first wins" — not PubGrub. If two Beamtalk packages declare conflicting hex dep versions (e.g., gun = "~> 2.1" vs gun = "~> 2.0"), rebar3 may silently pick one. The Beamtalk resolver has no visibility into rebar3's choices. Mitigation: beamtalk build should compare declared native dep versions across packages and warn on potential conflicts before invoking rebar3.
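The mitigation could start as a simple pre-flight check, sketched below. The NativeDecl type and native_dep_warnings function are hypothetical. The check compares requirement strings textually, so it is conservative: two different strings may still overlap, which is why the result is a warning and not an error.

```rust
use std::collections::BTreeMap;

/// Hypothetical record of one [native.dependencies] entry,
/// tagged with the Beamtalk package that declared it.
pub struct NativeDecl {
    pub package: String,
    pub dep: String,
    pub requirement: String,
}

/// Return one warning per native dep that is declared with more than
/// one distinct requirement string across all packages.
pub fn native_dep_warnings(decls: &[NativeDecl]) -> Vec<String> {
    // dep name -> requirement string -> packages declaring it
    let mut by_dep: BTreeMap<&str, BTreeMap<&str, Vec<&str>>> = BTreeMap::new();
    for d in decls {
        by_dep
            .entry(d.dep.as_str())
            .or_default()
            .entry(d.requirement.as_str())
            .or_default()
            .push(d.package.as_str());
    }
    let mut warnings = Vec::new();
    for (dep, reqs) in &by_dep {
        if reqs.len() > 1 {
            let detail: Vec<String> = reqs
                .iter()
                .map(|(req, pkgs)| format!("\"{}\" (from {})", req, pkgs.join(", ")))
                .collect();
            warnings.push(format!(
                "native dep {} has differing requirements: {}",
                dep,
                detail.join(" vs ")
            ));
        }
    }
    warnings
}
```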

Phase 3: Public hex.pm (When APIs Stabilise)

When package APIs are stable enough to commit to semantic versioning, publish to hex.pm directly. hex.pm has no gatekeeping — Gleam, LFE, Clojerl, and Efene all publish there. Gleam started at v0.2 as a one-person project.

[dependencies]
http = "~> 0.1"              # resolves from hex.pm (default registry)

This transition is not just a URL change. It requires:

The tarball format is the same as Phase 2 — what changes is the resolution source and publishing mechanism.

Why not hex.pm from the start? hex.pm versions are immutable. Once you publish http 0.1.0, that version number is burned forever. With APIs still changing significantly between releases, we'd either burn through version numbers rapidly or publish packages that mislead consumers about stability. Phase 1 (git tags) and Phase 2 (static registry) let us iterate freely. Move to hex.pm once the package APIs are stable enough to commit to semantic versioning.

2. Config Schema

Per-project: beamtalk.toml

The three dependency forms, introduced progressively:

[dependencies]
# Path dep (existing, ADR 0070)
utils = { path = "../my-utils" }

# Git dep (existing, ADR 0070 — Phase 1 distribution)
http = { git = "https://github.com/jamesc/beamtalk-http", tag = "v0.1.0" }

# Registry dep (Phase 2/3 — version string = registry lookup)
http = "~> 0.1"

For native hex deps from a private registry:

[native.dependencies]
cowboy = "~> 2.12"                                          # public hex.pm
beamtalk_http_native = { version = "~> 0.1", repo = "beamtalk" }  # private repo

Registry declaration: [repos]

Private hex repos are declared in a top-level [repos] section (Phase 2+). This section serves both Beamtalk package deps and native hex deps — the registry infrastructure is shared.

[repos.beamtalk]
url = "https://hex.beamtalk.dev"
public_key_path = "keys/beamtalk-hex.pub"      # relative to project root

This generates the corresponding {hex, [{repos, [...]}]} section in the rebar.config for rebar3.

Why [repos] not [native.repos]? The registry serves both Beamtalk packages and native hex deps. Putting it under [native] would imply it's only for Erlang dependencies.

Why per-project, not global config? Projects must be self-contained — cloning a repo must be enough to build it. Global config may be added later for auth tokens only.

Resolution priority

When resolving a Beamtalk package dependency:

  1. Path: { path = "..." } resolves to the local filesystem
  2. Git: { git = "...", tag/branch/rev = "..." } resolves from a git remote
  3. Registry: "~> 1.0" resolves from the configured registry (default: hex.pm)

Path deps take absolute priority (development override). Git deps are pinned by lockfile with commit SHA. Registry deps are pinned by lockfile with version + checksum.
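One way to picture the three forms is as a classification step over an already-parsed manifest value. The types below (DepValue, DepSource) and the function classify_dep are hypothetical and do not describe the existing manifest parser. Checking path before git is this sketch's reading of "path deps take absolute priority".

```rust
use std::collections::BTreeMap;

/// Hypothetical, already-parsed value of one [dependencies] entry.
pub enum DepValue {
    /// Short form, e.g. http = "~> 0.1"
    Version(String),
    /// Table form, e.g. http = { git = "...", tag = "..." }
    Table(BTreeMap<String, String>),
}

/// Where a dependency resolves from.
#[derive(Debug, PartialEq)]
pub enum DepSource {
    Path { path: String },
    Git { url: String, reference: Option<String> },
    Registry { requirement: String },
}

/// Classify a dependency value, checking path first, then git.
/// A bare version string is always a registry lookup.
pub fn classify_dep(value: &DepValue) -> Result<DepSource, String> {
    match value {
        DepValue::Version(req) => Ok(DepSource::Registry { requirement: req.clone() }),
        DepValue::Table(table) => {
            if let Some(path) = table.get("path") {
                Ok(DepSource::Path { path: path.clone() })
            } else if let Some(url) = table.get("git") {
                // First of tag/branch/rev that is present, if any.
                let reference = ["tag", "branch", "rev"]
                    .iter()
                    .find_map(|key| table.get(*key).cloned());
                Ok(DepSource::Git { url: url.clone(), reference })
            } else {
                Err("dependency table needs a `path` or `git` key".to_string())
            }
        }
    }
}
```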

Lockfile format for registry deps

ADR 0070's lockfile uses [[package]] entries with url, reference, and sha fields (designed for git sources). Registry deps (Phase 2+) add a new entry kind:

[[package]]
name = "http"
version = "0.1.0"
source = "registry"
registry = "beamtalk"          # or "hexpm" for public hex.pm
checksum = "sha256:abc123..."

The source field disambiguates from git-source entries.

3. API Metadata for Tooling

Generated on build, not shipped in tarballs

beamtalk build already generates class_corpus.json into _build/dev/ for every compiled package (see build.rs:647-686). Currently this file contains: class name, superclass, methods (selector strings), is_sealed, is_abstract, and a doc field (currently always null). Richer metadata (class kind, method parameters, return types, doc comments) is a future enhancement — the corpus format is extensible.

Since ADR 0070 established implicit fetch-and-compile on build (the Cargo model), dependencies are always compiled before they can be used. The class_corpus.json is generated as a build artifact — there is no window where a dependency is resolved but not compiled.

API metadata is not included in the tarball. The tarball ships source; the corpus is derived data generated during compilation. This avoids two sources of truth and keeps tarballs minimal.

MCP discovery

MCP already discovers corpora from the _build/ tree (implemented in crates/beamtalk-mcp/src/server.rs), including _build/dev/ for the root package and _build/deps/*/ for dependencies (with fallback search paths). After beamtalk build, each package's class_corpus.json is discovered from these existing locations. No new MCP code is needed.
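For orientation only, the discovery rule described above can be sketched with the standard library. This is not the code in crates/beamtalk-mcp/src/server.rs: the function name discover_corpora is hypothetical, the fallback search paths are omitted, and the sketch assumes class_corpus.json sits directly inside _build/dev/ and inside each _build/deps/*/ directory.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Collect class_corpus.json files for the root package and its deps.
/// Assumption: one corpus file directly under _build/dev/ and one
/// directly under each directory in _build/deps/.
pub fn discover_corpora(project_root: &Path) -> Vec<PathBuf> {
    let build = project_root.join("_build");
    let mut found = Vec::new();

    // Root package corpus.
    let root_corpus = build.join("dev").join("class_corpus.json");
    if root_corpus.is_file() {
        found.push(root_corpus);
    }

    // Dependency corpora; a missing deps directory simply yields none.
    if let Ok(entries) = fs::read_dir(build.join("deps")) {
        let mut dep_corpora: Vec<PathBuf> = entries
            .filter_map(|entry| entry.ok())
            .map(|entry| entry.path().join("class_corpus.json"))
            .filter(|candidate| candidate.is_file())
            .collect();
        // Sort so the result does not depend on directory iteration order.
        dep_corpora.sort();
        found.extend(dep_corpora);
    }
    found
}
```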

Visibility: enforcement happens at corpus generation time. The build step that creates class_corpus.json includes only public classes and their public methods, and MCP consumes the generated corpus as-is. When the internal class modifier lands (ADR 0071), internal classes are excluded from the generated class_corpus.json.
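The filtering step can be sketched as follows. All type and field names (ParsedClass, ParsedMethod, CorpusClass, is_internal, is_public) are hypothetical stand-ins for whatever the compiler tracks internally, and the real corpus carries more fields than shown here.

```rust
/// Hypothetical compiler-side view of a method.
pub struct ParsedMethod {
    pub selector: String,
    pub is_public: bool,
}

/// Hypothetical compiler-side view of a class.
pub struct ParsedClass {
    pub name: String,
    pub is_internal: bool,
    pub methods: Vec<ParsedMethod>,
}

/// Reduced corpus entry: class name plus public method selectors.
#[derive(Debug, PartialEq)]
pub struct CorpusClass {
    pub name: String,
    pub methods: Vec<String>,
}

/// Keep only non-internal classes and, within them, only public methods.
pub fn public_corpus(classes: &[ParsedClass]) -> Vec<CorpusClass> {
    classes
        .iter()
        .filter(|class| !class.is_internal)
        .map(|class| CorpusClass {
            name: class.name.clone(),
            methods: class
                .methods
                .iter()
                .filter(|method| method.is_public)
                .map(|method| method.selector.clone())
                .collect(),
        })
        .collect()
}
```

Because the filter runs when the corpus is written, consumers such as MCP need no visibility logic of their own.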

Remote package search (future)

Searching a registry for packages by class name, method selector, or protocol conformance requires server-side metadata indexing. This is not possible with a static hex registry — it would require a custom registry with a search API or a separate search index (analogous to docs.rs for Rust). Deferred until the ecosystem warrants it.

4. CLI Surface

# Phase 1: Build the package (compile + test)
beamtalk package build
# => Builds the package, generates class_corpus.json

# Phase 2: Build a hex-compatible tarball for registry publishing
beamtalk package build --tarball
# => Created _build/http-0.1.0.tar

# Phase 2: Regenerate static registry index
beamtalk package registry build registry/ --name=beamtalk --private-key=key.pem

# Phase 2: Retire/unretire a version
beamtalk package retire http 0.1.0 --reason security --message "Use 0.1.1"
beamtalk package unretire http 0.1.0

# Phase 3: Publish directly to hex.pm
beamtalk package publish

Future: reimplementation in Beamtalk. Once the package ecosystem and HTTP package are stable, package operations could be reimplemented as native Beamtalk expressions using hex_core (pure Erlang) at runtime — e.g., Package publish: "http", Package add: "json". This would enable a fully dynamic workspace where packages can be resolved, fetched, and loaded without leaving the REPL. The Rust CLI remains the baseline; the Beamtalk-native version is a convenience layer on top.

Prior Art

Go

Go modules use git repositories as the distribution mechanism. go.mod declares dependencies with module paths (typically GitHub URLs) and version constraints. No separate registry — the module proxy (proxy.golang.org) caches and indexes git-hosted modules. This validates the "git repos first, registry layer later" approach.

Adopted: Git repositories as the Phase 1 distribution mechanism. Tags as version markers.

Elixir / Hex.pm

The primary inspiration for Phase 2-3. Elixir publishes to hex.pm via mix hex.publish. The hex tarball format is well-specified. Private repos use mix hex.repo add with auth tokens. Hex.pm supports non-Elixir build tools — Gleam already publishes there with build_tools = ["gleam"].

Adopted: Hex tarball format (Phase 2), version constraint syntax, package retirement mechanism.

Gleam

Publishes to hex.pm using the same tarball format with build_tools = ["gleam"]. Critically, Gleam builds hex tarballs in pure Rust (~200 lines in compiler-cli/src/publish.rs) — it does not use hex_core, Mix, or any Erlang tooling for tarball creation. Uses the hexpm Rust crate for hex.pm API calls and prost for protobuf. Gleam started publishing to hex.pm at v0.2 (2019) as a one-person project, years before its 1.0 (2024).

Adopted: Pure Rust tarball creation, PubGrub resolver (pubgrub crate), source-in-tarball convention, hexpm crate for API interactions.

Rust / Cargo

Crates.io is the single registry. Cargo.toml declares dependencies by version constraint. Crates can be yanked (removed from resolution but still downloadable).

Adopted: Implicit fetch on build (ADR 0070). Short-form version strings for registry deps. Yank/retire semantics.

Smalltalk / Monticello

Smalltalk traditionally uses image-based distribution or Monticello/Metacello package specs. Metacello supports repository declarations pointing to HTTP servers.

Rejected: Image-based distribution doesn't fit the BEAM's file-based compilation model.

User Impact

Newcomer

Phase 1: Add a git dep to beamtalk.toml, run beamtalk build, use the classes. Simple but requires knowing the git URL.

Phase 2-3: Registry deps (http = "~> 0.1") are the simplest form — just a name and version string.

Package author

Phase 1: Push to GitHub, tag a release. Zero publishing infrastructure.

Phase 2: beamtalk package build --tarball creates a publishable artifact. Upload to static registry.

Phase 3: beamtalk package publish pushes directly to hex.pm.

Erlang/BEAM developer

The hex tarball format (Phase 2+) is standard. Private repos work with rebar3's existing {hex, [{repos, ...}]} config. Native deps resolve from hex.pm throughout all phases.

Tool author (MCP/LSP)

class_corpus.json is generated on build for every package, and MCP discovers it automatically from the _build/ tree (_build/dev/ for root, _build/deps/*/ for dependencies). No changes needed across any phase.

Steelman Analysis

"Just use hex.pm from the start"

hex.pm has no gatekeeping — Gleam started at v0.2 as a one-person project. Early publishing forces you to get versioning right and is what grew Gleam's ecosystem from zero to 1,300+ packages. LFE, Clojerl, and Efene all publish there too.

However: hex.pm versions are immutable. Once you publish http 0.1.0, that version number is burned forever. With APIs still changing significantly, we'd burn through version numbers or mislead consumers about stability. Git tags and a static registry let us iterate freely first.

"Build a custom registry, not hex-compatible"

A Beamtalk-specific registry could carry richer metadata natively (class kinds, protocol conformance, searchable method signatures) and implement features like class-level search.

However: hex compatibility is required for native deps anyway (rebar3 must resolve from the same registry), and the hex tarball format is extensible enough to carry Beamtalk-specific files inside contents.tar.gz.

"Skip the git phase — build the static registry now"

Building the registry infrastructure now would dogfood the full publish/resolve/fetch/compile cycle and exercise the hex integration.

However: git deps already provide that cycle. beamtalk build fetches git deps, compiles them, and puts them on the code path, which is the same pipeline a registry dep would use. The only thing a static registry adds over git deps is version constraint solving, which isn't needed with fewer than 10 packages and one consumer (the main project). Build the registry when there's actual demand: external contributors, or enough packages that manual version coordination breaks down.

"Defer the whole thing — keep packages in the monorepo"

Path deps work. No extraction needed. Simpler build, simpler CI, single repo to manage.

However: path deps never test the real distribution pipeline. A package that works as a path dep in the monorepo may fail when consumed as a git dep (missing files, wrong paths, undeclared dependencies). Extracting to git repos forces the package to be self-contained and tests the consumer experience. The cost is low — it's just a new repo with a beamtalk.toml.

Alternatives Considered

Alternative A: Separate Beamtalk Package Registry

A custom registry server with Beamtalk-specific API for richer metadata queries.

Rejected because: it splits the ecosystem (native deps on hex.pm, Beamtalk packages elsewhere), requires building and maintaining registry infrastructure, and doesn't leverage existing hex tooling.

Alternative B: Static Registry from Day One

Skip git deps, go straight to a hex-compatible static registry with tarball publishing.

Rejected because: it front-loads ~600 lines of Rust tooling (tarball creation + registry generation + signing), registry hosting infrastructure, and a signing keypair — all for ~1 package with one consumer. Git deps provide the same distribution capability with zero new code. Build the registry when the ecosystem needs version constraint solving.

Alternative E: Publish from Monorepo (Cargo Workspace Model)

Keep packages in the monorepo (packages/http/) and add beamtalk package build --tarball to publish tarballs from subdirectories. No repo extraction needed. Cargo workspaces use this model — develop everything in one repo, publish individual crates to crates.io.

Rejected for Phase 1 because: the monorepo already consumes packages/http as a path dep, which doesn't test the consumer experience. Publishing a tarball from the monorepo would test the tarball pipeline but requires building the Phase 2 tarball tooling, the same ~200 LOC of Rust we're deferring. Extracting to a standalone repo tests self-containedness (missing files, undeclared dependencies) without any new tooling. May revisit for Phase 2: building tarballs from a monorepo workspace and publishing from standalone repos are not mutually exclusive.

Alternative C: Global Config for Registry URLs

Store registry URLs in ~/.config/beamtalk/hex_repos.toml instead of per-project.

Rejected as primary mechanism because projects must be self-contained. Global config may be added later for auth tokens only.

Alternative D: BEAM Module Attributes Instead of class_corpus.json

Use __beamtalk_meta/0 function exports as the sole source of API metadata. MCP would call these on loaded modules dynamically.

Rejected because: MCP runs as a Rust process reading JSON files from _build/, not connected to a running BEAM. class_corpus.json is the build-time materialisation of module metadata into a format MCP can read statically.

Consequences

Positive

Negative

Neutral

Implementation

Phase 1: Git Distribution (Now)

  1. Extract packages/http → jamesc/beamtalk-http as a standalone repo
  2. Set up CI in the new repo (fetch Beamtalk compiler, run BUnit + EUnit tests)
  3. Migrate tests that exercise HTTP classes to the new repo
  4. Tag releases (v0.1.0, etc.)
  5. Update consuming projects to use git deps
  6. Verify full fetch/compile/test cycle works from a clean checkout
  7. Add cross-repo CI: main repo periodically builds against latest tagged HTTP package

Phase 2: Hex-Compatible Registry (When Needed)

  1. beamtalk package build --tarball — pure Rust tarball creation (reference: Gleam's publish.rs)
  2. beamtalk package registry build — pure Rust static registry generation (prost, rsa/ring)
  3. [repos] section in manifest parser
  4. { version = "...", repo = "..." } table form for [native.dependencies]
  5. Generated rebar.config includes {hex, [{repos, ...}]} when private repos are configured
  6. Version constraint solver — pubgrub::DependencyProvider (reference: Gleam's dependency.rs)
  7. Lockfile extension with source = "registry" entries
  8. beamtalk package retire / beamtalk package unretire
  9. Set up static registry hosting

Phase 3: hex.pm (When APIs Stabilise)

  1. beamtalk package publish with hex.pm API integration (via hexpm Rust crate)
  2. Registry deps in [dependencies] resolve from hex.pm by default

Future: Beamtalk-Native Package Operations

  1. Reimplement package operations as Beamtalk expressions using hex_core (pure Erlang)
  2. Package publish: "http" — publish from the REPL
  3. Package add: "json" — dynamically resolve, fetch, compile, and load without editing beamtalk.toml

Affected Components

| Component | Phase | Changes |
| --- | --- | --- |
| packages/http | 1 | Extract to jamesc/beamtalk-http |
| crates/beamtalk-cli/src/commands/manifest.rs | 2 | Parse [repos], table-form native deps with repo field |
| crates/beamtalk-cli/src/commands/build.rs | 2 | Generate {hex, [{repos, ...}]} in rebar.config |
| crates/beamtalk-cli/src/commands/package.rs (new) | 2 | package build --tarball, package registry build, package retire |
| crates/beamtalk-cli/src/commands/deps/ | 2 | PubGrub-based resolution (pubgrub crate), lockfile extension |
| crates/beamtalk-mcp/src/server.rs | - | Already handles dependency corpora; no changes needed |

Migration Path

Each phase adds new capabilities. The tooling is backwards-compatible, but consumers of first-party packages must update their beamtalk.toml when packages migrate between distribution phases:

Implementation Tracking

Epic: BT-1739
Issues: BT-1738 (Justfile/--app), BT-1740 (extract HTTP repo), BT-1741 (monorepo git dep), BT-1742 (cross-repo CI)
Status: Planned

References