Selective .tydb reads and deferred back-pass codegen#2930
Open
plajjan wants to merge 25 commits into
Open
Conversation
added 25 commits
June 15, 2026 15:06
Every .tydb access opened its own LMDB environment, ran one transaction, and closed it again, with a per-path mutex keeping LMDB's one-environment-per-process rule satisfied. That makes each read pay an environment open and an mmap, and it leaves no room for a reader that outlives a single call. Cache one read-only environment per path in a process-wide registry instead. Readers claim the cached environment, run their short read transaction on it, and release it; the registry counts active readers. Exclusive users - writers and environment copies - keep the per-path lock and retire the cached environment before opening their own, waiting for active readers to drain, so the process still never holds two environments for one path. In-process readers therefore block during a write exactly as before, while readers in other processes are protected by LMDB's reader table, which every environment participates in. A transient lock-table race, or a map grown by another process, retires the cached environment and retries on a fresh one. The binding adds MDB_NOTLS implicitly, so reader slots follow transactions rather than the transient bound threads they run on.
A module that imports a large dependency decodes the dependency's whole .tydb interface even when it uses a handful of names. The on-disk layout already keys every NameInfo entry individually, but there is no read API for exact lookups, and no stored answers for the solver's module-wide questions: which constructors carry an attribute, which types sit below a constructor, which extensions implement a protocol or extend a type. Store narrow per-module query indexes at write time, derived from the public interface. "public-names", "constructors", and "actors" record name lists in TEnv order, while "con-attr/<name>", "proto-attr/<name>", "descendants/<qname>", "ext-proto/<qname>", and "ext-type/<qname>" map one query key to the public names that answer it. QName keys hash a source-location-free encoding so the same semantic name always maps to the same key. The descendants index covers classes and protocols; readers filter by NameInfo kind. On the read side, add InterfaceDB: a handle whose path is validated once and whose lookups each run a short read transaction on the shared per-path environment. readInterfaceDBNameInfoMaybe decodes exactly one name-info row, and the readInterfaceDB* index readers resolve one index row plus the name-info entries it points at. Nothing uses the new API yet; the .tydb version is bumped for the new keys.
Imported modules live in the env as NModule TEnv trees plus a hashed mirror, so every per-module view is derived from a fully decoded interface. To read interfaces selectively, the env needs a module representation that answers questions through functions instead of exposing a materialized TEnv. Introduce ModuleInfo as that per-module lookup view: recorded imports and docstring, an exact name-lookup function, public names, constructors, actors, attribute owners, descendants, and extension witnesses keyed by protocol and by type. mkModuleInfo backs the view with an in-memory interface TEnv, building lazy maps for the index queries; mkTyFileModuleInfo backs it with keyed InterfaceDB reads, hidden behind pure-looking functions and memoized per handle through memoLookup, so a demanded name or index row is read at most once. Store one ModuleInfo per module in a new map in EnvF, populated alongside the existing modules/hmodules TEnv storage, which stays the active lookup path for now. addMod fills both representations, mapModules rebuilds the ModuleInfo of every converted module, and findModuleInfo/lookupModuleInfo mirror the alias-aware and direct module lookups. While compiling __builtin__ itself the module is its own environment, so its view resolves names through the active hnames index instead of a TEnv copy.
Qualified-name resolution walks the NModule tree through findHMod and lookupHMod, which presumes a fully materialized per-module TEnv for every loaded module. Switch the point lookups over to the ModuleInfo map: tryQName and QName unaliasing resolve one (module, name) pair through moduleLookupHName, and transitiveImports follows each module's recorded import list instead of decoding NModule nodes. Both representations are built from the same interface data, so behavior is unchanged; the old tree lookups remain for their remaining callers until those are converted. One thing the tree provided implicitly was parent packages: importing a.b created an "a" node, so module-path detection could treat "a" as a module prefix. The flat map needs that spelled out, so isMod and ModName unaliasing consult modulePrefix, which checks whether any loaded module name extends the queried path.
A from-import resolves its names by searching the imported module's public TEnv, which forces the whole interface even for a single imported name. Route imports through ModuleInfo instead. importSome looks up exactly the requested names and aliases them; importAll enumerates the recorded public-name list and performs one exact lookup per importable entry, since import * is the one importer that genuinely asks for everything. doImp returns the imported module's ModuleInfo, and a module already loaded in the env restores its transitive import closure from the recorded import list without touching the .tydb. Completion's shallow import fallback follows the same path. Because the lookup functions only expose public names, a selected import of a private name now fails with NoItem at the lookup itself, matching how qualified access to private names already fails.
allDescendants answers "which types sit below this constructor" by scanning every constructor of every transitive import and testing its ancestry, forcing all imported interfaces for one solver query. Read each imported module's descendants index for the queried constructor instead: the writer recorded, for every public class, the class names sitting below each of its ancestors, so the reader resolves one index row per module plus the name-info entries it names. Local definitions keep the ancestry scan over localCons, since they are not covered by any module index. importedModuleInfos centralizes the per-transitive-import ModuleInfo enumeration for this and the following query conversions.
allConAttr and allPConAttr collect the constructors that expose an attribute by testing every imported constructor with hasAttr, an ancestry-aware scan that forces all imported interfaces whenever the solver ranks a selection constraint. Read the per-module attribute indexes instead. The indexes record attributes where they are declared, so declared owners alone would miss imported types that only inherit the attribute. Complete the result with the descendants of every declaring constructor — having the attribute means having a declaring ancestor — using class descendants for allConAttr and protocol descendants for allPConAttr. This preserves the ancestry-aware semantics of the old scan while reading only rows keyed by the queried attribute name. Local definitions keep the hasAttr scan.
allActors filters every imported constructor for actors when the solver searches Identity-protocol candidates, which forces all imported interfaces for a query whose answer is usually tiny. Concatenate each imported module's compact actor record instead, so the cost follows the actor count rather than the module size. allTypes stays a deliberately broad enumeration — it remains for callers that really ask for every imported type — but it reads constructor records through ModuleInfo so it no longer depends on the materialized module TEnvs.
Every type check starts by folding every transitive import's full interface into the type-check env: setupWits turns each imported extension into a closed witness and setupCons registers each imported constructor in the tyids/tyinfos lattice. This preload is the largest forced materialization of imported interfaces and scales with dependency size rather than with what the module being checked actually uses. Drop the preload and merge imported witnesses per query instead. witsByPName and witsByTName concatenate the local witness tables with each imported module's extension-index reads, keyed by the queried protocol or type name. queryQNames tries both the queried and the unaliased name form, since interfaces record references in either, and uniqueWits collapses duplicates the way the insert-time check used to. Solver behavior is otherwise unchanged because findWitness, allBelowProto, and findProtoByAttr already funnel through these two functions. The tyids/tyinfos lattice now covers local definitions only, so constructor and type-variable registration tolerates ancestors that are not registered: addconinfo keeps only known ancestor ids and addvarinfo falls back to an empty entry for an imported bound. The lattice's attribute and subtype sets feed nothing but the debug printer today, so omitting imported ancestors changes no behavior.
A few consumers outside the type checker still reach into the materialized module TEnvs: code generation's provider-object scan, doc output, root-actor eligibility checks, and completion's qualified-name lookup. Convert them to ModuleInfo. classQBinds and completion resolve one exact name; doc output and root-actor checks reconstruct a public TEnv through modulePublicTEnv only for the single module they describe. allClasses enumerates public names through each module's lookup function rather than the raw constructor records, which matters once later passes rewrite imported entries on demand: the lookup function is where converted entries will appear.
Each back pass rewrites every loaded module's interface TEnv through mapModules: Normalizer, Deactorizer, CPS, LambdaLifter, and the protocol converter all walk and rebuild the full NModule tree once per compiled module, no matter how few imported names the module touches. Replace mapModules with convertModules, which wraps each ModuleInfo's lookup function with the pass's conversion: a demanded name is fetched from the underlying view, converted, and memoized, so conversion work follows actual lookups. Passes that fabricate new names from existing entries — the deactorizer's newact definitions, the converter's derived protocol classes — provide a source-name function mapping a generated name back to the entry it derives from, so a lookup of the generated name converts its source entry and picks the result out of the produced bindings. The modules tree is no longer rewritten by the passes, so forceTypeResult stops forcing the hmodules mirror.
Importing a module that is not yet in the env decodes its whole .tydb through readFile, and reusing a fresh cached module in the build graph does the same through readIfaceFromTy. Either way a dependent module pays for the full size of its dependencies before type checking starts. Make both paths install lazy shells instead. doImp opens the imported module's .tydb with openInterfaceDB, reads only the import list and docstring, and registers a ModuleInfo backed by keyed reads; the recursive import-closure walk follows the recorded imports as before. readIfaceFromTy reads the stored module hashes plus the same metadata, and FrontResult carries the shell so the scheduler's accumulated environment registers it directly. Freshly compiled modules keep registering eager TEnv-backed views built from their front results. A missing, corrupt, or version-mismatched .tydb is treated as a cache miss when opening the handle (openInterfaceDBMaybe), matching the old full-read behavior of falling back to recompilation.
The front pass's hashing phase resolves recorded hashes for every external name a module depends on. It did so by loading a complete per-module name-hash map for each referenced dependency — a whole name-hash section decode per dependency, proportional to dependency size rather than to the handful of names actually referenced. Resolve each external dependency with one targeted read instead. resolveDepHashes becomes IO and asks the per-name resolver for exactly the (module, name) pairs in the dependency sets. In-graph providers still answer from the scheduler's in-memory name map; only out-of-graph modules such as __builtin__ and search-path dependencies fall through to keyed .tydb reads. The per-module map loading and its error plumbing disappear.
readHeader decodes every per-name hash row, and it backs all the cheap-looking paths: cached-module discovery, root-stub expectation, test listing, import discovery, and DBP preparation. A fresh cached dependency therefore still costs a whole name-hash section decode on every build. Add readHeaderSummary, which reads the same metadata plus the stored name count but no per-name rows, and switch those paths over. TyTask carries the name count — enough for the DBP threshold and doc-output decisions — instead of the hash list, so cached tasks hold no name hashes at all; a dependent that needs one of a cached provider's hashes falls back to a targeted read (providerIsCachedTy). DBP preparation reads only the roots row and expands its selection closure through exact name-hash and extension-index reads (dbpNameHashClosure), so selection cost follows the seed closure rather than the module size. The verbose hash-delta report falls back to a full map read only when the previous hashes are not already in memory.
Each targeted name-hash read opens the dependency's .tydb, runs one transaction, and closes the environment again. During hashing of a large module the same few builtin hashes are resolved once per local name, which multiplies into tens of thousands of redundant reads. Memoize per-name results in a process-wide map, kept separate from the existing per-module map cache because that cache's entries must stay complete. Cached modules now report empty hash lists, so updateNameHashCache also skips empty results to keep them from shadowing targeted reads behind an apparently-complete empty module map.
Every reader has moved over to ModuleInfo, so the duplicated module storage can go. EnvF drops the NModule TEnv tree and its hashed mirror, and the ModuleInfo map becomes the modules field. addMod reduces to registering a TEnv-backed view, getImps no longer rebuilds a hash mirror after loading imports, and the tree walkers — lookupModule, lookupMod, findHMod, lookupHMod, lookupHModule — are deleted. Completion drops its manual mirror refresh, and the benches force the module map through its lookup functions instead of the old HTEnv, which exercises the same thunks the compiler demands during a real build.
Lock down the selective-read behavior with traced tests, using ACTON_TYDB_TRACE_READS to observe exactly which .tydb operations a build performs against a generated class-heavy dependency. The incremental cases assert that a small importer performs keyed lookups only — exact name-info reads, attribute-index reads, and keyed extension reads — and never decodes a whole section (all, public-names, constructors, name-hash-all). An unused import must not read any names at all, and adding an unrelated name to the dependency must keep the importer fresh while reading only the name it actually uses. The module_lookup compiler project covers imported classes, extension witnesses, and actors end to end through codegen.
Three costs crept in with per-query witness merging, all visible on large-module builds. Witness dedup unaliased both sides of every comparison, on a function the solver calls for every protocol constraint; comparing the recorded names directly matches the old insert-time check, so drop the unaliasing. While compiling __builtin__ itself the env has no imported modules, but importedModuleInfos rebuilt the builtin self-view on every query; return nothing in that case so builtin compilation stays on its local tables. And the dependency index deduplicated per-name user lists with quadratic nub, which dominated .tydb writes for large modules; collect users in sets instead.
Update the dev guide for the ModuleInfo world. imports_and_envs.md describes the module shells, the selective lookup path, the on-demand back-pass conversion, and how solver queries map onto the narrow per-module indexes, including the read-time completion of inherited attribute owners. interface_caches.md documents the new key layout — public-names/constructors/actors and the per-query index rows — along with the InterfaceDB handle, its read API, and the header-summary read used for cache checks. The debugging advice now points at broad enumeration as the thing to look for when a full read shows up during normal import lookup.
DBP prunes a provider's typed module down to the statements owned by the selected names, but reconstructing that module means decoding every stmt/<NNN> row even when the selection covers a handful of names. The typed statements are already stored individually; what is missing is a record of which statements belong to which top-level name. Record that ownership at write time. Each NameHashInfo row gains the indexes of the typed top-level statements that declare its name, computed by walking the module body once: declarations map through their declared names, signatures through their signed names, and assignments through their bound patterns. A module-level stmt-has-not-impl flag is stored alongside, because modules containing NotImplemented hooks pair native extensions with the surrounding module text and can only be regenerated from the whole body. readSelectedModule is the matching reader: given the selected names and their hash rows, it maps the selection to statement indexes and decodes exactly those stmt/<NNN> rows, returning Nothing - and leaving the caller to fall back to a full read - when the cache predates statement ownership, when ownership is missing for a selected name, or when the module carries NotImplemented hooks. The NameHashInfo encoding changes shape, so the .tydb version is bumped.
When a deferred back job finds its selection-sensitive codegen hash stale - the importer started using a new name, or the provider itself changed - prepareDeferredBackJob still read the provider's whole .tydb through readFile to obtain the typed module, then pruned it in memory. For a large provider that one read dominates the rebuild even though the selection is typically a few names. Keep the name-hash rows gathered while expanding the selection closure (dbpNameHashClosure now accumulates the rows instead of just the names) and hand them to readSelectedModule, which decodes exactly the typed statements owned by the selected names. The pruning and codegen pipeline downstream is unchanged: the selected module flows through selectDbpModule and mkEnv as before. When statement ownership is unavailable - an old cache, a missing owner, or a module with NotImplemented hooks - the old full read remains as the fallback. The traced regression test forces DBP on a cached class-heavy dependency and asserts that regenerating its code performs exact name-hash reads plus one selected-statement read, with no whole-section decodes.
interface_caches.md gains the stmt-has-not-impl key, the statement indexes carried by each NameHashInfo row, and a description of how DBP reconstructs a pruned typed module from the selected stmt/<index> records with readFile remaining as the fallback. imports_and_envs.md narrows its full-read inventory accordingly: DBP back-pass execution is now a fallback rather than an unconditional full read.
After rebasing the type-id candidate indexes (name-solver-witness-indexes) together with the selective tydb reads, imported modules are no longer preloaded into the in-memory type-id lattice, so the lattice covers only local definitions. The solver candidate enumeration therefore combines two sources: local candidate classes/protocols/actors come from the lattice (O(1) attribute lookup, preserving the speedup on large local modules), and imported candidates come from the per-module on-disk indexes. allConAttr/ allProtoAttr/allActors keep their imported half factored out as importedConAttr/importedProtoAttr/importedActors so the solver can pair it with the lattice.
A background sampler thread snapshots cumulative compiler progress and RTS memory at a fixed time cadence and writes one machine-readable line per tick, for plotting throughput and memory over long performance runs. It taps the existing generic progress hooks (CompileHooks) in runCompilePlan, so a single wrapper covers parse, type check and the back passes without touching any individual pass. Each line reports elapsed time, parse/type/back counters, the instantaneous type-check rate (the signal that reveals pace cliffs on large modules), RTS memory in use and live data, GC count, and the current activity. Enabled by ACTON_ANALYTICS=<interval_ms>; output goes to stderr or to ACTON_ANALYTICS_FILE. Disabled and zero-overhead otherwise. Memory columns require the process to run with +RTS -T.
readSelectedModule pooled the statement indices of every selected name with Data.List.nub, which is O(n^2). For a large module whose DBP name closure covers most names (juniper_crpd: ~288k statements), that nub ran for many minutes on a single core with no progress output -- the build appeared hung after "Wrote .tydb". Deduplicate with Data.IntSet instead (O(n log n)); toAscList also returns the indices sorted, so it subsumes the following sort. Result is identical, just without the quadratic.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Rebases the selective-tydb-read series onto current main and appends the cleaned-up deferred-back-pass commits. Incremental builds now read only the interface names actually needed from each dependency's .tydb rather than whole interfaces, route imported interfaces through ModuleInfo and on-demand indexes, and use deferred back passes so a changed module regenerates only the code a change actually reaches. A time-sampled build progress diagnostic is included for long compiler runs. The individual commits cover the specifics.