Releases: saforem2/ezpz

v0.12.7

04 May 16:05

4 May 2026

  • Merge pull request #129 from saforem2/yeet-refactor 426a204
  • docs(yeet): collapse uv-Python warning, hoist Complete workflow up, fill in missing venv-create step 6898630
  • docs(yeet): expand uv-Python workaround with HPC module + standalone options 386e578
  • fix(docs/yeet): drop fictional 'uv venv --copies' flag 9c1c8d8
  • docs(yeet): warn about uv-managed Python; track proper fix in TODO §17 9127828
  • chore(.gitignore): exclude .venv.tar.gz 1b9d6cd
  • fix(tar-env): land tarball next to the venv, return absolute path 7049dc8
  • docs(yeet): correct the misleading 'incremental syncs' section 1c3347f
  • docs(yeet): drop the synthetic 4-node example output block 6851e7b
  • docs(yeet): advocate the tar-env + yeet pair as the scaling default 49cd14d
  • docs(yeet): update scaling figures to match the canonical pair 9030f92
  • docs: escape brackets in admonition titles and tab labels 05cfa1d
  • docs(zensical): tidy CLI nav order, drop noisy validation warning f0ddd4a
  • docs(cli/index): restructure command list with hierarchy + footnote 91b19bf
  • docs(nav): actually add ezpz tar-env to zensical sidebar 48941b6
  • docs(nav): add ezpz tar-env to zensical sidebar 75c5195
  • fix(tar-env): keep -v (verbose) flag - print files as they're added 22f2dd0
  • fix(tar-env): actually gzip the .tar.gz output 49d747f
  • feat(yeet,kill): use ezpz.get_logger for timestamped output fc76fd1
  • feat(yeet): hint when a same-named .tar.gz exists nearby 168cbef
  • fix(yeet): probe rsync for --info=progress2 support, fall back on macOS 76b0f6c
  • fix(yeet,kill): always pass list (not None) to run() from Click 7795511
  • docs(reference): replace hand-coded ANSI HTML with plain code fences 848e03c
  • docs(test): refresh stale logger paths in cli/test.md c62a0e8
  • docs: address audit findings (#2, #4, #5, #6) 89d8793
  • docs(cli): promote yeet, kill, tar-env out of the experimental block 0a19a67
  • docs(yeet): switch scaling plots to linear axes f4eef10
  • docs(yeet): match the original split-chart style fd98514
  • docs(yeet): restore scaling guidelines on the split SVGs 9ea0a69
  • docs(yeet): split scaling chart into two separate SVGs 7b673ae
  • docs(yeet): use the scaling SVG from the companion blog post 4c8be0b
  • docs(yeet): linear y-axis on scaling plot d43a114
  • docs(yeet): polish - better lead, translucent palette, single combined SVG 8399675
  • docs(yeet): color greedy fan-out diagram by generation 8a43b61
  • chore(todo): add §15 (ZeRO-1 in wrap_model) and §16 (explicit DeepSpeed wrapper) e0a1f46
  • test(yeet,kill): strengthen coverage from review e1a29a8
  • docs(kill): capitalize Python (proper noun) db9db34
  • fix(yeet): generic-source footer counts only successful nodes 315200d
  • fix(kill): correct success accounting + drop StrictHostKeyChecking 01bd7c6
  • docs(fsdp): make memory-block widths proportional to actual GB cfac686
  • docs(yeet): add first-step latency column to Aurora scaling table 0275cb1
  • docs(yeet): add Aurora 8→4096 node scaling results b44479d
  • docs(kill): document ezpz kill command e8fe05d
  • feat(kill): add ezpz kill for cleaning up stuck distributed jobs 04cd14f
  • docs(yeet): rename yeet-env.md → yeet.md, document positional + generic source b1af0df
  • feat(yeet): support arbitrary directory sources, not just venvs 9e6117b
  • feat(yeet): accept positional SRC argument f085179
  • refactor(yeet): rename yeet-env → yeet, keep deprecated alias d55c38a
  • fix(utils.sh): use log_message in ezpz_save_pbs_env "to calculate" block dbaeaed
  • chore(todo): add §14 - bin/utils.sh cleanup items 5ddc52f

v0.12.6

01 May 13:59

1 May 2026

  • fix(examples/vit): Make the loss actually decrease (#128) 12ffcaa

v0.12.5

30 Apr 12:06

30 April 2026

  • feat(flops): Add MFU tracking module with peak FLOPS database (#127) 7cf33d7

v0.12.4

25 Apr 22:32

25 April 2026

  • refactor: Remove IPEX, rewrite yeet-env, improve benchmarks and docs (#126) e7a044f

v0.12.3

18 Apr 15:23

18 April 2026

  • fix(docs): Use fontawesome-brands-github icon in example tables (#125) bc3ba6f
  • style(docs): Prevent table column text wrapping 3a739a4

v0.12.2

14 Apr 21:34

14 April 2026

  • fix(launch): Remove output filtering, add --line-buffer to mpiexec (#124) a57fbb1

v0.12.1

14 Apr 19:06

14 April 2026

  • docs(guide): Add HF custom loop example, rename Trainer section (#123) 0755cdb
  • Revert "docs(guide): Add HF custom loop example, rename Trainer section" 892d8ca
  • docs(guide): Add HF custom loop example, rename Trainer section d99c079

v0.12.0

14 Apr 03:57

14 April 2026

  • Merge pull request #122 from saforem2/dev 9a5fab9
  • chore: Move bench_trackers.sh into scripts/ bfa95f6
  • docs(guide): Use prefixed keys, logger, and finalize in DDP example 8d2b645
  • fix(docs): Clarify that RNG seeding is opt-in via seed= parameter 21f9d11
  • docs(TODO): Add tracker follow-up items 74c2b61
  • fix(tracker): Improve MLflow error messages, suppress per-step 403 spam ff5aa15
  • fix(tracker): Fix MLflow auth patch and experiment naming 59f636b
  • revert(tracker): Restore MLFLOW_TRACKING_TOKEN guard for system metrics 413b59a
  • docs(history): Document grouped finalize() output and return type 45f5505
  • fix(history): Log grouped datasets clearly from finalize() 7b42891
  • fix(dist): Work around xccl split_group regression in PyTorch nightly a5fd044
  • refactor(history): Save per-group datasets without NaN padding a74d0d2
  • fix(history): Include group prefix in plot titles and filenames ca816f0
  • refactor(history): Group metrics by prefix for independent plot axes 30824c5
  • fix(tracker): Enable MLflow system metrics unconditionally d103a43
  • docs(quickstart): Update Next Steps with new pages and labels ef485f5
  • docs(index): Update overview and features for new functionality 1d1200b
  • fix(dist): Use explicit empty-string check for env var lookups b59d981
  • fix(docs): Add missing nullcontext import in gradient accumulation recipe d6e926c
  • fix(log): Reset time/date styles to empty defaults e39de96
  • fix(docs): Remove broken anchor link to deleted reference.md section d6e4fa1
  • docs: Merge tracker guide into experiment tracking page 75c9ec3
  • docs(recipes): Add data loading, checkpointing, and gradient accumulation a57e2b6
  • docs(quickstart): Use FSDP default, update cross-links 34aa13b
  • docs: Rename "Complete Example" to "End-to-End Walkthrough" 8d31c46
  • docs(config): Add common configurations section and distributed training guide 770b48b
  • docs(architecture): Simplify dist.py shim description, add guide link 43aacbe
  • docs(troubleshooting): Add distributed hang and FSDP error sections 0b84eb8
  • style(docs): Update font families and add mermaid diagram styles 3b8521e
  • fix(docs): Correct ccl→xccl backend names, fix bracket syntax 90b85be
  • chore(log): Remove commented-out style lines f08c5e8
  • feat(tracker): Support EZPZ_TRACKERS shorthand env var alias ae7823a
  • fix(submit): Default --launch to on, add --no-launch to opt out dc70989
  • fix(init): Add semver-safe get_torch_version_tuple() and use it f6248e8
  • fix(log): Close single-bracket even when no prefix components render d6a4812
  • fix(history): Fix and/or precedence in _tracker_got_config check fec5fa5
  • feat(benchmark): Add --run alias with comma-separated example names a60f602
  • feat(report): Add MLflow links, fix table column alignment f819015
  • docs(quickstart): Add linenums to code blocks and fix line wrapping 207bbb0
  • fix(history): Fall back to EZPZ_TRACKER_BACKEND env var and fix formatting 37938c2
  • style(log): Adjust day/time and repr.colon styles for readability e0202c4
  • style(tracker): Color mlflow label bright red in stderr output 9999e03
  • fix(pbs): Cache qstat results to avoid redundant calls during launch c91a078
  • fix(dist): Respect pre-set MASTER_ADDR/MASTER_PORT in DDP setup 6251373
  • fix(tests): Scrub stale distributed env vars in FSDP-TP launch test a20b9b5
  • fix(tests): Skip MPI tests on non-HPC and prevent hangs 2a09676
  • fix(pbs): Retry qstat on transient PBS server errors ac44ff9
  • fix(pbs): Stop sh import error from polluting test output d8b271e
  • fix: Address review findings from PR #122 f27156b
  • fix(log): Swap day_color and time_color styles 5ffd71b
  • fix(tracker): Improve MLflow dotenv error handling 776977c
  • feat(dist): Add setup_mlflow() convenience function 1191120
  • fix(log): Improve log timestamp visibility d13774d
  • docs: Fix README default, consolidate backend docs, rename Reference cf2a440
  • fix(tests): Accept AssertionError in PBS nodefile path test 773a75f
  • docs(recipes): Add MLflow tracking recipe 7231d16
  • docs(tracker): Document MLflow as built-in backend with full setup guide a069108
  • feat(dist): Add --fsdp-sharding-strategy CLI arg and reshard_after_forward a8fa27f
  • fix(dist): Skip ModuleList/ModuleDict in _wrap_fsdp2 569b164
  • docs: Add timing comparison table stub to tracker docs 89623a6
  • chore: Add tracker backend benchmark script fc94e85
  • feat(dist): Make FSDP2 (fully_shard) the default in wrap_model f8f4c38
  • test(log): Add tests for log config env vars and prefix styles f1e7aae
  • feat(tra...

v0.11.3

29 Mar 21:38

29 March 2026

  • chore(nav): Restructure nav, add Recipes, promote FAQ to Guide 1935686
  • test(recipes): Add tests for docs recipe code snippets 58fe520
  • docs(history): Add metric tracking guide for History class 536a5bc
  • docs(quickstart): Add uv-run and ezpz-test verification sections d33f03d
  • docs(faq): Add general FAQ section and fix HTML tag 8083783
  • docs(index): Streamline homepage with better examples and try-it-out section 8306281
  • fix(docs): Update CLI references from ezpz-launch to ezpz launch 601a81b
  • docs(recipes): Add Polaris output tabs from 2-node run 8f7cd5f
  • chore: Update scripts/capture_recipe_outputs.sh 30673c0
  • docs(recipes): Add tabsets with runnable code + output f2456b5
  • chore: Add new filters for Aurora aa22a39
  • chore: Add new filters for Aurora fe1e408
  • chore: Update arguments in examples/diffusion.py 211fcbb
  • chore: Update scripts/* 330e269
  • chore: Add new filters for Aurora 3acb752
  • chore: Update arguments in examples/diffusion.py ea06637
  • chore: Update scripts/* 397a30f
  • chore: Update scripts/run_benchmarks.py 5354443
  • chore: Update scripts/run_benchmarks.py c50a6a5
  • fix(examples/hf): handle removed overwrite_output_dir attribute 2f852d5
  • fix(scripts): remove deprecated --include-tokens-per-second flag db99de2
  • docs: remove duplicate guide.md and update nav a008e4f
  • docs(faq): trim verbose MPI and launcher output 85e8ea2
  • docs(examples): add example picker table and intro paragraph 52d3207
  • docs(cli): reorganize command listing and rewrite launch page 906063a
  • docs(architecture): fix xccl backend name and add wrapping strategy table b73c71b
  • fix(history): add detach() before numpy conversion 59782fa
  • chore: Align tables in scripts/generate_report.py ee18997
  • chore: Update scripts/run_benchmarks.sh ed0e463
  • docs(reference): annotate SequentialLinearNet import faaef4b
  • docs(index): add target audience line e01b538
  • docs(examples): deepen import and preset annotations 0a475e9
  • docs(includes): collapse MPI noise and remove orphaned files 15a6ea3
  • docs: slim reference.md and update cross-links 9086a30
  • docs(quickstart): migrate shell env, launcher examples, and API cheat sheet 59cc228
  • fix(docs): use sequential numbered lists in reference and configuration 370d576
  • fix: Fix missing import in distributed.py d9dc204
  • feat(scripts): add benchmark runner and report generator b0e7f27
  • refactor(examples): initialise wandb earlier for full console capture 3c0532c
  • fix(log): forward redirect param in get_console to fix wandb log capture 7b6d437
  • docs: use tags in summary elements for inline code rendering 0d4c4bc
  • docs: add markdown attr to walkthrough details for inline code rendering e43fe11
  • docs: collapse Code Walkthrough subsections by default 84988df
  • docs: rewrite walkthroughs to cover full source files top-to-bottom 05cf563
  • docs: move collapsed Source sections to top of example pages 640e12c
  • docs: deprecate minimal example, add collapsed source to all examples 8e24879
  • fix(distributed): fall back to DDP when FSDP is unsupported on CPU/MPS f828d3c
  • fix(tests): update slurm test expectations for --gpus-per-node flag 11e5ec3
  • docs: remove What to Expect sections, use direct source code b5ee90b
  • docs: add diagrams, fix quickstart prose, reorder example nav 1ab7545
  • docs: add walkthroughs and expected output to example pages ebabc5c
  • docs: add HF causal LM example page with code walkthrough 64450f0
  • docs: add minimal example page with code walkthrough 51e35f7
  • docs: restructure nav, update README, and polish site ca2787e
  • docs: add key API callouts to example pages 94bae50
  • docs: add architecture, guide, troubleshooting, and perlmutter pages 6e54cfe
  • fix(utils.sh): load required modules for Perlmutter conda setup 3234016
  • fix(distributed): zero-pad node index in print_dist_setup for alignment e39fb85
  • fix(slurm): skip (null) nodelist entries and add --gpus-per-node to srun 94bf48f
  • refactor(history): clean up log_metrics output 4640b5e
  • fix(history): use correct path in report log message d0f427e
  • Merge branch 'wip' of https://github.com/saforem2/ezpz into wip 78a2ed2
  • Merge branch 'wip' of https://github.com/saforem2/ezpz into wip b219b3f
  • Merge branch 'wip' of https://github.com/saforem2/ezpz into wip d6857cc
  • test(distributed,launch): fix tests that fail inside MPI jobs [28154e7](281...

v0.11.2

01 Mar 22:02

1 March 2026

  • Merge pull request #119 from saforem2/dev d260a27
  • docs: Update docs 76af046
  • chore: Update examples/hf_trainer.py 5624cb2
  • docs: Update docs bd932cb
  • chore: Update pyproject.toml 06a3858
  • chore: Update zensical.toml 4e1fdd2
  • docs: Update includes/* a4b3bc8
  • chore: Update src/ezpz/__init__.py 83de93a
  • chore: Update src/ezpz/cli/flags.py 0f668bd
  • chore: Update src/ezpz/launch.py bc0af25
  • chore: Update src/ezpz/pbs.py 1d0e214
  • chore: Update src/ezpz/test.py 576820b
  • chore: Update tests/ 40dfbb5
  • chore: Update ezpz/examples/* de2a724
  • chore: Update ezpz/examples/cria.py b47d6ea
  • chore: catch empty train history gracefully 63e2c51
  • chore: Update timings in examples/vit.py 4379632
  • chore: Group timings in examples/*.py d39cf41
  • chore: add @ezpz.timeitlogit decorators to examples/*.py d51941f
  • feat: Update examples/hf.py 30e6116
  • chore: Update examples/test.py, cli/flags.py 09c0848
  • chore: Update src/ezpz/dist.py a4438a3
  • chore: Track branch in wandb.run.config 7a5023b
  • feat: Add timings to examples/*.py 847b12c
  • chore: Update src/ezpz/configs.py 77abd53
  • chore: Update src/ezpz/dist.py 9aaf67c
  • chore: Update JSON logger in ezpz/launch.py f4f451b
  • feat: Unified, consistent directory names in examples/*.py 5465be2
  • feat: Unified, consistent directory names in examples/*.py 7542852
  • Update ezpz/log/formatters.py 4d65ed2
  • chore: Update src/ezpz/dist.py 768ef60
  • chore: Update src/ezpz/dist.py 38e3378
  • chore: Update ezpz/dist.py 19a4b70
  • chore: Update src/ezpz/log/formatters.py 7f3a9dc
  • chore: Update src/ezpz/cli/test_cmd.py 029acef
  • chore: Update src/ezpz/conf/hydra/job_logging/custom.yaml d7a7286
  • chore: Update src/ezpz/configs.py c740b0e
  • chore: Update src/ezpz/launch.py 64b2ce8
  • chore: Update src/ezpz/log/__init__.py 93d7780