
Commit dbd13c4

Document maolan-generate
1 parent dd0eb95 commit dbd13c4

2 files changed

Lines changed: 127 additions & 0 deletions

features.html

Lines changed: 72 additions & 0 deletions
@@ -548,6 +548,78 @@ <h3 class="text-xl font-semibold text-white mb-3">Supported Formats</h3>
 </div>
 </div>
 </section>
+<section class="py-16 bg-slate-900">
+<div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8">
+<div class="grid lg:grid-cols-2 gap-12 items-start">
+<div class="order-2 lg:order-1">
+<div class="bg-slate-950/70 border border-slate-800 rounded-2xl p-6 shadow-2xl">
+<div class="flex flex-wrap gap-3 mb-5">
+<span class="px-3 py-1 rounded-full bg-cyan-400/10 border border-cyan-400/20 text-cyan-300 text-sm font-medium">AI Audio Generation</span>
+<span class="px-3 py-1 rounded-full bg-slate-900 border border-slate-700 text-slate-300 text-sm">maolan-generate</span>
+</div>
+<pre class="overflow-x-auto rounded-2xl border border-cyan-400/10 bg-slate-950 p-5 text-sm leading-7 text-slate-200"><code>cargo run -p maolan-generate --release -- \
+--model happy-new-year \
+--backend vulkan \
+--tags "ambient, cinematic, downtempo" \
+--length 12000 \
+--cfg-scale 1.5 \
+--topk 50 \
+--temperature 1.0 \
+--ode-steps 10 \
+--output output.wav \
+--lyrics "stars drift over the late train home"</code></pre>
+<div class="grid sm:grid-cols-2 gap-4 mt-6">
+<div class="rounded-xl border border-slate-800 bg-slate-900/60 p-4">
+<h3 class="text-white font-semibold mb-2">Backends</h3>
+<p class="text-slate-400 text-sm">CPU and Vulkan execution paths are available in the current Burn-based generate flow.</p>
+</div>
+<div class="rounded-xl border border-slate-800 bg-slate-900/60 p-4">
+<h3 class="text-white font-semibold mb-2">Modes</h3>
+<p class="text-slate-400 text-sm">Prompt or lyrics generation, plus decode-only reconstruction from a saved frames JSON.</p>
+</div>
+</div>
+</div>
+</div>
+<div class="order-1 lg:order-2">
+<h2 class="text-3xl font-bold text-white mb-6">maolan-generate</h2>
+<p class="text-slate-300 text-lg mb-8">Maolan includes a separate generation crate and CLI for HeartMuLa text-to-audio work, with the same runtime pieces used by the desktop app for in-process decode.</p>
+<div class="space-y-6">
+<div>
+<h3 class="text-xl font-semibold text-white mb-3">Current capabilities</h3>
+<ul class="space-y-2 text-slate-300">
+<li class="flex items-center gap-3"><div class="w-2 h-2 bg-cyan-400 rounded-full"></div>Prompt-driven or lyrics-driven audio generation with optional style tags</li>
+<li class="flex items-center gap-3"><div class="w-2 h-2 bg-cyan-400 rounded-full"></div>Model choices: <code>happy-new-year</code> and <code>RL</code></li>
+<li class="flex items-center gap-3"><div class="w-2 h-2 bg-cyan-400 rounded-full"></div>Controls for CFG scale, top-k, temperature, ODE steps, decoder seed, and output length in milliseconds</li>
+<li class="flex items-center gap-3"><div class="w-2 h-2 bg-cyan-400 rounded-full"></div><code>--decode-only</code> mode using <code>--frames-json</code> for saved token-frame output</li>
+<li class="flex items-center gap-3"><div class="w-2 h-2 bg-cyan-400 rounded-full"></div><code>--decode-threads</code> override for decode-only CPU worker count</li>
+<li class="flex items-center gap-3"><div class="w-2 h-2 bg-cyan-400 rounded-full"></div><code>--model-dir</code> override for local Burn exports instead of cache-based lookup</li>
+</ul>
+</div>
+<div>
+<h3 class="text-xl font-semibold text-white mb-3">Expected model assets</h3>
+<div class="grid sm:grid-cols-2 gap-4">
+<div class="bg-slate-950 border border-slate-800 rounded-xl p-5">
+<p class="text-cyan-400 font-semibold mb-2">HeartMuLa repos</p>
+<p class="text-slate-300 text-sm mb-2"><code>maolandaw/HeartMuLa-happy-new-year-burn</code></p>
+<p class="text-slate-300 text-sm mb-2"><code>maolandaw/HeartMuLa-RL-oss-3B-20260123</code></p>
+<p class="text-slate-400 text-sm">Expected files: <code>heartmula.bpk</code>, <code>tokenizer.json</code>, <code>gen_config.json</code>.</p>
+</div>
+<div class="bg-slate-950 border border-slate-800 rounded-xl p-5">
+<p class="text-cyan-400 font-semibold mb-2">HeartCodec repo</p>
+<p class="text-slate-300 text-sm mb-2"><code>maolandaw/HeartCodec-oss-20260123-burn</code></p>
+<p class="text-slate-400 text-sm">Expected file: <code>heartcodec.bpk</code>.</p>
+</div>
+</div>
+</div>
+<div class="bg-slate-950 border border-slate-800 rounded-xl p-5">
+<h3 class="text-xl font-semibold text-white mb-3">Integration path</h3>
+<p class="text-slate-300 text-sm">The CLI is not just a side tool. The desktop application uses the same generate crate and runtime components when launching AI audio generation from the GUI.</p>
+</div>
+</div>
+</div>
+</div>
+</div>
+</section>
 <section class="py-16 bg-slate-900">
 <div class="max-w-4xl mx-auto px-4 sm:px-6 lg:px-8">
 <div class="text-center mb-12">
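
Review note: the section added to features.html lists two `--model` values and two HeartMuLa repos but does not spell out which value selects which repo. The sketch below pairs them by name, which is an assumption rather than something the commit states. The type `ModelChoice`, its methods, and the constant names are hypothetical and are not taken from the maolan-generate crate; only the string values come from the documented text. Treating the `--model` value as case-sensitive is also an assumption.

```rust
/// Hypothetical enum mirroring the two documented `--model` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelChoice {
    HappyNewYear,
    Rl,
}

impl ModelChoice {
    /// Parse the CLI string; returns None for any value the docs do not list.
    /// Case-sensitive matching is an assumption.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "happy-new-year" => Some(Self::HappyNewYear),
            "RL" => Some(Self::Rl),
            _ => None,
        }
    }

    /// HeartMuLa repo listed in the docs; the pairing is inferred from the names.
    pub fn heartmula_repo(self) -> &'static str {
        match self {
            Self::HappyNewYear => "maolandaw/HeartMuLa-happy-new-year-burn",
            Self::Rl => "maolandaw/HeartMuLa-RL-oss-3B-20260123",
        }
    }
}

/// HeartCodec repo and expected file names, copied from the documented text.
pub const HEARTCODEC_REPO: &str = "maolandaw/HeartCodec-oss-20260123-burn";
pub const HEARTMULA_FILES: [&str; 3] = ["heartmula.bpk", "tokenizer.json", "gen_config.json"];
pub const HEARTCODEC_FILES: [&str; 1] = ["heartcodec.bpk"];
```

With this shape, `ModelChoice::parse("RL")` yields the RL variant and any unlisted string yields `None`, so an unsupported model name can be rejected before any asset lookup starts.
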

workflow.html

Lines changed: 55 additions & 0 deletions
@@ -697,6 +697,61 @@ <h3 class="mb-5 text-2xl font-semibold text-cyan-400">
 </div>
 </div>
 
+<div class="glass-card rounded-3xl p-6 sm:p-8">
+<h3 class="mb-5 text-2xl font-semibold text-cyan-400">
+maolan-generate operations
+</h3>
+<div class="space-y-4 text-sm text-slate-300">
+<div class="rounded-2xl border border-slate-800 bg-slate-950/50 p-5">
+<p class="font-semibold text-white">GUI launch path</p>
+<p class="mt-2 text-slate-400">
+The desktop app launches the local
+<code>maolan-generate</code> binary and exchanges progress
+and result data over a socketpair IPC path.
+</p>
+</div>
+<div class="rounded-2xl border border-slate-800 bg-slate-950/50 p-5">
+<p class="font-semibold text-white">Runtime split</p>
+<p class="mt-2 text-slate-400">
+Prompt generation runs in a dedicated HeartMuLa
+token-generation subprocess, then decode continues
+in-process with HeartCodec.
+</p>
+</div>
+<div class="rounded-2xl border border-slate-800 bg-slate-950/50 p-5">
+<p class="font-semibold text-white">Model resolution</p>
+<p class="mt-2 text-slate-400">
+Without <code>--model-dir &lt;path&gt;</code>, model assets
+are resolved through the Hugging Face cache using
+<code>hf-hub</code>.
+</p>
+</div>
+<div class="rounded-2xl border border-slate-800 bg-slate-950/50 p-5">
+<p class="font-semibold text-white">Expected repos and files</p>
+<p class="mt-2 text-slate-400">
+HeartMuLa uses
+<code>maolandaw/HeartMuLa-happy-new-year-burn</code> or
+<code>maolandaw/HeartMuLa-RL-oss-3B-20260123</code> with
+<code>heartmula.bpk</code>, <code>tokenizer.json</code>,
+and <code>gen_config.json</code>. HeartCodec uses
+<code>maolandaw/HeartCodec-oss-20260123-burn</code> with
+<code>heartcodec.bpk</code>.
+</p>
+</div>
+<div class="rounded-2xl border border-slate-800 bg-slate-950/50 p-5">
+<p class="font-semibold text-white">CLI boundaries</p>
+<p class="mt-2 text-slate-400">
+The current CLI supports
+<code>--model &lt;happy-new-year|RL&gt;</code>,
+<code>--length</code> in milliseconds,
+<code>--decode-only</code> with
+<code>--frames-json</code>, and decode-only worker control
+through <code>--decode-threads</code>.
+</p>
+</div>
+</div>
+</div>
+
 <div class="glass-card rounded-3xl p-6 sm:p-8">
 <h3 class="mb-5 text-2xl font-semibold text-cyan-400">
 Pitch-correction cache and render flow
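
Review note: the "Model resolution" card added to workflow.html describes a two-way choice: use the directory passed via `--model-dir`, otherwise fall back to the Hugging Face cache. A minimal sketch of that decision follows, using only the standard library. `AssetSource`, `resolve_source`, and `missing_files` are hypothetical names for illustration and are not the crate's API. The sketch deliberately does not call `hf-hub`; it only records which repo would be looked up. Checking a local directory for the expected files is an assumed convenience, not a documented behaviour.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical description of where model assets would be read from.
#[derive(Debug, PartialEq, Eq)]
pub enum AssetSource {
    /// `--model-dir <path>` was given: read local Burn exports from this directory.
    LocalDir(PathBuf),
    /// No override: the named repo would be resolved through the Hugging Face cache.
    HubCache { repo: String },
}

/// Choose the source from an optional `--model-dir` value and a repo id.
pub fn resolve_source(model_dir: Option<&Path>, repo: &str) -> AssetSource {
    match model_dir {
        Some(dir) => AssetSource::LocalDir(dir.to_path_buf()),
        None => AssetSource::HubCache { repo: repo.to_string() },
    }
}

/// Return the expected file names that are not present as regular files in `dir`.
pub fn missing_files(dir: &Path, expected: &[&str]) -> Vec<String> {
    let mut missing = Vec::new();
    for name in expected {
        if !dir.join(name).is_file() {
            missing.push(name.to_string());
        }
    }
    missing
}
```

A caller could run `missing_files` on a `LocalDir` path with the documented file names (`heartmula.bpk`, `tokenizer.json`, `gen_config.json`, or `heartcodec.bpk`) to report an incomplete export before loading anything.
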
