Commit d245abc

docs: update README
1 parent 39706c8 commit d245abc

1 file changed

Lines changed: 40 additions & 24 deletions

README.md

@@ -1,16 +1,32 @@
 # ITSxRust
 
-Fast ITS region extraction in Rust (HMMER-based), designed for long-read amplicon data (ONT / PacBio HiFi) and general FASTA/FASTQ inputs.
+ITS subregion extraction for fungal metabarcoding at long-read scale.
+
+As long-read amplicon sequencing (Oxford Nanopore and PacBio HiFi) becomes routine, extracting ITS subregions (ITS1, 5.8S, ITS2, full ITS) reliably at scale can become a throughput and robustness bottleneck. ITSxRust is a Rust-based ITS extractor that follows the standard approach of locating conserved ribosomal flanks using profile-HMMs (via HMMER), while adding long-read–oriented features for reproducible, high-throughput processing.
 
 ## Features
-- Extract ITS1, ITS2, and/or full ITS region(s)
-- Works with FASTA and FASTQ inputs (optionally gzipped if supported in your build)
-- Produces extracted sequences plus optional boundary/anchor reporting
-- Designed to be fast and reproducible
+- HMMER/profile-HMM–based detection of conserved ribosomal flanks to extract ITS subregions
+- Supports long-read workloads (ONT / HiFi) with built-in parameter presets
+- Optional dereplication to reduce redundant HMMER searches
+- Partial-chain fallback: recover subregions using two-anchor pairs when a full four-anchor chain is unavailable
+- Structured failure diagnostics and QC summaries to help understand why reads were skipped or partially recovered
+- Works with FASTA and FASTQ inputs
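
The "Optional dereplication" bullet in the list above can be illustrated with a minimal sketch. This diff does not show how ITSxRust dereplicates, so everything here is an assumption for illustration only: the names `Dereplicated` and `dereplicate` are hypothetical, and so is the choice of exact, case-insensitive matching. The idea is that identical reads are collapsed so that only unique sequences need to be searched, and the per-read index lets results be mapped back afterwards.

```rust
use std::collections::HashMap;

/// Hypothetical result type: the unique sequences, plus for every input
/// read the index of the unique sequence it collapsed into.
pub struct Dereplicated {
    pub uniques: Vec<String>,
    pub read_to_unique: Vec<usize>,
}

/// Collapse identical sequences (assumption: exact match after
/// upper-casing) so each distinct sequence is searched only once.
pub fn dereplicate(reads: &[&str]) -> Dereplicated {
    let mut index_of: HashMap<String, usize> = HashMap::new();
    let mut uniques: Vec<String> = Vec::new();
    let mut read_to_unique: Vec<usize> = Vec::with_capacity(reads.len());

    for read in reads {
        let key = read.to_ascii_uppercase();
        // Index this sequence would receive if it has not been seen yet.
        let next = uniques.len();
        let idx = *index_of.entry(key.clone()).or_insert(next);
        if idx == next {
            // First occurrence: record it as a new unique sequence.
            uniques.push(key);
        }
        read_to_unique.push(idx);
    }

    Dereplicated { uniques, read_to_unique }
}
```

Under these assumptions, only `uniques` would be passed to the HMMER search, and `read_to_unique` would be used to copy each hit back to every read that shares the sequence.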
 
 ## Install
 
-### From source (developer install)
+### Prebuilt binaries (recommended)
+Download the appropriate binary for your OS from GitHub Releases:
+
+- GitHub → Releases → `v0.1.0`
+
+Then:
+
+```bash
+chmod +x itsxrust
+./itsxrust --help
+```
+
+### From source
 Requires Rust (stable) and Cargo.
 
 ```bash
@@ -25,39 +41,35 @@ cargo install --path .
 itsxrust --help
 ```
 
-### Planned distribution
-The manuscript version will provide:
-- Bioconda recipe
-- Prebuilt binaries (GitHub Releases)
-- Container images (GHCR)
+### Dependency: HMMER
+ITSxRust coordinates HMMER searches (e.g., `hmmscan`) to locate ribosomal flanks. Ensure HMMER is available in your environment for typical extraction workflows.
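
One way to act on "Ensure HMMER is available" is a small pre-flight check before starting a long run. The sketch below is not taken from ITSxRust: the helper name `tool_available` is hypothetical, and it assumes the tool exits successfully when called with `-h`, which HMMER programs such as `hmmscan` do.

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper: returns true if `program` can be launched from
/// the current PATH and exits successfully when asked for help (`-h`).
pub fn tool_available(program: &str) -> bool {
    Command::new(program)
        .arg("-h")
        // Discard the help text; only the exit status matters here.
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        // Spawning fails (for example, binary not found) -> not available.
        .unwrap_or(false)
}
```

A caller could run `tool_available("hmmscan")` at startup and print an installation hint if it returns `false`, rather than failing partway through an extraction.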
 
 ## Usage
 
-Basic help:
+Help:
 
 ```bash
 itsxrust --help
 itsxrust extract --help
 ```
 
-Example extraction (adjust flags to match your CLI):
+Example extraction:
 
 ```bash
-itsxrust extract \
---input data/example.fastq \
---hmm bench/sim/hmmer/F.hmm \
---region its2 \
---output out_dir/
+itsxrust extract --input reads.fastq.gz --hmm path/to/F.hmm --region its2 --output out_dir/ --hmmer-cpu 8
 ```
 
+Presets (ONT / HiFi) are available via the CLI (see `itsxrust extract --help`).
+
 ## Inputs / Outputs
+
 **Inputs**
-- FASTA / FASTQ
-- HMM model file (HMMER)
+- FASTA / FASTQ (optionally gzipped)
+- HMM model file (profile-HMMs for ribosomal flanks)
 
 **Outputs**
-- FASTA of extracted regions (e.g. ITS1 / ITS2 / full)
-- Optional tables/JSONL with anchors/boundaries (if enabled)
+- FASTA of extracted regions (ITS1 / ITS2 / full)
+- Optional anchor/boundary outputs (TSV/JSONL) and QC summaries (if enabled)
 
 ## Development
 
@@ -69,7 +81,7 @@ cargo clippy --all-targets --all-features -- -D warnings
 cargo test
 ```
 
-Benchmarks & scripts live in `bench/`.
+Benchmarks and simulation scripts live in `bench/`.
 
 ## Project layout
 - `src/` Rust source
@@ -80,8 +92,12 @@ Benchmarks & scripts live in `bench/`.
 
 Large datasets and generated outputs should stay untracked.
 
+## Roadmap
+- Container images (GHCR)
+- Bioconda recipe
+
 ## License
 MIT (see `LICENSE`).
 
 ## Citation
-A `CITATION.cff` will be added for the public release.
+If you use ITSxRust, please cite the repository metadata via GitHub’s “Cite this repository” button (powered by `CITATION.cff`).
