v1.0.0.

@kreeedit kreeedit released this 04 Jul 10:08

FLAME is a Python-based tool with both command-line (CLI) and graphical (GUI) interfaces, designed for identifying and analyzing formulaic language and text reuse, particularly in historical corpora such as medieval charters. It uses a Leave-N-Out (LNO) n-gram approach, which is effective for detecting variant forms of expressions that differ because of scribal variation, regional dialects, or other textual modifications. It automatically learns normalization rules from the corpus itself (handling medieval ligatures and special characters) and uses subword tokenization to handle rare words and morphological variants. It also suggests an optimal vocabulary size for the tokenizer based on the corpus's statistical properties. It performs both intra-corpus and inter-corpus comparisons and automatically determines an optimal similarity cutoff score using Otsu's method.
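
To make the two core ideas above concrete, here is a minimal sketch of Leave-N-Out n-gram matching and an Otsu-style cutoff. It is not FLAME's actual code: the function names (`lno_keys`, `otsu_cutoff`, and so on), the `_` wildcard, the Jaccard score, and the use of plain whitespace tokens in place of subword tokens are all assumptions made for illustration.

```python
from itertools import combinations

WILDCARD = "_"  # hypothetical placeholder for a left-out token


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def leave_n_out(gram, n_out=1):
    """Return every variant of `gram` with `n_out` positions masked."""
    return {
        tuple(WILDCARD if i in masked else tok for i, tok in enumerate(gram))
        for masked in combinations(range(len(gram)), n_out)
    }


def lno_keys(text, n=4, n_out=1):
    """Collect the masked n-gram keys of a text."""
    # Assumption: whitespace tokens stand in for subword tokens.
    tokens = text.lower().split()
    keys = set()
    for gram in ngrams(tokens, n):
        keys |= leave_n_out(gram, n_out)
    return keys


def jaccard(a, b):
    """Jaccard similarity of two key sets (an assumed scoring choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def otsu_cutoff(scores):
    """Pick the split of the sorted scores that maximizes the
    between-class variance w0 * w1 * (mu0 - mu1)**2 (Otsu's criterion),
    and return the midpoint between the two classes."""
    values = sorted(scores)
    total = len(values)
    if total < 2:
        raise ValueError("need at least two scores")
    best_var, best_cut = -1.0, values[0]
    for k in range(1, total):
        low, high = values[:k], values[k:]
        w0, w1 = k / total, (total - k) / total
        mu0, mu1 = sum(low) / k, sum(high) / (total - k)
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var = between
            best_cut = (values[k - 1] + values[k]) / 2
    return best_cut


if __name__ == "__main__":
    a = lno_keys("in nomine sancte et individue trinitatis")
    b = lno_keys("in nomine sanctae et individue trinitatis")
    print(len(a & b), jaccard(a, b))
    print(otsu_cutoff([0.10, 0.12, 0.15, 0.80, 0.85, 0.90]))
```

In this example the two formulas differ only in one spelling (`sancte` vs. `sanctae`), so they share no exact 4-gram, yet they still share the masked keys in which the variant token is left out. The cutoff function operates on raw scores rather than a histogram, which keeps the sketch short but is a simplification of how Otsu's method is usually applied.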