A professional, completely local and private audio transcription system that works with any markdown-based note-taking application. Transform your audio recordings into searchable markdown transcripts without sending your data to the cloud.
- 100% Local & Private - Audio never leaves your machine
- Zero Ongoing Costs - No API keys or subscription fees
- Universal Compatibility - Works with Obsidian, Logseq, Foam, Zettlr, and any markdown system
- Security-First - External script approach, no plugins required
- Template-Driven - Customizable output formats
- Smart Linking - Automatically adds transcript links to existing notes
- Batch Processing - Handle multiple files efficiently
- Configurable - JSON/YAML configuration with multiple profiles
# Install system dependencies (required: ffmpeg for audio processing)
# Ubuntu/Debian:
sudo apt update && sudo apt install ffmpeg
# macOS with Homebrew:
brew install ffmpeg
# Arch Linux:
sudo pacman -S ffmpeg
# Install Python dependencies using UV (recommended)
uv sync
# Alternative: Use the installation script
chmod +x scripts/install.sh && ./scripts/install.sh
# Create example configuration for your markdown system
uv run python -m src.transcription_system --create-config config.yaml --config-type obsidian
# Or create for other systems
uv run python -m src.transcription_system --create-config config.yaml --config-type logseq
uv run python -m src.transcription_system --create-config config.yaml --config-type foam
Edit config.yaml to match your setup:
# Basic configuration
vault_path: "/path/to/your/notes"
audio_folder_name: "Audio"
transcripts_folder_name: "Audio-Transcripts"
# Whisper settings
whisper_model: "medium" # tiny, base, small, medium, large
language: "auto" # or specify: en, de, fr, es, etc.
# Link format (adjust for your markdown system)
link_format_style: "wikilink" # wikilink, standard, or custom
link_format_prefix: "**Transcript:**"
# Run comprehensive test suite (safe - creates isolated test environment)
./test_system.sh
# The test will:
# - Create a temporary test environment using uv
# - Test all system components safely
# - Generate a detailed test report
# - Optionally test actual transcription with sample audio
# Run with your configuration
uv run python -m src.transcription_system --config config.yaml
Before using the system on your actual files, it's highly recommended to run the test suite:
The included test script provides safe, isolated testing:
# Make the test script executable (if not already)
chmod +x test_system.sh
# Run the test suite
./test_system.sh
What the test does:
- Creates isolated test directory with timestamp
- Uses `uv` for clean virtual environment
- Tests all system components without affecting your files
- Creates sample audio and markdown files for testing
- Generates detailed test report
- Optional real transcription test with tiny model
Test Components:
- Import Tests - Verifies code loads correctly
- Configuration Tests - Tests config creation and validation
- System Tests - Tests main system initialization
- Template Tests - Tests template loading
- File Discovery - Tests finding audio files and notes
- Integration Tests - Tests complete workflow
- Optional Transcription - Real transcription with sample audio
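To make the File Discovery component more concrete, here is a minimal sketch of what such a search could look like. It is not the project's actual code: `find_audio_files` and `AUDIO_EXTENSIONS` are hypothetical names, and the extension set simply mirrors the `audio_extensions` setting from the configuration reference.

```python
import tempfile
from pathlib import Path
from typing import List

# Hypothetical constant; mirrors the audio_extensions config setting.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}


def find_audio_files(root: Path, recursive: bool = True) -> List[Path]:
    """Return audio files under root, matched by case-insensitive suffix."""
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in root.glob(pattern)
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


# Demo in a throwaway directory so no real notes are touched.
with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    (vault / "sub").mkdir()
    (vault / "memo.MP3").touch()
    (vault / "sub" / "talk.wav").touch()
    (vault / "note.md").touch()
    found = [p.relative_to(vault).as_posix() for p in find_audio_files(vault)]
    top_level = [
        p.relative_to(vault).as_posix()
        for p in find_audio_files(vault, recursive=False)
    ]

print(found)
print(top_level)
```

Passing `recursive=False` corresponds to turning off a `recursive_search`-style option, so only files directly inside the folder are returned.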
If you prefer manual testing:
# Test configuration creation
python -m src.transcription_system --create-config test-config.yaml --config-type obsidian
# Test with dry-run on a copy of your vault
cp -r /path/to/your/vault /tmp/test-vault
# Edit test-config.yaml to point to /tmp/test-vault
python -m src.transcription_system --config test-config.yaml
# Clone the repository
git clone https://github.com/yourusername/markdown-audio-transcription.git
cd markdown-audio-transcription
# Install dependencies
pip install -r requirements.txt
# Install Whisper and system dependencies
# Ubuntu/Debian (recommended):
sudo apt install python-openai-whisper ffmpeg
# macOS:
brew install ffmpeg
pip install openai-whisper
# Windows:
# Download ffmpeg from https://ffmpeg.org/download.html
# pip install openai-whisper
# Run the installation script
chmod +x scripts/install.sh
./scripts/install.sh

- Link Format: `[[transcript_name]]`
- Audio Folder: `Audio/`
- Transcript Folder: `Audio-Transcripts/`

- Link Format: `[[transcript_name]]`
- Audio Folder: `assets/`
- Transcript Folder: `transcripts/`

- Link Format: `[transcript_name](transcript_name.md)`
- Audio Folder: `attachments/`
- Transcript Folder: `transcripts/`

- Link Format: `[[transcript_name]]`
- Audio Folder: `media/`
- Transcript Folder: `transcripts/`

- Link Format: `[transcript_name](transcript_name.md)`
- Audio Folder: `media/`
- Transcript Folder: `transcripts/`
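
The two link formats listed above differ only in how the transcript name is wrapped. The following is a minimal sketch of that mapping, not the project's actual implementation; `build_transcript_link` is a hypothetical helper, and the style names `wikilink` and `standard` are taken from the `link_format_style` setting.

```python
def build_transcript_link(transcript_name: str, style: str = "wikilink") -> str:
    """Return a markdown link to a transcript note in the given style."""
    if style == "wikilink":
        # Wiki-style link: [[name]]
        return f"[[{transcript_name}]]"
    if style == "standard":
        # Standard markdown link: [name](name.md)
        return f"[{transcript_name}]({transcript_name}.md)"
    raise ValueError(f"unsupported link style: {style!r}")


print(build_transcript_link("meeting_transcript"))
# → [[meeting_transcript]]
print(build_transcript_link("meeting_transcript", "standard"))
# → [meeting_transcript](meeting_transcript.md)
```

The `custom` style is left out because its format is defined by the user's own configuration.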
# Path to your notes/vault
vault_path: "/home/user/Notes"
# Folder names (relative to vault_path)
audio_folder_name: "Audio"
transcripts_folder_name: "Audio-Transcripts"
# Whisper AI settings
whisper_model: "medium" # Model size affects accuracy vs speed
language: "auto" # Auto-detect or specify (en, de, fr, etc.)
# Processing options
auto_move_files: true # Move processed files to audio folder
create_timestamps: true # Include detailed timestamps
skip_existing_transcripts: true # Skip files that already have transcripts
recursive_search: true # Search subdirectories for audio files
# Link format customization
link_format_style: "wikilink" # wikilink, standard, or custom
link_format_prefix: "**Transcript:**"
# File extensions to process
audio_extensions: [".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"]
video_extensions: [".mp4", ".mkv", ".avi", ".mov", ".wmv", ".webm"]
# Logging configuration
log_level: "INFO" # DEBUG, INFO, WARNING, ERROR
console_logging: true # Log to console
file_logging: true # Log to file
log_file: "/var/log/markdown-transcription.log"
# System settings
temp_dir: "/tmp"
lock_file: "/var/lock/markdown-transcription.lock"
encoding: "utf-8"
Edit templates/transcript-template.md:
# Transcription: {filename}
**File:** `{filename}`
**Date:** {date}
**Original Location:** `{audio_folder}/{filename}`
## Transcript
{transcript_content}
## Detailed Timestamps
{timestamp_content}
Edit templates/link-template.md:
**Transcript:** [[{audio_name}_transcript]]
# Process all audio files in your vault
python -m src.transcription_system --config config.yaml
# Create Obsidian configuration
python -m src.transcription_system --create-config obsidian-config.yaml --config-type obsidian
# Create Logseq configuration
python -m src.transcription_system --create-config logseq-config.yaml --config-type logseq
# Create generic markdown configuration
python -m src.transcription_system --create-config generic-config.yaml --config-type generic
# Process different vaults with different configurations
python -m src.transcription_system --config work-vault.yaml
python -m src.transcription_system --config personal-vault.yaml
Create automatic processing on file changes:
# Set up systemd service
sudo chmod +x scripts/setup-systemd.sh
sudo ./scripts/setup-systemd.sh
# Enable and start service
sudo systemctl enable markdown-transcription
sudo systemctl start markdown-transcription
Process files periodically:
# Add to crontab (process every 30 minutes)
*/30 * * * * /usr/bin/python3 /path/to/transcription_system.py --config /path/to/config.yaml
Use with file system watchers like inotify:
# Watch for new audio files and process automatically
inotifywait -m -e create --format '%w%f' /path/to/vault/ | while IFS= read -r file; do
if [[ $file =~ \.(mp3|wav|m4a)$ ]]; then
python -m src.transcription_system --config config.yaml
fi
done

| Feature | This System | Whisper.cpp | Commercial APIs |
|---|---|---|---|
| Privacy | 100% Local | 100% Local | Cloud-based |
| Cost | Free | Free | Pay-per-use |
| Markdown Integration | Native | Manual | Manual |
| Template System | Built-in | None | None |
| Auto-linking | Automatic | Manual | Manual |
| Multi-app Support | Universal | Generic | Generic |
| Batch Processing | Yes | Yes | Yes |
| Accuracy | High (Whisper) | High (Whisper) | High |

| Model | Speed | Accuracy | VRAM Usage | Best For |
|---|---|---|---|---|
| `tiny` | Fastest | Good | ~1GB | Quick processing |
| `base` | Fast | Better | ~1GB | Balanced performance |
| `small` | Medium | Good | ~2GB | Most use cases |
| `medium` | Slower | Better | ~5GB | High accuracy needed |
| `large` | Slowest | Best | ~10GB | Maximum accuracy |

- 10-minute audio file:
  - `tiny`: ~30 seconds
  - `base`: ~1 minute
  - `small`: ~2 minutes
  - `medium`: ~4 minutes
  - `large`: ~8 minutes
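
The timings above can be turned into a rough estimate for recordings of other lengths. The sketch below assumes processing time scales linearly with audio duration, which is a simplification; actual speed depends on hardware and audio content. `estimate_seconds` and `SECONDS_PER_10_MIN` are hypothetical names used only for illustration.

```python
# Approximate processing time in seconds for a 10-minute (600 s) file,
# copied from the list above.
SECONDS_PER_10_MIN = {
    "tiny": 30,
    "base": 60,
    "small": 120,
    "medium": 240,
    "large": 480,
}


def estimate_seconds(audio_seconds: float, model: str) -> float:
    """Estimate transcription time, assuming linear scaling with duration."""
    return audio_seconds * SECONDS_PER_10_MIN[model] / 600


# A one-hour recording with the medium model, expressed in minutes.
print(estimate_seconds(3600, "medium") / 60)
# → 24.0
```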
1. "Whisper is not installed" error
# Method 1 - Package manager (recommended for Ubuntu/Debian):
sudo apt install python-openai-whisper
# Method 2 - Global pip installation:
pip install openai-whisper
2. "ffmpeg not found" error
# Ubuntu/Debian
sudo apt install ffmpeg
# macOS
brew install ffmpeg
# Windows: Download from https://ffmpeg.org/
3. "Permission denied" errors
# Fix file permissions
chmod +x scripts/*.sh
sudo chown -R $USER:$USER /path/to/vault
4. "Another instance is already running"
# Remove lock file if stale
sudo rm /var/lock/markdown-transcription.lock
5. Template not found errors
# Ensure templates directory exists
mkdir -p templates/
# Copy default templates from repository
Slow transcription:
- Use a smaller model (`tiny`, `base`, `small`)
- Close other applications to free up RAM/VRAM
- Consider using CPU-only mode for older hardware
High memory usage:
- Use smaller model
- Process files one at a time
- Increase system swap space
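
When memory is the constraint, one way to choose a model is to take the largest one whose approximate VRAM requirement fits the memory available. The sketch below uses the figures from the model table; `largest_model_that_fits` is a hypothetical helper, not part of the project, and the numbers are approximations.

```python
from typing import Optional

# (model, approximate VRAM in GB), ordered from smallest to largest,
# as given in the model table.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def largest_model_that_fits(available_gb: float) -> Optional[str]:
    """Return the largest model whose VRAM need fits, or None if none fit."""
    best = None
    for name, needed_gb in MODEL_VRAM_GB:
        if needed_gb <= available_gb:
            best = name  # later entries are larger models, so keep the last fit
    return best


print(largest_model_that_fits(4))
# → small
print(largest_model_that_fits(6))
# → medium
```

The returned name could then be set as `whisper_model` in the configuration file.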
We welcome contributions! Please see our Contributing Guide for details.
# Clone the repository
git clone https://github.com/yourusername/markdown-audio-transcription.git
cd markdown-audio-transcription
# Install development dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt
# Run comprehensive test suite
./test_system.sh
# Run linting
flake8 src/
black src/
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for the excellent speech recognition model
- The markdown note-taking community for inspiration and feedback
- All contributors who helped improve this system
- Issues: Report bugs and request features on GitHub Issues
- Discussions: Ask questions and share ideas in GitHub Discussions
- Documentation: Find detailed guides in the docs/ directory
Made with ❤️ for the markdown note-taking community