✨ New Features
- RAG support for local files (`.txt`, `.md`, `.markdown`, `.pdf`, `.docx`, `.html`, `.htm`)
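The supported-extension list above could be gated with a simple check before ingestion. This is an illustrative sketch, not the project's actual code; the names `SUPPORTED_EXTENSIONS` and `is_supported_file` are assumptions.

```python
from pathlib import Path

# Extensions accepted for RAG indexing, per the release notes.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".markdown", ".pdf", ".docx", ".html", ".htm"}

def is_supported_file(path: str) -> bool:
    """Return True if the file's extension is accepted for RAG indexing."""
    # suffix comparison is case-insensitive so "Report.PDF" is accepted
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```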
🛠️ Improvements
- Refactored LLM inference: split into provider-specific modules with shared logic centralized
- Improved dual-embedding (local/cloud) support
- Thought retrieval now uses the latest user prompt
- Adjusted embedding defaults so local embeddings are used for non-OpenAI providers when no OpenAI key is available
- Added assistant-message indexing with a simple filter so only durable responses are stored in the thoughts vector DB
- Added a guard against empty context injection when no thoughts or summary are available, reducing prompt noise
- Capped the thoughts block at 1,200 characters to avoid prompt bloat from long messages
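The durable-response filter, the empty-context guard, and the 1,200-character cap could be sketched as below. All names (`is_durable`, `build_thoughts_block`, the length threshold) are illustrative assumptions, not the project's actual implementation.

```python
MAX_THOUGHTS_CHARS = 1200  # cap stated in the release notes
MIN_DURABLE_LENGTH = 40    # hypothetical threshold for a "durable" response

def is_durable(message: str) -> bool:
    """Heuristic filter: skip short acknowledgements and empty replies
    so only substantive assistant messages reach the thoughts vector DB."""
    return len(message.strip()) >= MIN_DURABLE_LENGTH

def build_thoughts_block(thoughts: list[str]) -> str:
    """Join retrieved thoughts; return '' when nothing is available so the
    prompt builder can skip injection entirely, and truncate to the cap."""
    block = "\n".join(t for t in thoughts if t.strip())
    if not block:
        return ""  # guard: inject nothing rather than an empty context section
    return block[:MAX_THOUGHTS_CHARS]
```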
🐛 Bug Fixes
- Fixed OpenAI request building to always use the most recently updated conversation data
- Fixed a CORS issue in the dev server backend
- Fixed chat history message management
- Fixed dev server settings writes
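For the CORS fix, the gist is that the dev backend must return the headers browsers require for cross-origin requests from the dev frontend. The sketch below is purely illustrative: the release does not name the framework, and the origin and handler shape are assumptions; only the header names are standard.

```python
DEV_ORIGIN = "http://localhost:5173"  # assumed dev frontend origin

def cors_headers(origin: str) -> dict[str, str]:
    """Headers the dev backend returns so the browser allows
    cross-origin requests from the dev frontend."""
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
    }
```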
Full Changelog: v1.2.2...v1.3.0