This project implements a modular Retrieval-Augmented Generation (RAG) assistant using LangChain, designed to answer legal and technical queries based on custom documents. It features document ingestion, semantic retrieval, prompt templating, and LLM-backed response generation—all wrapped in a clean CLI and Streamlit interface.
```
legal_brief_companion/
├── src/
│   └── legal_brief_companion/
│       ├── __init__.py
│       ├── config/
│       │   └── settings.py          # Application configuration and env loading
│       ├── ingestion/
│       │   ├── document_loader.py   # Load PDFs/texts into the pipeline
│       │   └── text_splitter.py     # Chunk large documents
│       ├── retrieval/
│       │   ├── vector_store.py      # Manage vector store persistence (Chroma)
│       │   └── retriever.py         # Query and rank relevant chunks
│       ├── llm/
│       │   ├── chain.py             # LLM orchestration and chains
│       │   └── prompt_templates.py  # Reusable prompt templates
│       ├── interface/
│       │   └── cli.py               # CLI for local usage
│       ├── utils/
│       │   └── helpers.py           # Shared utilities and helpers
│       ├── tests/                   # Unit tests
│       │   ├── test_chain.py
│       │   ├── test_ingestion.py
│       │   └── test_llm.py
│       └── ingest.py                # CLI/script entry for ingestion
├── data/
│   ├── documents/                   # User documents (PDFs, txt)
│   └── vector_store/                # Persisted vector DB files
│       └── <instance-id>/
├── app.py                           # Streamlit / main app entry
├── pyproject.toml
├── requirements.txt
├── .env
├── .gitignore
└── README.md
```
- Python 3.9+
- Groq API key
- pip or poetry
- Clone the repository:

  ```bash
  git clone <your_repository_url>
  cd legal_brief_companion
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Set up your environment variables in the `.env` file. Important: do not hardcode your API key directly in the `.env` file; set it as an environment variable in your system instead (`GROQ_API_KEY` is therefore omitted below and configured in the next step).

  ```env
  EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
  LLM_PROVIDER=groq
  LLM_MODEL=llama3-8b-8192
  DATABASE_URL=sqlite:///data/knowledge_base.db
  DOCUMENTS_PATH=data/documents
  PERSIST_DIRECTORY=data/vector_store
  MODEL_NAME=llama3-8b-8192
  VECTOR_STORE_TYPE=chroma
  DEBUG=true
  ```
- Set your Groq API key as an environment variable:

  ```bash
  # Linux/macOS
  export GROQ_API_KEY="your_api_key_here"
  ```

  ```powershell
  # PowerShell
  $env:GROQ_API_KEY="your_api_key_here"
  ```
- Ingest your documents (this process creates the vector store):

  ```bash
  python src/legal_brief_companion/ingest.py
  ```
- Run the application:

  ```bash
  streamlit run app.py
  ```
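The ingestion step above loads each document and splits it into overlapping chunks before embedding. The project's actual logic lives in `text_splitter.py`; as a rough, dependency-free illustration of the idea (the function name and parameters here are hypothetical, not the project's API), a character-based splitter with overlap might look like:

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split text into ~chunk_size-character chunks that overlap slightly,
    so context straddling a boundary survives in both neighbouring chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap  # advance less than a full chunk each time
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 1200-character document with the defaults (step = 450) yields
# chunks starting at offsets 0, 450, and 900.
doc = "x" * 1200
chunks = split_text(doc)
print(len(chunks), [len(c) for c in chunks])  # → 3 [500, 500, 300]
```

Real splitters (e.g. LangChain's recursive splitters) prefer to break on paragraph and sentence boundaries rather than raw character offsets, but the sliding-window-with-overlap shape is the same.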
After running the application, you can interact with the assistant through the command-line interface or the Streamlit interface. Provide your queries, and the assistant will respond based on the ingested documents.
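Under the hood, each query is answered by stuffing the retrieved chunks and the user's question into a prompt template. The project's real templates live in `prompt_templates.py`; a minimal sketch of the pattern (the wording and function name here are illustrative assumptions) is:

```python
RAG_TEMPLATE = """You are a legal research assistant. Answer the question using
only the context below. If the context is insufficient, say so.

Context:
{context}

Question: {question}

Answer:"""

def build_prompt(question: str, chunks: list) -> str:
    """Join the retrieved chunks into a context block and fill the template."""
    context = "\n\n".join("- " + c for c in chunks)
    return RAG_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    "What is the notice period?",
    ["Clause 4: Either party may terminate with 30 days written notice."],
)
print(prompt)
```

The filled prompt is what `chain.py` would hand to the Groq-hosted model; constraining the model to the supplied context is what keeps answers grounded in the ingested documents.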
- Document Ingestion: Load and chunk legal PDFs or text files using `document_loader.py` and `text_splitter.py`.
- Vector Store Retrieval: Embed and retrieve relevant chunks using ChromaDB, managed by `vector_store.py` and `retriever.py`.
- LLM Interaction: Generate answers using Groq-hosted LLaMA models, orchestrated by `chain.py`.
- Prompt Templates: Modular prompts for transparent reasoning, defined in `prompt_templates.py`.
- User Interface: Streamlit frontend (`app.py`) and CLI (`cli.py`) for flexible interaction.
- Config Management: Pydantic-based `.env` loader for reproducibility via `settings.py`.
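The real `settings.py` uses Pydantic; as a dependency-free sketch of the same idea (field names mirror the `.env` keys above, the defaults are assumptions), reading configuration from the environment with sensible fallbacks looks like:

```python
import os
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class Settings:
    """Simplified stand-in for the project's Pydantic-based settings loader."""
    embedding_model: str = field(default_factory=lambda: os.environ.get(
        "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"))
    llm_provider: str = field(default_factory=lambda: os.environ.get(
        "LLM_PROVIDER", "groq"))
    llm_model: str = field(default_factory=lambda: os.environ.get(
        "LLM_MODEL", "llama3-8b-8192"))
    persist_directory: str = field(default_factory=lambda: os.environ.get(
        "PERSIST_DIRECTORY", "data/vector_store"))
    # Read from the environment only; never hardcoded in a file.
    groq_api_key: Optional[str] = field(
        default_factory=lambda: os.environ.get("GROQ_API_KEY"))

settings = Settings()
```

Pydantic's `BaseSettings` adds type coercion and validation on top of this; the point is that every component reads one immutable settings object instead of calling `os.environ` ad hoc.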
- PDF and text file ingestion
- Intelligent document chunking
- Semantic embedding generation
- ChromaDB vector store integration
- Similarity-based chunk retrieval
- Context ranking and selection
- Groq-hosted LLaMA models
- Structured prompt engineering
- Response chain orchestration
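In the real pipeline, ChromaDB stores the embeddings and performs nearest-neighbour search. The ranking idea at its core, cosine similarity between the query embedding and each chunk embedding, can be sketched without any dependencies (the 3-dimensional "embeddings" below are made up for illustration; real ones have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, chunks, k=2):
    """Rank chunks by similarity to the query vector and keep the best k."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

chunks = ["clause on liability", "definition of force majeure", "payment terms"]
vecs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
query = [0.9, 0.2, 0.1]  # points mostly in the direction of the first chunk

print(top_k(query, vecs, chunks, k=2))
# → ['clause on liability', 'definition of force majeure']
```

The selected chunks become the context passed to the prompt template, which is what makes the answer traceable back to specific passages in the ingested documents.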
Basic unit tests are included under `src/legal_brief_companion/tests/`. Run them with:

```bash
pytest src/legal_brief_companion/tests/
```

- Never commit your `.env` file or API keys to GitHub. The `.gitignore` file is set up to exclude secrets and unnecessary files.
- Use environment variables for secrets
- Follow security best practices
- Regular dependency updates
This project is licensed under the MIT License.
Built by Getahune Wondemenhu Alemayhu.
- Email: [getahune.alemayhu@gmail.com](mailto:getahune.alemayhu@gmail.com)

For questions, contributions, or collaboration, feel free to reach out or open an issue.
Please open issues or submit pull requests for improvements or bug fixes. Follow the existing code style and include tests for new features.