A chatbot project that utilizes GPT-4o mini, Langchain, and SQLite to allow users to perform Q&A with SQL databases using natural language.
┌─────────────────────────────────────────────────────────────────┐
│ USER INPUT │
│ (Natural Language Question) │
└──────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ LLM │
│ (OpenAI GPT-4o-mini via LangChain) │
│ │
│ • Load environment variables (API keys, config) │
│ • Initialize ChatOpenAI with temperature=0.0 │
│ • Parse user question │
└──────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ SQL GENERATION │
│ (create_sql_query_chain or SQL Agent) │
│ │
│ Option 1: Direct Chain │
│ • Generate optimized SELECT statement │
│ • Apply constraints (LIMIT, columns) │
│ │
│ Option 2: Intelligent Agent │
│ • List available tables │
│ • Fetch table schemas │
│ • Generate multi-step SQL if needed │
│ • Validate syntax before execution │
└──────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ DATABASE QUERY │
│ (SQLite via SQLDatabase utility) │
│ │
│ • SQLite Database (sqldb.db)
└──────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ RESULTS │
│ (Raw Query Output from DB) │
│ │
│ • Retrieved data rows │
│ • Result metadata (row count, column info) │
└──────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ LLM FORMATTING │
│ (Answer formatting with PromptTemplate) │
│ │
│ Input: Question + SQL Query + Raw Results │
│ Process: LLM transforms raw data into natural language │
│ Output: Clear, formatted answer for user │
└──────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ USER OUTPUT │
│ (Formatted Response via Gradio UI) │
│ │
│ Display result to user in chat interface │
└─────────────────────────────────────────────────────────────────┘
| Layer | Technology | Purpose |
|---|---|---|
| LLM | OpenAI GPT-4o-mini | Language understanding & generation |
| Orchestration | LangChain | Chain management & tool integration |
| SQL Execution | SQLAlchemy | Database connection & query execution |
| Database | SQLite | Data storage (read-only) |
| UI | Gradio | Chat interface |
- Pre-loaded SQL Database (
data/sqldb.db)- Chinook sample database
- Contains Artist, Album, Customer, Invoice, etc. tables
Mode 1: SQL Query Chain (Simple queries)
- Fast direct path: Question → SQL → Execute → Format
- Best for straightforward Q&A
- Lower token usage
Mode 2: SQL Agent (Complex queries)
- Intelligent exploration: List tables → Get schema → Generate SQL
- Error recovery and retry logic
- Multi-step query support
- Best for complex analytical questions
- Restricts to SELECT statements only
- Validates SQL syntax before execution
- Limits table scope access
- Prevents DELETE, UPDATE, INSERT, or CREATE operations
brew install sqlite- Python 3.11
- Clone the repository
git clone git@github.com:variang/chat-with-sql.git
cd chat-with-sql- Create and activate virtual environment
python3.11 -m venv venv
source venv/bin/activate- Install dependencies
pip install -r requirements.txt- Set up environment variables
Create a .env file in the project root:
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL_NAME=gpt-4o-miniThe easiest way to interact with the chatbot is through the Gradio web interface.
- Ensure you have the Chinook database set up:
sqlite3 data/sqldb.db
.read data/sql/Chinook_Sqlite.sql
.quit- Run the Gradio application:
python app.py- Access the web interface:
- Open your browser and go to:
http://localhost:7860 - You should see the Chat with SQL interface
- Open your browser and go to:
- Auto-Database Discovery: All
.dbfiles in thedata/folder are automatically discovered and listed - Database Selection: Choose from available databases using the dropdown selector
- Execution Modes: Switch between "SQL Agent" (default, more intelligent) and "SQL Chain" (faster, simpler)
- Chat Interface: Ask questions in natural language and get immediate responses
- SQL Query Visibility: Toggle to show/hide the generated SQL queries
- Chat History: View full conversation history with all questions and answers
For development and testing, you can use the interactive Jupyter notebook.
- Start Jupyter:
jupyter notebook-
Open the notebook:
- Navigate to
notebooks/chat_with_sql.ipynb - Run cells sequentially to test different components
- Navigate to
-
Features:
- Test SQL query chains
- Test SQL agents
- Explore database schemas
- Experiment with custom prompts
- YouTube Tutorial
- Langchain: introduction
- Gradio: Documentation
- OpenAI: Developer quickstart
- SQLAlchemy Documentation
