A mobile-first coffee tracking application that integrates multiple AI providers (Groq/Llama, Google Gemini, Tesseract OCR) to automatically extract, structure, and enrich coffee data from bag label photos.
Live: the-bean-keeper.onrender.com
This started as a personal tool and became a proving ground for multi-provider AI integration. Each AI provider was selected for what it does best:
- Groq (Llama 3.1 8B) for structured data extraction: fast inference, JSON mode, low cost per call. Chosen over GPT/Claude for this task because extraction needs speed, not reasoning depth.
- Google Gemini Pro for video generation pipeline: multimodal capabilities for converting screen recordings into polished product demos.
- Tesseract.js for client-side OCR: runs in the browser, zero API cost, handles multilingual text (English + Chinese).
- ElevenLabs for voice synthesis in the video pipeline.
The architecture uses fallback logic between providers. If AI extraction fails, regex-based extraction catches common patterns. If cloud storage fails, local filesystem takes over. Every external dependency has a graceful degradation path.
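The fallback pattern described above can be sketched in a few lines. This is an illustrative sketch, not the project's actual code; the function and field names are hypothetical:

```typescript
// Illustrative provider-fallback chain: try each extractor in order,
// treating a thrown error or a null result as "move on to the next one".
type Extractor = (text: string) => { roaster?: string; origin?: string } | null;

function extractWithFallback(text: string, extractors: Extractor[]) {
  for (const extract of extractors) {
    try {
      const result = extract(text);
      if (result) return result; // first provider to succeed wins
    } catch {
      // swallow the provider error and fall through to the next one
    }
  }
  return null; // every provider failed
}

// Example chain: AI extraction first, regex patterns as the safety net
const aiExtract: Extractor = () => {
  throw new Error("AI provider unavailable");
};
const regexExtract: Extractor = (text) => {
  const m = text.match(/Roaster:\s*(.+)/i);
  return m ? { roaster: m[1].trim() } : null;
};

const result = extractWithFallback("Roaster: Onyx", [aiExtract, regexExtract]);
// result.roaster is "Onyx" even though the AI step threw
```

The same shape applies to the storage fallback (Cloudinary first, local filesystem second).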
- AI-Powered Extraction: Upload coffee bag photos. AI extracts roaster, origin, variety, process method, roast level, and more.
- Bilingual Interface: Full English and Traditional Chinese support with auto-detection
- Notion OAuth: Multi-user auth where each user gets their own isolated Notion database
- Guest Mode: Browse the owner's collection without logging in (read-only)
- Advanced Filtering: Filter by roast level, rating, origin with dynamic sort
- AI Video Pipeline: Converts screen recordings into product demos (Remotion + Gemini + ElevenLabs) at $0 cost
- Mobile-First: Dual photo upload (camera + file picker), responsive 2-5 column grid
| Layer | Provider | Why This Provider |
|---|---|---|
| OCR | Tesseract.js (client-side) | Zero API cost, browser-native, multilingual |
| Data extraction | Groq AI (Llama 3.1 8B) | Fast inference, JSON mode, structured output |
| Video generation | Gemini Pro (Google) | Multimodal, long context for video scripts |
| Voice synthesis | ElevenLabs | Natural speech for product demos |
| Fallback | Regex patterns | Graceful degradation when AI fails |
- Frontend: React 18, TypeScript, Vite, TanStack Query, shadcn/ui, Tailwind CSS
- Backend: Express.js, TypeScript, Notion SDK, Cloudinary
- Auth: Notion OAuth 2.0 with session management
- Deployment: Render.com with Cloudinary for persistent photo storage
- Node.js 18+
- Groq API key (groq.com)
- Notion Internal Integration (notion.so/my-integrations)
- Google Maps API key (optional)
# Clone the repository
git clone https://github.com/YFC-ophey/The-Bean-Keeper.git
cd the-bean-keeper
# Install dependencies
npm install
# Copy environment variables
cp .env.example .env
# Edit .env with your API keys
# Run development server
npm run dev

Visit http://localhost:5000
Create a .env file with:
# Required
GROQ_API_KEY=your_groq_api_key
NOTION_API_KEY=your_notion_internal_integration_token
NOTION_DATABASE_ID=your_notion_database_id
# Optional
VITE_GOOGLE_MAPS_API_KEY=your_google_maps_key
PORT=5000

See .env.example for details.
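A startup check for the required variables can fail fast with a clear message instead of a mid-request crash. This is a hypothetical sketch (the helper name and error wording are illustrative, not from the project):

```typescript
// Verify the required environment variables before the server starts.
const REQUIRED = ["GROQ_API_KEY", "NOTION_API_KEY", "NOTION_DATABASE_ID"];

function missingEnv(env: Record<string, string | undefined>): string[] {
  // returns the names of any required variables that are missing or empty
  return REQUIRED.filter((name) => !env[name]);
}

// Example: NOTION_DATABASE_ID is absent, so it is reported
const missing = missingEnv({
  GROQ_API_KEY: "gsk_example",
  NOTION_API_KEY: "secret_example",
});
// missing is ["NOTION_DATABASE_ID"]
```

In the real server this would run against `process.env` at boot and throw if the list is non-empty.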
# Create a page in Notion and get its ID
# Then run:
npx tsx create-database.ts <notion-page-id>

See NOTION_DATABASE_STRUCTURE.md for the complete schema.
Full deployment guide: DEPLOYMENT.md
Quick start: DEPLOY_QUICK_START.md
# 1. Push to GitHub
git add .
git commit -m "Ready for deployment"
git push
# 2. Create web service on Render
# 3. Connect GitHub repository
# 4. Add environment variables
# 5. Deploy!

Auto-deploys on every git push to the main branch.
# Development server
npm run dev
# Type checking
npm run check
# Production build
npm run build
# Production server
npm start
# Test Groq AI extraction
npx tsx test-groq.ts
# Test Notion connection
npx tsx test-notion-setup.ts

Mobile-first grid layout with Instagram-style coffee cards
Upload photo → AI extracts roaster, origin, variety, process, roast level
Toggle between English and Traditional Chinese
Track your coffee journey with collection insights
The-Bean-Keeper/
├── client/                   # React frontend
│   ├── src/
│   │   ├── components/       # UI components
│   │   ├── pages/            # Page components
│   │   ├── i18n/             # Translations (EN/ZH)
│   │   └── lib/              # API client
│   └── public/               # Static assets
├── server/                   # Express backend
│   ├── index.ts              # Server entry
│   ├── routes.ts             # API endpoints
│   ├── groq.ts               # Groq AI client
│   ├── notion.ts             # Notion operations
│   └── notion-storage.ts     # Storage layer
├── shared/                   # Shared types
│   └── schema.ts             # TypeScript + Zod schemas
├── DEPLOYMENT.md             # Full deployment guide
├── DEPLOY_QUICK_START.md     # Quick deployment steps
└── CLAUDE.md                 # Development guide
- OCR: Tesseract.js extracts raw text from photos
- AI Processing: Groq Llama 3.1 8B structures the data
- Smart Detection: Automatically identifies roast level, origin, variety
- Graceful Fallback: Regex extraction if AI fails
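The "smart detection" step can be pictured as simple keyword matching over the OCR text. A minimal sketch, with illustrative keywords only (the real detection logic and vocabulary live in the project's extraction code):

```typescript
// Keyword-based roast-level detection, usable as a regex fallback
// when structured AI extraction fails.
function detectRoastLevel(text: string): "light" | "medium" | "dark" | null {
  const t = text.toLowerCase();
  if (/\blight\b/.test(t)) return "light";
  if (/\bmedium\b/.test(t)) return "medium";
  if (/\b(dark|french|espresso)\b/.test(t)) return "dark";
  return null; // nothing recognizable on the label
}

detectRoastLevel("Ethiopia Yirgacheffe, Light Roast, Washed"); // "light"
```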
- Full bilingual support (EN + ZH 繁體中文)
- Automatic language detection
- LocalStorage persistence
- 6 translation namespaces
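Auto-detection with localStorage persistence boils down to "saved choice wins, otherwise follow the browser". A hedged sketch, with illustrative names (the project may structure this differently):

```typescript
// Resolve the UI language: a stored user preference takes priority;
// otherwise fall back to the browser's language tag.
function resolveLanguage(
  stored: string | null, // e.g. localStorage.getItem("lang")
  browserLang: string    // e.g. navigator.language
): "en" | "zh" {
  if (stored === "en" || stored === "zh") return stored; // saved choice wins
  return browserLang.toLowerCase().startsWith("zh") ? "zh" : "en";
}

resolveLanguage(null, "zh-TW"); // "zh" — detected from the browser
resolveLanguage("en", "zh-TW"); // "en" — saved preference wins
```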
- Responsive 2-5 column grid
- Dual photo upload methods
- Touch-optimized interactions
- Vintage coffee journal aesthetic
- Advanced filtering (roast, rating, origin)
- Multiple sort options
- Duplicate detection
- Statistics dashboard
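Duplicate detection can be as simple as comparing a normalized key built from roaster and bean name. An illustrative sketch (the project's actual matching may be more sophisticated):

```typescript
// Build a case- and whitespace-insensitive key for duplicate checks.
function dedupeKey(roaster: string, name: string): string {
  const norm = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return `${norm(roaster)}|${norm(name)}`;
}

// Same coffee entered twice with different casing/spacing → same key
dedupeKey("Onyx Coffee", "Geometry") === dedupeKey(" onyx  coffee ", "GEOMETRY"); // true
```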
This is a portfolio project, but suggestions are welcome!
MIT License - See LICENSE file for details
Ophelia Chen
- Portfolio: Coming Soon
- LinkedIn: https://www.linkedin.com/in/opheliandata/
- GitHub: @YFC-ophey
- Claude Code - My favorite vibe coding tool
- Groq - Lightning-fast AI inference
- Notion - Database and API
- Tesseract.js - OCR engine
- shadcn/ui - UI components
- Clash Display - Typography
- Render - Cloud Application Platform
- Cloudinary - Image and Media API Platform
Built with ☕ and AI | Powered by Groq + Notion
