Personal Project
Document Processing AI
Production-grade RAG API — upload documents, query them in natural language
Python · FastAPI · FAISS · SQLite · RAG · LLM Abstraction · REST API
Problem
Enterprise document workflows are slow and require manual reading. This API lets users ingest PDFs and plain-text files, then ask natural-language questions against them — no manual indexing, no prompt injection risk from raw document dumping. Built with a production-first mindset: clean REST contract, persistent metadata, and a pluggable LLM backend.
Architecture
Client (HTTP)
↓
FastAPI REST Layer
↓
Chunking Engine (character-based, word-boundary snap)
↓
FAISS Vector Index (embedding + similarity search)
        ↓                            ↓
SQLite (document metadata)    Abstracted LLM Provider
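The "Abstracted LLM Provider" box above can be sketched as a small provider interface. This is a minimal illustration, not the project's actual code; the names `LLMProvider`, `EchoProvider`, and `answer` are assumptions.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Anything that can turn a question plus retrieved chunks into an answer."""

    def complete(self, prompt: str, context: list[str]) -> str:
        ...


class EchoProvider:
    """Stand-in provider for tests: echoes the retrieved context back."""

    def complete(self, prompt: str, context: list[str]) -> str:
        joined = " ".join(context)
        return f"Q: {prompt} | grounded on: {joined}"


def answer(provider: LLMProvider, question: str, chunks: list[str]) -> str:
    # The REST layer depends only on the Protocol, so OpenAI, a local
    # model, or a test double can be swapped in without touching routes.
    return provider.complete(question, chunks)
```

Because `Protocol` uses structural typing, any object with a matching `complete` method satisfies the interface without inheriting from it.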
API Endpoints
- POST /docs — ingest a PDF or plain-text file into the vector index
- POST /query — ask a natural-language question; returns a grounded answer
- GET /docs/{id} — retrieve document metadata and processing status
- GET /health — availability check
Technical Challenges
- Chunking Strategy: Implemented character-based splitting with word-boundary snapping to preserve semantic coherence and avoid mid-word cuts that degrade embedding quality.
- LLM Provider Abstraction: Designed a pluggable provider interface so the backend LLM can be swapped (OpenAI, local models, etc.) without touching the API or chunking logic.
- FAISS Integration: Managed the in-memory FAISS index lifecycle alongside SQLite metadata, so document state survives restarts.
- Production Hardening: Built with proper error handling, input validation, health endpoints, and a test suite — not a prototype.
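The character-based splitter with word-boundary snapping described above can be sketched as follows; the `chunk_size` default and function name are illustrative, not the project's actual values.

```python
def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into roughly chunk_size-character pieces, snapping each
    cut back to the last whitespace so no word is severed mid-way."""
    chunks = []
    start = 0
    n = len(text)
    while start < n:
        end = min(start + chunk_size, n)
        if end < n and not text[end].isspace():
            # Snap the cut back to the last space inside the window.
            snap = text.rfind(" ", start, end)
            if snap > start:
                end = snap
            # If a single word exceeds chunk_size, fall back to a hard cut.
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        start = end + 1 if end < n else n
    return chunks
```

Snapping to whitespace keeps each chunk a run of whole words, which avoids the degraded embeddings that mid-word cuts produce.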
Impact
Demonstrates production RAG engineering beyond tutorials: real chunking decisions, real vector index management, real provider abstraction, and a proper REST API contract. Applicable to enterprise document Q&A, legal review, internal knowledge bases, and any domain with high document volume.