Personal Project

Document Processing AI

Production-grade RAG API — upload documents, query them in natural language

Python · FastAPI · FAISS · SQLite · RAG · LLM Abstraction · REST API

Problem

Enterprise document workflows are slow and require manual reading. This API lets users ingest PDFs and plain-text files, then ask natural-language questions against them — no manual indexing, no prompt injection risk from raw document dumping. Built with a production-first mindset: clean REST contract, persistent metadata, and a pluggable LLM backend.

Architecture

Client (HTTP)
        ↓
FastAPI REST Layer
        ↓
Chunking Engine (character-based, word-boundary snap)
        ↓
FAISS Vector Index (embedding + similarity search)
        ↓                              ↓
SQLite (document metadata)     Abstracted LLM Provider

API Endpoints

  • POST /docs — ingest a PDF or plain-text file into the vector index
  • POST /query — ask a natural-language question, returns a grounded answer
  • GET /docs/{id} — retrieve document metadata and status
  • GET /health — availability check

Technical Challenges

  • Chunking Strategy: Implemented character-based splitting with word-boundary snapping to preserve semantic coherence and avoid mid-word cuts that degrade embedding quality.
  • LLM Provider Abstraction: Designed a pluggable provider interface so the backend LLM can be swapped (OpenAI, local models, etc.) without touching the API or chunking logic.
  • FAISS Integration: Managed the in-memory FAISS index lifecycle alongside SQLite metadata, so document state survives restarts rather than living only in process memory.
  • Production Hardening: Built with proper error handling, input validation, health endpoints, and a test suite — not a prototype.
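The chunking strategy above can be sketched roughly as follows. The function name and parameters are illustrative assumptions, not the project's actual API:

```python
def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Character-based splitting with word-boundary snapping: each cut is
    pulled back to the last whitespace before the chunk_size limit, so no
    word is split mid-way (which would degrade embedding quality)."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Snap the cut back to the nearest word boundary, if one exists.
            boundary = text.rfind(" ", start, end)
            if boundary > start:
                end = boundary
        # If no whitespace was found (one very long token), we fall through
        # to a hard character cut as a degenerate-case fallback.
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        start = end
    return chunks
```

A real implementation might also add inter-chunk overlap for context continuity; the snap-to-boundary step is the part that preserves semantic coherence.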
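One common way to structure the pluggable provider interface described above is a single abstract contract that the API layer depends on. Class and method names here are illustrative assumptions, not the project's actual code:

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Backend-agnostic interface: the REST and chunking layers depend
    only on this, so the concrete LLM can be swapped freely."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class EchoProvider(LLMProvider):
    """Trivial stand-in provider, useful in tests; an OpenAI or
    local-model provider would implement the same one-method contract."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


def answer_query(question: str, context_chunks: list[str], provider: LLMProvider) -> str:
    # Ground the answer in the retrieved chunks; which LLM runs the
    # completion is entirely the provider's concern.
    prompt = "Context:\n" + "\n".join(context_chunks) + f"\n\nQuestion: {question}"
    return provider.complete(prompt)
```

Because providers only see a prompt string, swapping OpenAI for a local model touches one class and nothing else.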
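The durability half of the FAISS-plus-SQLite pairing might look like the sketch below: SQLite carries the document metadata, while the vector index itself can be written to disk separately (FAISS provides `faiss.write_index` / `faiss.read_index` for this). Table schema and helper names are hypothetical:

```python
import sqlite3


def init_metadata_db(path: str = ":memory:") -> sqlite3.Connection:
    """SQLite holds durable per-document metadata; the in-memory FAISS
    index is persisted separately and reloaded alongside it on startup."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id INTEGER PRIMARY KEY,
               filename TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'ingested',
               chunk_count INTEGER NOT NULL
           )"""
    )
    return conn


def record_document(conn: sqlite3.Connection, filename: str, chunk_count: int) -> int:
    # Insert metadata for a newly ingested document and return its id,
    # which GET /docs/{id} would later look up.
    cur = conn.execute(
        "INSERT INTO documents (filename, chunk_count) VALUES (?, ?)",
        (filename, chunk_count),
    )
    conn.commit()
    return cur.lastrowid


def get_document(conn: sqlite3.Connection, doc_id: int):
    # Fetch metadata and status for one document, or None if unknown.
    return conn.execute(
        "SELECT id, filename, status, chunk_count FROM documents WHERE id = ?",
        (doc_id,),
    ).fetchone()
```

Keeping metadata in SQLite rather than in process memory is what makes restarts cheap: rebuild or reload the index, and the document records are already there.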

Impact

Demonstrates production RAG engineering beyond tutorials: real chunking decisions, real vector index management, real provider abstraction, and a proper REST API contract. Applicable to enterprise document Q&A, legal review, internal knowledge bases, and any domain with high document volume.