Personal Project
Document Processing AI
Production-grade RAG API — upload documents, query them in natural language
Python · FastAPI · FAISS · SQLite · RAG · LLM Abstraction · REST API
Problem
Enterprise document workflows are slow and require manual reading. This API lets users ingest PDFs and plain-text files, then ask natural-language questions against them — no manual indexing, no prompt injection risk from raw document dumping. Built with a production-first mindset: clean REST contract, persistent metadata, and a pluggable LLM backend.
Architecture
Client (HTTP)
↓
FastAPI REST Layer
↓
Chunking Engine (character-based, word-boundary snap)
↓
FAISS Vector Index (embedding + similarity search)
        ↓                            ↓
SQLite (document metadata)    Abstracted LLM Provider
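The "Abstracted LLM Provider" box above can be sketched as a small provider interface. This is a minimal illustration, not the project's actual code; the names `LLMProvider`, `EchoProvider`, and `answer` are assumptions.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Anything that can turn a question plus retrieved chunks into an answer."""

    def complete(self, prompt: str, context: list[str]) -> str:
        ...


class EchoProvider:
    """Stand-in provider for tests: echoes the retrieved context back."""

    def complete(self, prompt: str, context: list[str]) -> str:
        joined = " ".join(context)
        return f"Q: {prompt} | grounded on: {joined}"


def answer(provider: LLMProvider, question: str, chunks: list[str]) -> str:
    # The REST layer depends only on the Protocol, so OpenAI, a local
    # model, or a test double can be swapped in without touching routes.
    return provider.complete(question, chunks)
```

Because `Protocol` uses structural typing, any object with a matching `complete` method satisfies the interface without inheriting from it.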
API Endpoints
- POST /docs — ingest a PDF or plain-text file into the vector index
- POST /query — ask a natural-language question; returns a grounded answer
- GET /docs/{id} — retrieve document metadata and processing status
- GET /health — availability check
Technical Challenges
- Chunking Strategy: Implemented character-based splitting with word-boundary snapping to preserve semantic coherence and avoid mid-word cuts that degrade embedding quality.
- LLM Provider Abstraction: Designed a pluggable provider interface so the backend LLM can be swapped (OpenAI, local models, etc.) without touching the API or chunking logic.
- FAISS Integration: Managed the in-memory FAISS index lifecycle alongside SQLite metadata, so document state survives restarts.
- Production Hardening: Built with proper error handling, input validation, health endpoints, and a test suite — not a prototype.
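The character-based splitter with word-boundary snapping described above can be sketched as follows; the `chunk_size` default and function name are illustrative, not the project's actual values.

```python
def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into roughly chunk_size-character pieces, snapping each
    cut back to the last whitespace so no word is severed mid-way."""
    chunks = []
    start = 0
    n = len(text)
    while start < n:
        end = min(start + chunk_size, n)
        if end < n and not text[end].isspace():
            # Snap the cut back to the last space inside the window.
            snap = text.rfind(" ", start, end)
            if snap > start:
                end = snap
            # If a single word exceeds chunk_size, fall back to a hard cut.
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        start = end + 1 if end < n else n
    return chunks
```

Snapping to whitespace keeps each chunk a run of whole words, which avoids the degraded embeddings that mid-word cuts produce.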
Impact
Demonstrates production RAG engineering beyond tutorials: real chunking decisions, real vector index management, real provider abstraction, and a proper REST API contract. Applicable to enterprise document Q&A, legal review, internal knowledge bases, and any domain with high document volume.