
AI Executive Assistant

Project Summary

Type: Portfolio / Demo Project
Focus: Production-Grade AI Pipeline

Key Features:

  • End-to-end AI pipeline with email ingestion and classification (95%+ accuracy)
  • Advanced retrieval: Contextual Retrieval, Self-Query, Reciprocal Rank Fusion
  • Custom workflow graphs with self-correction mechanisms
  • Multi-model orchestration with type-safe structured outputs
  • Full observability pipeline (token usage, latency, quality tracking)
  • Security layer with PII detection and prompt injection protection

This project demonstrates a production-ready AI system that ingests emails, classifies them with high accuracy, and retrieves context from a 10,000+ chunk knowledge base using advanced RAG techniques. It was built from scratch, including custom workflow orchestration in the style of LangGraph, to showcase enterprise-grade AI architecture.

The Problem

Most RAG implementations are "weekend demos"—they work in a notebook but fall apart in production. Real enterprise deployments need:

  • Robust retrieval that handles ambiguous queries
  • Multiple retrieval strategies that complement each other
  • Scalable architecture that doesn't break under load
  • Clear separation of concerns for maintainability

Architecture

flowchart TB
    subgraph ingestion [Email Ingestion & Processing]
        emails[Email Input] --> classifier[Email Classifier<br/>95%+ Accuracy]
        classifier --> chunker[Smart Chunker]
        chunker --> enricher[Context Enricher]
        enricher --> embedder[Embedding Generator]
        embedder --> vectordb[(pgVector<br/>10,000+ Chunks)]
    end

    subgraph retrieval [Advanced Retrieval]
        query[User Query] --> contextual[Contextual Retrieval]
        query --> selfquery[Self-Query]
        query --> semantic[Semantic Search]
        contextual --> rrf[RRF Fusion]
        selfquery --> rrf
        semantic --> rrf
        vectordb --> contextual
        vectordb --> selfquery
        vectordb --> semantic
    end

    subgraph orchestration [Workflow Orchestration]
        rrf --> workflow[Custom Workflow Graph<br/>Self-Correction]
        workflow --> multi[Multi-Model<br/>Orchestration]
        multi --> structured[Type-Safe<br/>Structured Outputs]
    end

    subgraph observability [Observability Layer]
        structured --> langfuse[Langfuse<br/>Token Usage<br/>Latency<br/>Quality Tracking]
    end

    subgraph security [Security Layer]
        langfuse --> pii[PII Detection]
        langfuse --> injection[Prompt Injection<br/>Protection LLM Guard]
        pii --> response[Response]
        injection --> response
    end

Technical Approach

Email Classification Pipeline

The system ingests emails and classifies them with 95%+ accuracy using a multi-stage classification pipeline. This ensures that only relevant emails trigger the RAG retrieval process, reducing noise and improving response quality.
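A minimal sketch of what a multi-stage pipeline like this can look like: a cheap rule-based first pass short-circuits obvious categories, and only undecided emails escalate to a model-based stage. All names, heuristics, and categories here are illustrative assumptions, not the project's actual code; the `llm_stage` is a stand-in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Optional

SPAM_MARKERS = ("unsubscribe", "limited offer", "winner")  # illustrative heuristics only

@dataclass
class Email:
    sender: str
    subject: str
    body: str

def rule_based_stage(email: Email) -> Optional[str]:
    """Cheap first pass: decide obvious categories without any LLM call."""
    text = f"{email.subject} {email.body}".lower()
    if any(marker in text for marker in SPAM_MARKERS):
        return "spam"
    if email.sender.endswith("@noreply.example.com"):
        return "notification"
    return None  # undecided: escalate to the model-based stage

def llm_stage(email: Email) -> str:
    """Stand-in for the model-based classifier (the real system calls an LLM here)."""
    return "actionable" if "?" in email.body else "informational"

def classify(email: Email) -> str:
    return rule_based_stage(email) or llm_stage(email)
```

Because most traffic is resolved by the rule-based stage, the expensive model stage only sees the ambiguous tail, which is how a layered pipeline keeps both cost and latency down.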

Advanced Retrieval Techniques

The system employs three complementary retrieval strategies:

  • Contextual Retrieval: Enriches document chunks with context about their position in the broader document structure, dramatically improving retrieval accuracy for complex queries
  • Self-Query: Allows the system to decompose complex queries into structured filters and semantic search components
  • Reciprocal Rank Fusion (RRF): Combines results from multiple retrieval strategies, giving better results than any single method alone
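Of the three strategies, RRF is the simplest to show concretely. The standard formulation scores each document as the sum of 1/(k + rank) over every ranked list it appears in (k = 60 is the conventional default). This is a generic sketch of that formula, not the project's implementation:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists: each document earns 1/(k + rank) per list it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

Documents ranked highly by several strategies outscore documents that top only one list, which is why the fused ranking tends to beat any single retriever.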

Custom Workflow Orchestration

The system uses custom workflow graphs built from scratch (similar to LangGraph) with self-correction mechanisms. When an initial output doesn't meet quality thresholds, the system detects this and automatically retries with adjusted parameters or alternative strategies.
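A self-correcting graph of this kind can be sketched in a few lines: nodes are functions over a shared state, and a node whose output falls below a quality threshold is re-run with adjusted parameters before the graph advances. This is a toy illustration under assumed names; the `generate` node fakes an LLM whose quality improves as temperature drops.

```python
from typing import Callable

class WorkflowGraph:
    """Minimal LangGraph-style graph: named nodes, edges, and a conditional retry loop."""

    def __init__(self) -> None:
        self.nodes: dict[str, Callable[[dict], dict]] = {}
        self.edges: dict[str, str] = {}

    def add_node(self, name: str, fn: Callable[[dict], dict]) -> None:
        self.nodes[name] = fn

    def add_edge(self, src: str, dst: str) -> None:
        self.edges[src] = dst

    def run(self, start: str, state: dict, max_retries: int = 2) -> dict:
        node: str | None = start
        while node:
            state = self.nodes[node](state)
            # Self-correction: if the node flags low quality, re-run it with adjusted params
            if state.get("quality", 1.0) < 0.8 and state.get("retries", 0) < max_retries:
                state["retries"] = state.get("retries", 0) + 1
                state["temperature"] = max(0.0, state.get("temperature", 0.7) - 0.2)
                continue  # retry the same node
            node = self.edges.get(node)
        return state

def generate(state: dict) -> dict:
    # Stand-in for an LLM call: pretend quality improves on each corrected retry
    state["quality"] = 0.6 + 0.2 * state.get("retries", 0)
    return state
```

Running `generate` through the graph, the first pass scores 0.6, triggers one retry at lower temperature, and the second pass clears the 0.8 threshold.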

Multi-Model Orchestration

The system orchestrates multiple LLM calls with type-safe structured outputs, ensuring consistent data formats and enabling complex multi-step reasoning workflows.
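"Type-safe structured outputs" usually means validating a model's raw JSON reply against a declared schema before it enters the pipeline. A minimal stdlib-only sketch (production code would more likely use Pydantic; the `ActionItem` schema is an invented example):

```python
import json
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class ActionItem:
    task: str
    owner: str
    priority: int

def parse_structured(raw: str) -> ActionItem:
    """Validate a model's JSON reply against the schema before it enters the pipeline."""
    data = json.loads(raw)
    expected = {f.name for f in fields(ActionItem)}
    if set(data) != expected:
        raise ValueError(f"keys {set(data)} do not match schema {expected}")
    item = ActionItem(**data)
    if not isinstance(item.priority, int):
        raise TypeError("priority must be an int")
    return item
```

Rejecting malformed replies at the boundary means every downstream step can rely on a known shape, which is what makes multi-step reasoning workflows composable.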

Production Architecture

  • Clean separation between ingestion, retrieval, orchestration, and generation
  • Comprehensive logging for debugging and monitoring
  • Graceful error handling at every layer with automatic retries
  • Configuration-driven behavior for easy deployment
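Configuration-driven behavior typically means one frozen config object, hydrated from the environment, drives every stage, so deployments differ only by environment variables. A sketch with invented variable names and defaults:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class PipelineConfig:
    """Single config object for all stages; variable names here are illustrative."""
    embedding_model: str = "text-embedding-3-small"
    top_k: int = 5
    rrf_k: int = 60
    max_retries: int = 2

    @classmethod
    def from_env(cls) -> "PipelineConfig":
        return cls(
            embedding_model=os.getenv("EMBEDDING_MODEL", cls.embedding_model),
            top_k=int(os.getenv("RETRIEVAL_TOP_K", cls.top_k)),
            rrf_k=int(os.getenv("RRF_K", cls.rrf_k)),
            max_retries=int(os.getenv("MAX_RETRIES", cls.max_retries)),
        )
```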

Results: Naive vs Advanced Approach

| Metric | Naive RAG | This Implementation |
| --- | --- | --- |
| Email classification accuracy | ~70% | 95%+ |
| Knowledge base size | < 1,000 chunks | 10,000+ chunks |
| Retrieval accuracy on ambiguous queries | ~60% | ~85% |
| Retrieval strategies | Single method | 3 methods + RRF fusion |
| Error recovery | Crashes | Self-correction mechanisms |
| Observability | Basic logging | Full pipeline tracking |
| Security | None | PII detection + injection protection |
| Production readiness | Manual deployment | Docker + CI/CD ready |

Observability

A full observability pipeline built with Langfuse tracks:

  • Token Usage: Monitor API costs and usage patterns across all LLM calls
  • Latency: Track response times at each stage of the pipeline
  • Quality Metrics: Measure retrieval relevance, classification accuracy, and response quality
  • Error Tracking: Comprehensive error logging with context for debugging

This observability layer enables data-driven optimization and cost management, critical for production AI systems.
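The shape of such tracking can be sketched without the Langfuse SDK: record per-stage latency and token counts locally, then ship the summaries to the observability backend. This is an illustrative stand-in, not the Langfuse API:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class PipelineTracker:
    """Records per-stage latency and token counts; a real deployment exports these to Langfuse."""

    def __init__(self) -> None:
        self.latency_ms: dict[str, float] = defaultdict(float)
        self.tokens: dict[str, int] = defaultdict(int)

    @contextmanager
    def stage(self, name: str):
        """Time a pipeline stage as a context manager."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.latency_ms[name] += (time.perf_counter() - start) * 1000

    def add_tokens(self, stage: str, prompt: int, completion: int) -> None:
        self.tokens[stage] += prompt + completion

    def summary(self) -> dict:
        return {
            s: {"latency_ms": round(self.latency_ms[s], 2), "tokens": self.tokens[s]}
            for s in self.latency_ms
        }
```

Aggregating by stage is what turns raw logs into cost and latency budgets: once you know retrieval costs X tokens and Y ms per request, optimization becomes a data question rather than guesswork.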

Security

Enterprise-grade security layer protects against common AI vulnerabilities:

  • PII Detection: Automatically detects and redacts personally identifiable information before processing
  • Prompt Injection Protection: LLM Guard integration prevents malicious prompt injection attacks
  • Input Validation: Strict validation at every pipeline stage

These security measures ensure the system can handle sensitive enterprise data safely.
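As a flavor of what PII redaction does, here is a deliberately simplified regex sketch that replaces detected spans with typed placeholders before any LLM call. The patterns are illustrative only; production systems rely on dedicated scanners such as LLM Guard rather than hand-rolled regexes.

```python
import re

# Illustrative patterns only; real deployments use dedicated PII scanners
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace detected PII spans with typed placeholders before any LLM call."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Redacting before the model call, rather than after, means sensitive data never reaches a third-party API at all.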

Tech Stack

Python FastAPI Celery OpenAI API PostgreSQL pgVector Docker Langfuse LLM Guard

Video Walkthrough

Coming Soon

A 5-10 minute video demo walking through the architecture and showing the system in action. Check back soon!

Key Learnings

This project reinforced that the gap between "working demo" and "production system" is where most AI projects fail. The techniques I've implemented here—contextual retrieval, RRF, robust error handling—are exactly what enterprises need but rarely get from typical AI vendors.

Want something like this for your company?

I build production-ready RAG systems for scale-up companies. Let's discuss your AI challenges.

Book Free Intro Call