AI-Powered Document Assistant with RAG
Production-ready Retrieval-Augmented Generation (RAG) system with multi-format document parsing (PDF, DOCX, CSV with link extraction), intelligent semantic chunking, and vector search using Qdrant. Features Azure OpenAI integration with conversation history, content filtering, adaptive garbage collection, and API monitoring with performance alerts. Built with an async Quart backend, multiple embedding providers (SentenceTransformers, Azure, FastEmbed), and comprehensive lifecycle management.
Python Quart Qdrant Azure OpenAI RAG Vector Search NLP
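To illustrate the chunk-then-retrieve flow described above, here is a minimal, dependency-free sketch. It is not the project's implementation: the function names (`chunk_sentences`, `embed`, `retrieve`) are hypothetical, the bag-of-words "embedding" stands in for a real embedding provider, and the brute-force cosine ranking stands in for a Qdrant query.

```python
import math
import re
from collections import Counter


def chunk_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a real system calls an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and return the best top_k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]
```

In a RAG setup, the chunks returned by `retrieve` would be placed in the prompt sent to the language model; swapping `embed` for a dense model and the sort for a vector-database query keeps the same overall shape.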
Document Intelligence Pipeline
Local document processing pipeline for PDF, DOCX, and scanned images: OCR (Tesseract), text extraction, table detection and extraction, key-value parsing, entity recognition (spaCy), layout analysis, and summarization. Results are produced as structured JSON suitable for downstream ingestion (data lakes, BI, search indexes). The repository is private; contact guch79@gmail.com for access and commercial options.
Python Quart OCR Tesseract spaCy PDF processing NLP
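As a rough idea of what the key-value parsing step can look like, the sketch below turns "Label: value" lines from extracted text into JSON. The regular expression, the function names, and the key normalisation are assumptions made for illustration; the private repository may handle this quite differently.

```python
import json
import re

# Matches lines such as "Invoice No: 1042" (label, colon, value).
KEY_VALUE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 /_-]*?)\s*:\s*(.+?)\s*$")


def parse_key_values(text: str) -> dict[str, str]:
    """Collect 'Label: value' lines into a dict with snake_case keys."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match:
            key = match.group(1).strip().lower().replace(" ", "_")
            fields[key] = match.group(2)
    return fields


def to_json(text: str) -> str:
    """Serialise the parsed fields as structured JSON for downstream use."""
    return json.dumps(parse_key_values(text), indent=2)
```

Lines without a colon are ignored, so free text from OCR output passes through without producing spurious fields.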
Professional Portfolio & CV Generator
Fully asynchronous Quart web application that auto-generates a PDF CV from dynamic content. Features intelligent HTML-to-PDF conversion with structured data extraction, professional formatting, and clickable links. Built with modern async Python patterns.
Quart Uvicorn FPDF2 Async Python Jinja2
Apology-as-a-Service (MCP Server)
A live Model Context Protocol (MCP) server that provides context-aware crisis communication for AI agents. Test it live right here with the 'Generate Live Apology' button, or download the config to connect your own agent. Features multiple severity levels, styles (including Haiku), and SSE support.
Python MCP Protocol SSE Docker Async FastMCP
ETL Data Pipeline - Dataverse to SQL Server
Complete ETL pipeline that extracts data from Microsoft Dataverse, transforms it with business logic, and loads it into SQL Server. Includes fake data generation with Faker for testing before production deployment. Features parallel processing, connection pooling, and circuit breakers.
Python Pandas SQLAlchemy Dataverse Faker
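The circuit-breaker pattern mentioned above can be sketched in a few lines. This is a generic illustration, not the pipeline's code: the class name, the `max_failures` and `reset_after` parameters, and the injectable `clock` are all assumptions chosen to keep the example small and testable.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

Wrapping each load step, for example `breaker.call(load_batch, rows)` with a hypothetical `load_batch` function, lets the pipeline stop sending requests to a failing database for the cool-down period instead of retrying every batch.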