Visual LLM Structured Output Validation with Sparrow
I explain how Sparrow validates the structured output of visual LLMs to ensure it complies with the JSON schema provided in the query. This process helps prevent errors and hallucinations generated by...
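As a rough illustration of the idea (not Sparrow's actual implementation), the check can be sketched as a comparison between the model's JSON output and the fields requested in the query. The schema format (field name mapped to a Python type), `EXAMPLE_SCHEMA`, and `validate_output` are hypothetical names chosen for this sketch.

```python
import json

# Hypothetical schema format: field name -> expected Python type.
# This is an illustration only, not Sparrow's actual schema syntax.
EXAMPLE_SCHEMA = {"invoice_number": str, "total": float}


def _matches(value, expected_type):
    """Type check that keeps booleans apart from numbers."""
    if isinstance(value, bool):
        return expected_type is bool
    if expected_type is float:
        # JSON has a single number type, so accept integers for float fields.
        return isinstance(value, (int, float))
    return isinstance(value, expected_type)


def validate_output(raw_text, schema):
    """Return a list of problems found in the model output (empty if valid)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for field, expected_type in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not _matches(data[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    # Fields the schema never asked for are treated as likely hallucinations.
    for field in data:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems
```

An empty list means the output matches the requested fields; any other result can be used to reject or retry the response.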
Batch Inference with Qwen2 Vision LLM (Sparrow)
I share several hints on how to optimize Qwen2 Visual LLM performance for batch processing.
Sparrow Apple MLX Backend on Mac Mini M4 (Qwen2 72B 4bit)
I show how I’m running the Qwen2 72B 4bit model locally on a Mac Mini M4 for Sparrow’s backend. MLX (and MLX-VLM) is the main platform I’m using for local data extraction in Sparrow.
Structured Output from Multipage PDF with Sparrow (Qwen2 Vision LLM and MLX)
I explain how multipage PDFs are handled in Sparrow to extract structured data in a single call.
Streamlined Table Data Extraction with Sparrow | Table Transformer, Qwen2 VL,...
Learn how to streamline table data extraction with Sparrow, Table Transformer, Qwen2 VL, and MLX on the Mac Mini M4 Pro. Simplify your workflow and get accurate results!
Stateless MLX Inference with FastAPI in Sparrow
I show how to run inference with MLX in stateless mode, where the loaded model is released after inference completes. This is useful when inference requests are infrequent, and it helps to reclaim...
Vision LLM Structured Output with Sparrow
I show how Sparrow UI Shell works with both image and PDF docs to process and extract structured data with Vision LLM (Qwen2) in the MLX backend.
Apple MLX Vision LLM Server with Ngrok, FastAPI and Sparrow
I show how I run the Apple MLX backend on my local Mac Mini M4 Pro 64GB and access it from the Web through Ngrok, with an automatically provisioned HTTPS certificate.
Improving Qwen-VL Structured Output with Image Cropping
I explain how I improve structured output results from Qwen-VL with image cropping in Sparrow.
Building Web UI Apps with Python Gradio – A Java Developer’s Perspective
I explain how to build Web UI apps with the Python Gradio framework. I used to work with Java, building enterprise Web UI apps with JSF. Based on this experience I can tell, Gradio is...
Structured Data Extraction with Sparrow Agent: Vision LLM & Prefect in Action
Discover how to streamline your data extraction process with Sparrow Agent! In this tutorial, I showcase how Sparrow Agent leverages Vision LLM to intelligently handle complex data tasks, while Prefect...
Querying Non Existing Fields with Qwen2.5 Vision LLM
I describe how Sparrow helps to query non-existing fields with the Qwen2.5 Vision LLM, running it locally with MLX and MLX-VLM.
Building AI Agent for Local Structured JSON Output
I explain the key steps of building an AI agent that processes a document and extracts structured JSON data locally. I'm running it with Sparrow and using the Qwen VL model for the vision processing backend and OCR. The...
Temporary Files Cleaner for Gradio Web App
Learn how to implement an automatic temporary file cleanup solution for Gradio web applications. This tutorial shows you how to prevent disk space issues by periodically removing old upload files and...
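A minimal sketch of the cleanup step, assuming files are judged by their modification time. The function name `remove_old_files` and its parameters are hypothetical, and the code is independent of Gradio itself; it only shows the age-based removal idea described above.

```python
import time
from pathlib import Path


def remove_old_files(directory, max_age_seconds, now=None):
    """Delete regular files under `directory` older than `max_age_seconds`.

    Age is measured from the file's modification time.
    Returns the names of the removed files, sorted.
    """
    now = time.time() if now is None else now
    removed = []
    # Materialise the listing first so deletions do not disturb the traversal.
    for path in list(Path(directory).rglob("*")):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

To make the cleanup periodic, a function like this could be called on a timer or from a background thread alongside the web app.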
Oracle DB 23ai Free Connection Pool in Python
I describe how to connect to Oracle DB from Python. I explain why a DB connection pool is important for better performance. The connection uses the oracledb thin mode, without installing Oracle Client.
Extract Structured Data from Documents with Sparrow (Free Tier Available)
I built Sparrow for document data extraction 🚀 It's fully open-source and runs locally on your machine. You can extract structured data from any document using powerful Mistral 24B 8bit and Qwen 2.5 72B...
Dashboard with Gradio Python
This video showcases the Sparrow dashboard, where you can view statistics on document data extraction events processed by Sparrow. This elegant dashboard is built with Python using Gradio, a...
Running Vision Models on Apple Silicon with MLX-VLM
I show and explain how to run Qwen and Mistral vision models on Apple Silicon with MLX-VLM. I share technical tips on running both models and show how to pass a query prompt.
Vision LLM on Mac Mini M4 Pro: Real-World MLX Performance
I discuss the real-world MLX performance of Sparrow for structured data extraction with public access. The current Sparrow online instance runs on a Mac Mini M4 Pro with 64GB of memory. On average, it...
Local LLM Instruction Processing with Sparrow
I explain how to execute instructions with a payload using a local LLM. This is useful when you want to process your data with an LLM and provide contextual instructions, specifying the desired outcome...
LLM Microservice with Instruction Calling
I describe the idea of implementing interaction with an LLM as a microservice with instruction calling. This works great for enterprise application use cases, such as data validation,...
Structured Data Annotation with Qwen2.5 VL and MLX-VLM
Qwen2.5 VL can provide bounding box coordinates and confidence values for extracted structured data. This is useful for visual data review and reporting. I will explain with a practical example what...
Box Annotations in Sparrow for Structured Data Extraction
Check out my video on Box Annotations in Sparrow for Structured Data Extraction! I’ll show you how the Qwen2.5 vision model pulls bounding box annotations from images based on what you need. Plus,...
PaddleOCR 3.0: Supercharge Your AI
I upgraded to PaddleOCR 3.0 and explain the new PaddleOCR API integration. My goal is to integrate OCR result output with Vision LLM processing to enhance large-scale, structured table data output.
Solving Vision LLM Number Formatting Issues Using PaddleOCR and Sparrow
Discover how to fix number formatting errors in vision LLMs like Mistral! In this video, I show how Mistral misreads "56,000" as "56000" and how combining PaddleOCR’s text extraction with Sparrow’s...
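One way to picture the "56,000" versus "56000" fix is to look up the LLM's value among the OCR text tokens and reuse the OCR spelling when the digits agree. This is only a sketch under that assumption, not the method used in Sparrow; `restore_number_format` and the token list are hypothetical.

```python
import re


def restore_number_format(llm_value, ocr_tokens):
    """Return the OCR token whose digits match `llm_value`.

    Falls back to `llm_value` unchanged when no token matches
    or when the value contains no digits at all.
    """
    target = re.sub(r"\D", "", llm_value)
    if not target:
        return llm_value
    for token in ocr_tokens:
        # Compare digits only, ignoring separators such as "," or ".".
        if re.sub(r"\D", "", token) == target:
            return token
    return llm_value


print(restore_number_format("56000", ["Total", "56,000", "USD"]))
```

Matching on digits alone is deliberately simple: it cannot tell "56,000" from "5.6000", so a real pipeline would likely need positional or field-level context as well.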