Andrej Baranovskij Blog

Visual LLM Structured Output Validation with Sparrow

November 17, 2024, 7:24 am

I explain how Sparrow validates the structured output of visual LLMs to ensure it complies with the JSON schema provided in the query. This process helps prevent errors and hallucinations generated by...

Batch Inference with Qwen2 Vision LLM (Sparrow)

November 25, 2024, 12:38 am

I share several tips for optimizing Qwen2 Vision LLM performance for batch processing.

Sparrow Apple MLX Backend on Mac Mini M4 (Qwen2 72B 4bit)

December 3, 2024, 7:12 am

I show how I’m running the Qwen2 72B 4bit model locally on a Mac Mini M4 for Sparrow’s backend. MLX (and MLX-VLM) is the main platform I’m using for local data extraction in Sparrow.

Structured Output from Multipage PDF with Sparrow (Qwen2 Vision LLM and MLX)

December 9, 2024, 12:23 am

I explain how multipage PDFs are handled in Sparrow to extract structured data in a single call.

Streamlined Table Data Extraction with Sparrow | Table Transformer, Qwen2 VL,...

December 17, 2024, 12:08 am

Learn how to streamline table data extraction with Sparrow, Table Transformer, Qwen2 VL, and MLX on the Mac Mini M4 Pro. Simplify your workflow and get accurate results!

Stateless MLX Inference with FastAPI in Sparrow

December 23, 2024, 6:11 am

I show how to run inference with MLX in stateless mode, where the loaded model is released after inference completes. This is useful when inference requests are infrequent, and it helps to reclaim...

Vision LLM Structured Output with Sparrow

January 14, 2025, 4:06 am

I show how the Sparrow UI Shell works with both image and PDF documents to process and extract structured data with a Vision LLM (Qwen2) in the MLX backend.

Apple MLX Vision LLM Server with Ngrok, FastAPI and Sparrow

January 20, 2025, 12:01 am

I show how I run the Apple MLX backend on my local Mac Mini M4 Pro 64GB and access it from the Web through Ngrok, with an automatically provisioned HTTPS certificate.

Improving Qwen-VL Structured Output with Image Cropping

January 28, 2025, 12:07 am

I explain how I improve structured output results from Qwen-VL with image cropping in Sparrow.

Building Web UI Apps with Python Gradio – A Java Developer’s Perspective

February 4, 2025, 4:52 am

I explain how to build Web UI apps with the Python Gradio framework. I used to work with Java and built enterprise Web UI apps with JSF. Based on this experience, I can tell Gradio is...

Structured Data Extraction with Sparrow Agent: Vision LLM & Prefect in Action

February 10, 2025, 5:15 am

Discover how to streamline your data extraction process with Sparrow Agent! In this tutorial, I showcase how Sparrow Agent leverages Vision LLM to intelligently handle complex data tasks, while Prefect...

Querying Non Existing Fields with Qwen2.5 Vision LLM

March 3, 2025, 5:16 am

I describe how Sparrow helps to query non-existent fields with the Qwen2.5 Vision LLM. I run it locally with MLX and MLX-VLM.

Building AI Agent for Local Structured JSON Output

March 12, 2025, 5:19 am

I explain the key steps of building an AI agent that processes a document and extracts structured JSON data locally. I'm running it with Sparrow and using the Qwen VL model for the vision processing backend and OCR. The...

Temporary Files Cleaner for Gradio Web App

March 17, 2025, 12:20 am

Learn how to implement an automatic temporary file cleanup solution for Gradio web applications. This tutorial shows you how to prevent disk space issues by periodically removing old upload files and...

Oracle DB 23ai Free Connection Pool in Python

March 25, 2025, 12:56 pm

I describe how to connect to Oracle DB from Python and explain why a DB connection pool is important for better performance. The connection uses the oracledb thin mode, which does not require installing Oracle Client.

Extract Structured Data from Documents with Sparrow (Free Tier Available)

March 30, 2025, 11:49 pm

I built Sparrow for document data extraction 🚀 It's fully open-source and runs locally on your machine. You can extract structured data from any document using powerful Mistral 24B 8bit and Qwen 2.5 72B...

Dashboard with Gradio Python

April 15, 2025, 12:31 pm

This video showcases the Sparrow dashboard, where you can view statistics on document data extraction events processed by Sparrow. This elegant dashboard is built with Python using Gradio, a...

Running Vision Models on Apple Silicon with MLX-VLM

April 22, 2025, 11:32 am

I show how to run Qwen and Mistral vision models on Apple Silicon with MLX-VLM. I share technical tips for running both models and show how to pass a query prompt.

Vision LLM on Mac Mini M4 Pro: Real-World MLX Performance

April 28, 2025, 12:17 am

I discuss the real-world MLX performance of Sparrow for structured data extraction with public access. The current Sparrow online instance runs on a Mac Mini M4 Pro with 64GB of memory. On average, it...

Local LLM Instruction Processing with Sparrow

May 4, 2025, 11:28 pm

I explain how to execute instructions with a payload using a local LLM. This is useful when you want to process your data with an LLM and provide contextual instructions, specifying the desired outcome...

LLM Microservice with Instruction Calling

May 13, 2025, 12:05 am

I describe the idea of interacting with an LLM through a microservice with instruction calling. This works great for enterprise application use cases, such as data validation,...

Structured Data Annotation with Qwen2.5 VL and MLX-VLM

May 18, 2025, 11:31 pm

Qwen2.5 VL can provide bounding box coordinates and confidence values for extracted structured data. This is useful for visual data review and reporting. I will explain with a practical example what...

Box Annotations in Sparrow for Structured Data Extraction

May 26, 2025, 4:47 am

Check out my video on Box Annotations in Sparrow for Structured Data Extraction! I’ll show you how the Qwen2.5 vision model pulls bounding box annotations from images based on what you need. Plus,...

PaddleOCR 3.0: Supercharge Your AI

June 2, 2025, 11:19 pm

I upgraded to PaddleOCR 3.0, and I explain the new PaddleOCR API integration. My goal is to integrate the OCR output with Vision LLM processing to enhance large-scale, structured table data output.

Solving Vision LLM Number Formatting Issues Using PaddleOCR and Sparrow

June 10, 2025, 5:00 am

Discover how to fix number formatting errors in vision LLMs like Mistral! In this video, I show how Mistral misreads "56,000" as "56000" and how combining PaddleOCR’s text extraction with Sparrow’s...
