Channel: Andrej Baranovskij Blog
Browsing latest articles
Browse All 704 View Live

Document Querying with Qwen2-VL-7B and JSON Output

In this video, I demonstrate how to perform document queries using Qwen2-VL-7B. By simplifying field names, we streamline the prompts, making them more efficient and reusable across different...

Running Qwen2 Vision LLM on Hugging Face ZeroGPU API

I explain my experience running Sparrow Parse with Qwen2 Vision LLM inference on a Hugging Face ZeroGPU instance.

Sparrow Parse Invoice Query with Vision LLM

A new Sparrow agent, Sparrow Parse, works with the Qwen2 Vision LLM. What it does: 1. Accepts a query with a JSON schema, which helps to solve a few things at once: it provides the JSON structure for the LLM to generate...

Sparrow Parse Vision LLM FastAPI Endpoint

Sparrow provides an API for accessing the Sparrow Parse agent, allowing you to run document extraction workflows directly from your existing systems. It helps simplify how data is pulled from documents...

Qwen2-VL Performance Boost

I share performance-boosting tips based on my experience using Qwen2-VL in production.  

Structured Output Example with Sparrow UI Shell

Structured output is all you need. I deployed a Sparrow demo UI with Gradio to demonstrate the output Sparrow can produce by running a JSON schema query. You can see examples for the Bonds table, Lab...

Extracting Financial Market Stock Data from Images with Vision LLM

In this video, I demonstrate how to extract financial market stock data from images using the powerful Vision LLM Qwen2, all within a Gradio interface. This setup allows quick and easy extraction of...

Visual LLM Structured Output Validation with Sparrow

I explain how Sparrow validates the structured output of visual LLMs to ensure it complies with the JSON schema provided in the query. This process helps prevent errors and hallucinations generated by...
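
The idea of checking model output against the schema sent with the query can be sketched roughly as follows. This is a minimal illustration, not Sparrow's actual implementation: the `validate_output` function and the simplified schema format (a dict mapping field names to Python types) are assumptions made for this example, and a real setup would more likely use a full JSON Schema validator.

```python
import json


def validate_output(raw_json, schema):
    """Return a list of problems; an empty list means the output matches.

    `schema` is a hypothetical, simplified format: {field_name: python_type}.
    """
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    # Every field requested in the query must be present with the right type.
    for field, expected_type in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for {field}")
    # Fields the query never asked for are treated as possible hallucinations.
    for field in data:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


invoice_schema = {"invoice_number": str, "total": float}
```

Calling `validate_output(llm_response, invoice_schema)` on each response and rejecting or retrying when the list is non-empty is one way to keep malformed or invented fields out of downstream systems.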

Batch Inference with Qwen2 Vision LLM (Sparrow)

I share several tips on how to optimize Qwen2 Vision LLM performance for batch processing.

Sparrow Apple MLX Backend on Mac Mini M4 (Qwen2 72B 4bit)

I show how I’m running the Qwen2 72B 4bit model locally on a Mac Mini M4 for Sparrow’s backend. MLX (and MLX-VLM) is the main platform I’m using for local data extraction in Sparrow.  

Structured Output from Multipage PDF with Sparrow (Qwen2 Vision LLM and MLX)

I explain how multipage PDFs are handled in Sparrow to extract structured data in a single call.  

Streamlined Table Data Extraction with Sparrow | Table Transformer, Qwen2 VL,...

Learn how to streamline table data extraction with Sparrow, Table Transformer, Qwen2 VL, and MLX on the Mac Mini M4 Pro. Simplify your workflow and get accurate results!  

Stateless MLX Inference with FastAPI in Sparrow

I show how to run inference with MLX in stateless mode, where the loaded model is released after inference completes. This is useful when inference requests are infrequent, and it helps to reclaim...
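
The stateless pattern described here (load, infer, release) can be sketched in plain Python. This is only an illustration under stated assumptions: `run_stateless`, `load_fn`, `infer_fn`, and `DummyModel` are hypothetical names, the dummy model stands in for a real MLX model, and any MLX-specific memory cleanup is deliberately left out.

```python
import gc


class DummyModel:
    """Stand-in for a real model; used only to make the sketch runnable."""

    def generate(self, prompt):
        return prompt.upper()


def run_stateless(load_fn, infer_fn, request):
    """Load a model for a single request and release it afterwards."""
    model = load_fn()  # model is loaded per request, not kept globally
    try:
        return infer_fn(model, request)
    finally:
        # Drop the only reference and ask the collector to reclaim memory.
        del model
        gc.collect()


result = run_stateless(DummyModel, lambda m, r: m.generate(r), "invoice total")
```

The trade-off is that every request pays the model loading cost, so this arrangement fits workloads where requests are infrequent and freeing memory between them matters more than latency.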

Vision LLM Structured Output with Sparrow

I show how Sparrow UI Shell works with both image and PDF docs to process and extract structured data with Vision LLM (Qwen2) in the MLX backend.  

Apple MLX Vision LLM Server with Ngrok, FastAPI and Sparrow

I show how I run the Apple MLX backend on my local Mac Mini M4 Pro 64GB and access it from the Web through Ngrok, with an automatically provisioned HTTPS certificate.

Improving Qwen-VL Structured Output with Image Cropping

I explain how I'm improving structured output results from Qwen-VL with image cropping in Sparrow.

Building Web UI Apps with Python Gradio – A Java Developer’s Perspective

I explain how to build Web UI apps with the Python Gradio framework. I used to work with Java and built enterprise Web UI apps with JSF. Based on this experience, I can tell that Gradio is...

Structured Data Extraction with Sparrow Agent: Vision LLM & Prefect in Action

Discover how to streamline your data extraction process with Sparrow Agent! In this tutorial, I showcase how Sparrow Agent leverages Vision LLM to intelligently handle complex data tasks, while Prefect...

Querying Non Existing Fields with Qwen2.5 Vision LLM

I describe how Sparrow helps to handle queries for non-existent fields with the Qwen2.5 Vision LLM, running it locally with MLX and MLX-VLM.

Building AI Agent for Local Structured JSON Output

I explain the key steps of building an AI agent to process a document and extract structured JSON data locally. I'm running it with Sparrow and using a Qwen VL model as the vision processing backend and for OCR. The...

Temporary Files Cleaner for Gradio Web App

Learn how to implement an automatic temporary file cleanup solution for Gradio web applications. This tutorial shows you how to prevent disk space issues by periodically removing old upload files and...
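
A periodic cleanup of this kind can be sketched with the standard library alone. The function names, the age threshold, and the interval below are assumptions for illustration, not the code from the tutorial; the directory would be wherever the Gradio app stores its uploads.

```python
import threading
import time
from pathlib import Path


def remove_old_files(directory, max_age_seconds, now=None):
    """Delete regular files older than `max_age_seconds`; return removed paths."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue  # leave subdirectories alone in this sketch
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink(missing_ok=True)
            removed.append(path)
    return removed


def start_cleaner(directory, max_age_seconds, interval_seconds):
    """Run the cleanup in a daemon thread; set the returned event to stop it."""
    stop = threading.Event()

    def loop():
        # wait() returns False on timeout, so the cleanup runs once per interval.
        while not stop.wait(interval_seconds):
            remove_old_files(directory, max_age_seconds)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

For example, `start_cleaner(upload_dir, max_age_seconds=3600, interval_seconds=600)` would check every ten minutes and remove files older than one hour, where `upload_dir` is a placeholder for the app's temporary directory.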

Oracle DB 23ai Free Connection Pool in Python

I describe how to connect to Oracle DB from Python and explain why a DB connection pool is important for better performance. The connection is made in python-oracledb thin mode, without installing Oracle Client.

Extract Structured Data from Documents with Sparrow (Free Tier Available)

I built Sparrow for document data extraction 🚀 It's fully open-source and runs locally on your machine. You can extract structured data from any document using powerful Mistral 24B 8bit and Qwen 2.5 72B...

Dashboard with Gradio Python

This video showcases the Sparrow dashboard, where you can view statistics on document data extraction events processed by Sparrow. This elegant dashboard is built with Python using Gradio, a...

Running Vision Models on Apple Silicon with MLX-VLM

I show and explain how to run Qwen and Mistral vision models on Apple Silicon with MLX-VLM. I share technical tips on running both models and show how to pass a query prompt.
