Channel: Andrej Baranovskij Blog

Skipper MLOps Debugging and Development on Your Local Machine

I explain how to stop some of the Skipper MLOps services running in Docker and debug/develop these services' code locally. This improves the development workflow. There is no need to deploy a code change to...
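
For reference, a minimal sketch of the first step using the Docker SDK for Python; the container name "skipper-ml-api" is a hypothetical placeholder, not Skipper's actual service name:

```python
import docker

# Stop the containerized copy of one service so a locally launched
# copy can take its place for debugging.
client = docker.from_env()
container = client.containers.get("skipper-ml-api")  # hypothetical name
container.stop()

# From here, run the same service from source on the host (e.g. under a
# debugger), pointed at the message broker the remaining services use.
```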

Invoice Data Processing with Mistral LLM on Local CPU

I explain a solution to extract invoice document fields with the open-source Mistral LLM. It runs on CPU and doesn't require a cloud machine. I'm using the Mistral 7B LLM model, LangChain, CTransformers and...
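
A minimal sketch of this kind of setup, assuming LangChain's CTransformers wrapper and a quantized GGUF build of Mistral 7B; the model repo and file names are illustrative:

```python
from langchain.llms import CTransformers

# Load a quantized Mistral 7B model for CPU inference.
llm = CTransformers(
    model="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",     # assumed HF repo
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # assumed file
    model_type="mistral",
    config={"max_new_tokens": 256, "temperature": 0.1},
)

print(llm("Extract the invoice number from: Invoice No. INV-2024-001, total 1250.00"))
```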

Invoice Data Processing with Llama2 13B LLM RAG on Local CPU [Weaviate,...

I explain how to set up local LLM RAG to process invoice data with Llama2 13B. Based on my experiments, Llama2 13B works better with tabular data than the Mistral 7B model. This example presents a...
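
A minimal RAG sketch along these lines, assuming LangChain's LlamaCpp and Weaviate wrappers, a running Weaviate instance whose class has a vectorizer module enabled, and a local GGUF build of Llama2 13B; names and paths are illustrative:

```python
import weaviate
from langchain.chains import RetrievalQA
from langchain.llms import LlamaCpp
from langchain.vectorstores import Weaviate

client = weaviate.Client("http://localhost:8080")
# Assumed class/property names; near-text search relies on the class's vectorizer.
vectorstore = Weaviate(client, index_name="Invoices", text_key="content")

llm = LlamaCpp(model_path="llama-2-13b-chat.Q4_K_M.gguf")  # assumed local file

qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.run("What is the total amount on invoice INV-2024-001?"))
```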

Structured JSON Output from LLM RAG on Local CPU [Weaviate, Llama.cpp, Haystack]

I explain how to get structured JSON output from LLM RAG running through the Haystack API on top of Llama.cpp. Vector embeddings are stored in a Weaviate database, the same as in my previous video. When...
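
The Haystack wiring is not reproduced here; this sketch shows the underlying idea with llama-cpp-python instead: instruct the model to reply with JSON only, then parse the reply. Model file and field names are illustrative:

```python
import json
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b-instruct-v0.1.Q4_K_M.gguf")  # assumed file

prompt = (
    "Answer with a single JSON object only, no extra text.\n"
    "Keys: invoice_number (string), total (number).\n"
    "Context: Invoice No. INV-2024-001, total due 1250.00 EUR."
)
reply = llm(prompt, max_tokens=128)["choices"][0]["text"]
data = json.loads(reply)  # raises ValueError if the model added extra text
print(data)
```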

JSON Output from Mistral 7B LLM [LangChain, Ctransformers]

I explain how to compose a prompt for the Mistral 7B LLM model running with LangChain and CTransformers to retrieve the output as a JSON string, without any additional text.
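
A minimal prompt-composition sketch with LangChain's PromptTemplate, using Mistral's [INST] instruction format; the field names are illustrative, and the resulting prompt would go to a CTransformers model like the one shown earlier:

```python
from langchain.prompts import PromptTemplate

template = PromptTemplate(
    input_variables=["context"],
    template=(
        "[INST] Return only a valid JSON object with keys "
        "'invoice_number' and 'total'. No explanations, no extra text.\n"
        "Context: {context} [/INST]"
    ),
)

prompt = template.format(context="Invoice No. INV-2024-001, total 1250.00")
print(prompt)  # pass this to the LLM and json.loads() the reply
```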

Vector Database Impact on RAG Efficiency: A Simple Overview

I explain the importance of a Vector DB for RAG implementation. I show, with a simple example, how data retrieval from a Vector DB could affect LLM performance. Before data is sent to the LLM, you should verify...
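
A minimal sketch of this kind of check: score the retrieved chunks against the query with Sentence Transformers and drop weak matches before they reach the LLM. The 0.3 threshold is an arbitrary example value:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
query = "What is the invoice total?"
chunks = [
    "Invoice total due: 1250.00 EUR",
    "Our office is closed on public holidays.",
]

q_emb = model.encode(query, convert_to_tensor=True)
c_embs = model.encode(chunks, convert_to_tensor=True)
scores = util.cos_sim(q_emb, c_embs)[0]

# Keep only chunks that are plausibly relevant to the question.
relevant = [c for c, s in zip(chunks, scores) if float(s) > 0.3]
print(relevant)
```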

Easy-to-Follow RAG Pipeline Tutorial: Invoice Processing with ChromaDB &...

I explain the implementation of a pipeline to process invoice data from PDF documents. The data is loaded into Chroma DB's vector store. Through the LangChain API, the data from the vector store is ready...
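
A minimal sketch of such a pipeline, assuming LangChain's PyPDFLoader and Chroma wrappers with a local Sentence Transformers embedding model; the file name is illustrative:

```python
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# Load and chunk the PDF, embed locally, and store vectors in Chroma.
docs = PyPDFLoader("invoice.pdf").load_and_split()  # assumed local file
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(docs, embeddings)

retriever = vectorstore.as_retriever()
print(retriever.get_relevant_documents("invoice number"))
```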

Secure and Private: On-Premise Invoice Processing with LangChain and Ollama RAG

The Ollama desktop tool helps run LLMs locally on your machine. This tutorial explains how I implemented a pipeline with LangChain and Ollama for on-premise invoice processing. Running LLMs on-premise...
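
A minimal sketch, assuming a locally running Ollama server with a model already pulled (for example via `ollama pull llama2`):

```python
from langchain.llms import Ollama

# LangChain talks to the local Ollama server; no data leaves the machine.
llm = Ollama(model="llama2", temperature=0)
print(llm("Extract the invoice number from: Invoice No. INV-2024-001"))
```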

Enhancing RAG: LlamaIndex and Ollama for On-Premise Data Extraction

LlamaIndex is an excellent choice for RAG implementation. It provides a convenient API to work with different data sources and extract data. LlamaIndex also provides an API for Ollama integration. This means we...
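
A minimal sketch, assuming the llama-index 0.9.x API and a running Ollama server; the data directory is illustrative:

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import Ollama

# Route both generation (Ollama) and embeddings (local model) on-premise.
llm = Ollama(model="llama2")
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("data").load_data()  # assumed folder of invoices
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

print(index.as_query_engine().query("What is the invoice total?"))
```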

From Text to Vectors: Leveraging Weaviate for Local RAG Implementation with...

Weaviate provides vector storage and plays an important part in RAG implementation. I'm using local embeddings from the Sentence Transformers library to create vectors for text-based PDF invoices and...
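
A minimal sketch, assuming the Weaviate Python client v3 and a locally running Weaviate instance; class and property names are illustrative:

```python
import weaviate
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
client = weaviate.Client("http://localhost:8080")

text = "Invoice No. INV-2024-001, total due 1250.00 EUR"
vector = model.encode(text)  # embedding computed locally, not by Weaviate

client.data_object.create(
    data_object={"content": text},
    class_name="Invoice",    # assumed class
    vector=vector.tolist(),  # store the locally computed embedding
)
```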

Transforming Invoice Data into JSON: Local LLM with LlamaIndex & Pydantic

This is Sparrow, our open-source solution for document processing with local LLMs. I'm running the local Starling LLM with Ollama. I explain how to get structured JSON output with LlamaIndex and dynamic...
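
A minimal sketch of the dynamic Pydantic idea (v1-style naming): build the output class at runtime with pydantic.create_model, so the extraction schema can come from configuration instead of hard-coded classes. Field names are illustrative:

```python
from pydantic import create_model

# Schema defined as data; in practice this could come from a config file.
fields = {"invoice_number": (str, ...), "total": (float, ...)}
Invoice = create_model("Invoice", **fields)

# The generated JSON schema can be embedded in the LLM prompt, and the raw
# LLM reply validated with Invoice.parse_raw(...).
print(Invoice.schema_json())
```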

FastAPI and LlamaIndex RAG: Creating Efficient APIs

FastAPI works great with LlamaIndex RAG. In this video, I show how to build a POST endpoint to execute inference requests for LlamaIndex. RAG implementation is done as part of Sparrow data extraction...
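
A minimal sketch of such an endpoint; `run_inference` is a hypothetical stand-in for the LlamaIndex query call:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class InferenceRequest(BaseModel):
    question: str

def run_inference(question: str) -> str:
    # hypothetical: would wrap index.as_query_engine().query(question)
    return f"answer to: {question}"

@app.post("/inference")
def inference(req: InferenceRequest):
    return {"answer": run_inference(req.question)}
```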

JSON Output with Notus Local LLM [LlamaIndex, Ollama, Weaviate]

In this video, I show how to get JSON output from the Notus LLM running locally with Ollama. JSON output is generated with LlamaIndex using the dynamic Pydantic class approach.

LLM Structured Output with Local Haystack RAG and Ollama

Haystack 2.0 provides functionality to process LLM output and ensure proper JSON structure, based on a predefined Pydantic class. I show how you can run this on your local machine, with Ollama. This is...
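
The Haystack 2.0 components are not reproduced here; this generic sketch shows the validate-and-retry idea against Ollama's REST API, with the output checked against a predefined Pydantic class (v1-style naming, illustrative model and field names):

```python
import requests
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):  # the predefined output structure
    invoice_number: str
    total: float

def generate(prompt: str) -> str:
    # Ollama's REST API; stream=False returns one JSON object with "response".
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2", "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]

prompt = ("Return only JSON with keys invoice_number and total for: "
          "Invoice No. INV-2024-001, total 1250.00")

invoice = None
for _ in range(3):  # retry until the reply validates against the class
    try:
        invoice = Invoice.parse_raw(generate(prompt))
        break
    except ValidationError:
        continue
print(invoice)
```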

Local LLM RAG Pipelines with Sparrow Plugins [Python Interface]

There are many tools and frameworks around LLMs, evolving and improving daily. I added plugin support in Sparrow to run different pipelines through the same Sparrow interface. Each pipeline can be...
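
Sparrow's actual plugin interface is not reproduced here; this generic sketch only illustrates the pattern: pipelines register under a name and are dispatched through a single entry point:

```python
from typing import Callable, Dict

PIPELINES: Dict[str, Callable[[str], str]] = {}

def register(name: str):
    """Decorator that records a pipeline under a lookup name."""
    def wrap(fn: Callable[[str], str]):
        PIPELINES[name] = fn
        return fn
    return wrap

@register("llamaindex")
def llamaindex_pipeline(query: str) -> str:
    return f"llamaindex answer for: {query}"  # placeholder body

def run(pipeline: str, query: str) -> str:
    # One interface, many interchangeable pipelines.
    return PIPELINES[pipeline](query)

print(run("llamaindex", "invoice total?"))
```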

Extracting Invoice Structured Output with Haystack and Ollama Local LLM

I implemented a Sparrow agent with Haystack structured output functionality to extract invoice data. This runs locally through Ollama, using an LLM to retrieve key/value pair data.

LLM Agents with Sparrow

I explain new functionality in Sparrow: LLM agent support. This means you can implement independently running agents and invoke them from the CLI or API. This makes it easier to run various LLM-related...

LlamaIndex Multimodal with Ollama [Local LLM]

I describe how to run LlamaIndex Multimodal with the local LLaVA LLM through Ollama. The advantage of this approach is that you can process image documents with the LLM directly, without running them through OCR; this should...
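
The LlamaIndex multimodal wiring is not reproduced here; this generic sketch sends a base64-encoded image to a LLaVA model through Ollama's REST API, which accepts an "images" list. The file name is illustrative:

```python
import base64
import requests

with open("receipt.jpg", "rb") as f:  # assumed local image
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "List the line items on this receipt.",
        "images": [image_b64],  # image goes straight to the model, no OCR
        "stream": False,
    },
)
print(resp.json()["response"])
```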

Optimizing Receipt Processing with LlamaIndex and PaddleOCR

The LlamaIndex text completion function allows you to execute an LLM request combining custom data and the question, without using a Vector DB. This is very useful when processing output from OCR; it simplifies the...
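
A minimal sketch, assuming PaddleOCR for text extraction: the recognized text and the question are combined into a single completion prompt, with no vector store involved. The image path is illustrative:

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="en")
result = ocr.ocr("receipt.jpg", cls=True)  # assumed local image

# Each line is [box, (text, confidence)]; keep the recognized text only.
text = "\n".join(line[1][0] for line in result[0])

prompt = f"Using only this receipt text:\n{text}\n\nQuestion: what is the total?"
print(prompt)  # feed this to a completion call, e.g. llm.complete(prompt)
```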

FastAPI File Upload and Temporary Directory for Stateless API

I explain how to handle file upload with FastAPI and how to process the file using a Python temporary directory. Files placed into a temporary directory are automatically removed once the request completes,...
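
A minimal sketch of the pattern: the uploaded bytes land in a temporary directory that is deleted when the `with` block exits, so the API stays stateless:

```python
import os
import tempfile

from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, file.filename)
        with open(path, "wb") as out:
            out.write(await file.read())
        # Process the file here; it is removed once this block exits.
        return {"filename": file.filename, "size": os.path.getsize(path)}
```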
