Creating Tombstone Records Using kafka-console-producer.sh: A Quick Guide

A practical guide to creating tombstone records for Kafka compacted topics using the kafka-console-producer.sh command-line tool with the null marker feature

April 22, 2025 · 3 min · 565 words · Joel Hanson

Build Custom Kafka Connectors Fast with This Open-Source Template

Apache Kafka is a powerful distributed event streaming platform, and Kafka Connect makes it easy to integrate Kafka with external systems. While many pre-built connectors exist, real-world applications often need custom connectors tailored for proprietary systems, custom logic, or advanced error handling. That’s where this production-ready template comes in—it removes the boilerplate and gives you everything you need to build, test, and deploy connectors with ease.

April 14, 2025 · 4 min · 737 words · Joel Hanson

Filtering Tombstone Records in Kafka Connect

Kafka Connect provides a flexible way to process streaming data using Single Message Transforms (SMTs). If you need to filter out tombstone records (records with null values), you should use the generic Filter transformation along with the RecordIsTombstone predicate. Here’s the correct configuration:
    # Define the predicate to detect tombstone records (i.e., records with null values)
    predicates=dropTombstone
    predicates.dropTombstone.type=org.apache.kafka.connect.transforms.predicates.RecordIsTombstone
    # Configure the Filter transformation to drop records that match the predicate
    transforms=dropTombstone
    transforms.dropTombstone.type=org.apache.kafka.connect.transforms.Filter
    transforms.dropTombstone.predicate=dropTombstone
Explanation
What is a Predicate?
A predicate in Kafka Connect is a condition that evaluates whether a given record meets certain criteria. It returns either true or false. If true, the transformation (such as filtering) is applied. In this case, the predicate named dropTombstone uses the built-in class RecordIsTombstone, which evaluates to true when a record’s value is null. ...
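The predicate-plus-filter behaviour described in this excerpt can be modelled in a few lines of plain Python. This is only an illustrative sketch, not Kafka Connect's API: `Record`, `record_is_tombstone`, and `apply_filter` are hypothetical names chosen to mirror the roles of a Connect record, the RecordIsTombstone predicate, and the Filter transformation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Record:
    """Hypothetical stand-in for a Connect record; not a Kafka Connect class."""
    key: str
    value: Optional[str]


def record_is_tombstone(record: Record) -> bool:
    # Mirrors the role of RecordIsTombstone: true only when the value is null.
    return record.value is None


def apply_filter(records: Iterable[Record],
                 predicate: Callable[[Record], bool]) -> List[Record]:
    # Mirrors the role of Filter: drop every record the predicate matches.
    return [r for r in records if not predicate(r)]


records = [
    Record("user-1", "active"),
    Record("user-2", None),       # tombstone: the value is null
    Record("user-3", "inactive"),
]

kept = apply_filter(records, record_is_tombstone)
print([r.key for r in kept])  # ['user-1', 'user-3']
```

Note that in this model an empty-string value is not a tombstone; only a value that is actually null (`None` here) matches the predicate.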

March 11, 2025 · 2 min · 295 words · Joel Hanson

Server-Sent Events: Build Real-Time Web Apps with Minimal Code

In this technical deep-dive, we unravel the power of Server-Sent Events (SSE), a game-changing web technology that simplifies real-time communication. Developers often struggle with complex, resource-intensive methods for creating live updates, but SSE offers an elegant solution that’s both lightweight and powerful.

December 7, 2024 · 2 min · 337 words · Joel Hanson

Configuring Kafka Connector with Docker: A Step-by-Step Guide

In this tutorial, we’ll walk through the process of setting up a Kafka connector using Docker and docker-compose.yml. We’ll focus on configuring a file connector, which is useful for reading data from files and writing data to files using Kafka.
Link to repository
Prerequisites: Docker and Docker Compose installed on your system; basic understanding of Kafka and Docker concepts.
Project Structure
Before we begin, let’s look at the files we’ll be working with: ...

October 12, 2024 · 5 min · 1041 words · Joel Hanson

Setting Up a Kafka Cluster Without Zookeeper Using Docker

Introduction
In this post, we will walk through setting up an Apache Kafka cluster without Zookeeper, using Kafka’s KRaft mode for metadata management. Kafka no longer requires a Zookeeper instance, which simplifies the cluster setup. We will use Docker to deploy this Kafka cluster with the latest version of Kafka.
Prerequisites
Ensure that you have the following: Docker installed; Docker Compose installed; basic understanding of Kafka concepts (e.g., brokers, partitions).
Step 1: Kafka Cluster Setup
Let’s use the following Docker Compose configuration to set up the Kafka broker in KRaft mode: ...

October 5, 2024 · 4 min · 685 words · Joel Hanson

Online Machine Learning: Real-time Image Classification with CIFAR10 Dataset and Kafka

This blog post explores online machine learning through a practical example using the CIFAR10 dataset and Apache Kafka.

October 2, 2024 · 7 min · 1360 words · Joel Hanson

Supercharging Your AI: Setting Up RAG with PostgreSQL and pgvector

Learn how to implement a powerful Retrieval-Augmented Generation (RAG) system using PostgreSQL and pgvector. This comprehensive guide covers everything from setting up a custom PostgreSQL Docker image to creating a fully functional RAG query system with vector embeddings and language model inference.

September 30, 2024 · 4 min · 761 words · Joel Hanson

Git Aliases to Supercharge Your Workflow

Git aliases are a powerful workflow tool that create shortcuts to frequently used Git commands.

March 21, 2024 · 12 min · 2384 words · Joel Hanson

Mastering the Find Command: Unleashing Unix File Management Power

Discover the unparalleled capabilities of the ‘find’ command in Unix systems. From basic file searches to advanced manipulations, mastering ‘find’ empowers users to efficiently navigate, search, and manage files. With practical examples and powerful techniques, this blog unveils the full potential of ‘find’, transforming your Unix experience into one of seamless productivity and control. Unlock the power of Unix file management with the mastery of the ‘find’ command.

February 28, 2024 · 6 min · 1145 words · Joel Hanson