Creating Tombstone Records Using kafka-console-producer.sh: A Quick Guide
A practical guide to creating tombstone records for Kafka compacted topics using the kafka-console-producer.sh command-line tool with the null marker feature
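As a quick illustration of the idea (not taken from the post itself — the topic name and marker string are hypothetical, and `null.marker` requires Kafka 3.2+, added by KIP-810), a console-producer invocation for sending tombstones might look like:

```shell
# Sketch: producing a tombstone (null-valued record) from the console producer.
# Assumes a broker at localhost:9092 and a compacted topic named "users".
kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic users \
  --property parse.key=true \
  --property key.separator=: \
  --property null.marker=NULL
# Typing "user-42:NULL" now sends key "user-42" with a null value (a tombstone),
# which log compaction will eventually use to delete earlier records for that key.
```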
Build Custom Kafka Connectors Fast with This Open-Source Template
Apache Kafka is a powerful distributed event streaming platform, and Kafka Connect makes it easy to integrate Kafka with external systems. While many pre-built connectors exist, real-world applications often need custom connectors tailored for proprietary systems, custom logic, or advanced error handling. That’s where this production-ready template comes in—it removes the boilerplate and gives you everything you need to build, test, and deploy connectors with ease.
Filtering Tombstone Records in Kafka Connect
Kafka Connect provides a flexible way to process streaming data using Single Message Transforms (SMTs). If you need to filter out tombstone records (records with null values), you should use the generic Filter transformation along with the RecordIsTombstone predicate. Here’s the correct configuration:

```properties
# Define the predicate to detect tombstone records (i.e., records with null values)
predicates=dropTombstone
predicates.dropTombstone.type=org.apache.kafka.connect.transforms.predicates.RecordIsTombstone

# Configure the Filter transformation to drop records that match the predicate
transforms=dropTombstone
transforms.dropTombstone.type=org.apache.kafka.connect.transforms.Filter
transforms.dropTombstone.predicate=dropTombstone
```

Explanation

What is a Predicate? A predicate in Kafka Connect is a condition that evaluates whether a given record meets certain criteria. It returns either true or false. If true, the transformation (such as filtering) is applied. In this case, the predicate named dropTombstone uses the built-in class RecordIsTombstone, which evaluates to true when a record’s value is null. ...
Server-Sent Events: Build Real-Time Web Apps with Minimal Code
In this technical deep-dive, we unravel the power of Server-Sent Events (SSE), a game-changing web technology that simplifies real-time communication. Developers often struggle with complex, resource-intensive methods for creating live updates, but SSE offers an elegant solution that’s both lightweight and powerful.
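As a quick taste of how lightweight SSE is on the client side (the endpoint URL below is a hypothetical example, not from the post), you can watch a stream with nothing but curl:

```shell
# curl can consume a Server-Sent Events stream directly.
# -N disables output buffering so events appear as they arrive.
curl -N -H "Accept: text/event-stream" https://example.com/events
```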
Configuring Kafka Connector with Docker: A Step-by-Step Guide
In this tutorial, we’ll walk through the process of setting up a Kafka connector using Docker and docker-compose.yml. We’ll focus on configuring a file connector, which is useful for reading data from files and writing data to files using Kafka.

Link to repository

Prerequisites

- Docker and Docker Compose installed on your system
- Basic understanding of Kafka and Docker concepts

Project Structure

Before we begin, let’s look at the files we’ll be working with: ...
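For orientation (this sketch is not taken from the tutorial — the connector name, file path, and topic are hypothetical), a standalone configuration for Kafka’s built-in file source connector typically looks like:

```properties
# Hypothetical standalone config for the built-in file source connector
name=file-source-demo
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/tmp/input.txt
topic=file-topic
```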
Setting Up a Kafka Cluster Without Zookeeper Using Docker
Introduction

In this post, we will walk through the process of setting up an Apache Kafka cluster without using Zookeeper, leveraging Kafka’s KRaft mode for metadata management. Kafka no longer requires a Zookeeper instance, which simplifies cluster setup. We will use Docker to deploy this cluster with the latest version of Kafka.

Prerequisites

Ensure that you have the following:

- Docker installed
- Docker Compose installed
- Basic understanding of Kafka concepts (e.g., brokers, partitions)

Step 1: Kafka Cluster Setup

Let’s use the following Docker Compose configuration to set up the Kafka broker in KRaft mode: ...
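Before reaching for Compose, note that the simplest possible KRaft setup is a one-liner (a sketch under the assumption that you use the official apache/kafka image, which starts a combined broker/controller in KRaft mode by default; container name and port mapping are illustrative):

```shell
# Single-node Kafka in KRaft mode, no Zookeeper required
docker run -d --name kafka -p 9092:9092 apache/kafka:latest
```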
Online Machine Learning: Real-time Image Classification with CIFAR10 Dataset and Kafka
This blog post explores online machine learning through a practical example using the CIFAR10 dataset and Apache Kafka.
Supercharging Your AI: Setting Up RAG with PostgreSQL and pgvector
Learn how to implement a powerful Retrieval-Augmented Generation (RAG) system using PostgreSQL and pgvector. This comprehensive guide covers everything from setting up a custom PostgreSQL Docker image to creating a fully functional RAG query system with vector embeddings and language model inference.
Git Aliases to Supercharge Your Workflow
Git aliases are a powerful workflow tool for creating shortcuts to frequently used Git commands.
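To make this concrete (the alias names below are conventional examples, not necessarily the ones covered in the post; writing to a throwaway config file instead of `--global` keeps the demo side-effect free):

```shell
# Define two classic aliases in a scratch config file
git config --file /tmp/demo-gitconfig alias.st "status --short"
git config --file /tmp/demo-gitconfig alias.lg "log --oneline --graph --decorate"

# Read one back to confirm it was stored
git config --file /tmp/demo-gitconfig --get alias.st
```

With the same settings written to your real `~/.gitconfig` (via `--global`), `git st` becomes shorthand for `git status --short`.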
Mastering the Find Command: Unleashing Unix File Management Power
Discover the capabilities of the ‘find’ command in Unix systems. From basic file searches to advanced manipulations, mastering ‘find’ empowers users to efficiently navigate, search, and manage files. With practical examples and powerful techniques, this blog unveils the full potential of ‘find’ for everyday file management.
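A small sketch of the basics (the directory and file names are hypothetical examples, not taken from the post):

```shell
# Set up a tiny directory tree to search
mkdir -p /tmp/find-demo/sub
touch /tmp/find-demo/a.txt /tmp/find-demo/sub/b.txt /tmp/find-demo/sub/c.log

# Find by name pattern (quote the glob so the shell doesn't expand it first)
find /tmp/find-demo -name '*.txt'

# Restrict matches to regular files and run a command on each one
find /tmp/find-demo -type f -name '*.log' -exec wc -l {} \;
```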