Scaling Kafka Ingestion Using REST APIs in Rust | by Shanmukh Sista | Jun, 2022

High throughput data ingestion system

This article is all about working with Kafka and Rust. The idea is to build a data ingestion service that accepts semi-structured JSON payloads and publishes them to Kafka. More specifically, I want to push toward the practical maximum throughput for an ingestion service that uses Kafka as its buffer, and tune it while measuring performance.

I’ll show how to write a REST API in Rust and run a simple HTTP benchmark for performance tuning. I’ll also show the different producer types in rdkafka that can be used to get good durability, or extremely high throughput for advanced use cases. Everything is implemented in Rust, and we’ll look at the numbers before and after Kafka tuning. Spoiler alert: I achieved around 96K events/second of publishing throughput on a single 8-core, 32 GB server. I’ll share the details in this article.

All the code is available within my repository.

Let’s dive in !

This application’s setup is simple. It accepts a payload of JSON data (events) via a REST API (a POST call) and publishes those events to a Kafka topic. Publishing to Kafka is the interesting part, and we’ll cover it for two different scenarios.

  1. Publish to get some level of durability. In this case we ensure that the message is delivered to Kafka with some guarantee.
  2. Publish in a way that optimizes for higher performance. In this case, we’ll look at using a producer that buffers some events before publishing, and optimize for the best performance without blocking for any response.

Since performance is critical, we’ll then write a benchmark test and measure it using a simple HTTP testing tool (plow).

As a side note, another goal of using a Kafka producer while writing APIs in Rust is to highlight how to pass state or services into handlers. This comes up constantly in real-world applications, whether the state is a database connection pool or a custom resource manager. It’s important to understand because Rust’s programming paradigms are quite different from those of application languages like Java and C#.

For this demo, we’ll use the Actix Web framework to write the API and publish events. Let’s get started.

Let’s start by creating a new Application.

cargo new rust-rdkafka-demo

Add the following dependencies in your Cargo.toml.

We add serde dependency to parse request JSON and deserialize it to Rust Structures before publishing the data.
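The original dependency list isn’t embedded here; a plausible Cargo.toml sketch for this setup might look like the following (the crate versions, and the use of uuid for generating event ids, are assumptions):

```toml
[dependencies]
actix-web = "4"
rdkafka = "0.28"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
uuid = { version = "1", features = ["v4"] }
```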

Another prerequisite for this application is to install Kafka on your local machine, or to have remote broker URLs handy. We’ll connect to this Kafka cluster from within our application.

In the first case, we’ll work with a Kafka Producer that ensures messages are sent to Kafka before returning a response. This means that there will be latency and performance hits because we are waiting on some network operation, and for Kafka to acknowledge the message.

Initialize the Producer

We initialize a producer by setting some default config variables and return the ownership of Producer within this function.

This producer is a FutureProducer instance, which can be shared across threads. Since cloning it is cheap, we can publish an event for every request and still scale to high throughput. The producer returns a future for every event published, so we must await that future to publish the event and get the result from Kafka.

In this case, we’re creating the producer with a minimum required configuration.

Define payload Structure

Now, let’s define our payload as a Rust struct. For the purposes of this demo, a payload is an event object with a name and a data section that can hold arbitrary key-value pairs sent to the server. This is simply semi-structured JSON data.

Make sure that you add the Derive attributes to serialize and deserialize payloads to and from JSON.

Now that we have our basic application setup, there is one last piece that we need to write for handling requests.

Define API Handlers

The API handlers for executing our API logic look like this:

collect_events is called every time a POST request hits the root (/) endpoint. Within this method, we also receive our Kafka producer as a data dependency, which we can use to publish the event directly and return the response.

Note that the Kafka producer is a FutureProducer. Unless we await the send future, no messages will be published to Kafka. For the initial version, let’s await it and measure some request latencies. We’ll look at an alternate approach soon.

Within this function, we generate a new event_id for this event. And send this event back to Kafka.

Also note that we deserialize and then serialize again without applying any logic in between. In real-world applications, some enrichment may happen here as well, hence the step. A better option would be to use Protocol Buffers instead of JSON serialization/deserialization, to reduce serialization costs and standardize payloads.

Once events are published, we return a text response with the event_id.

The Main Function

The last step to finish our application is to integrate everything and update our main function. Refer to the code below to check it out.

Within main, we first initialize our Kafka producer. This producer is shared across all worker threads created by actix-web. The main function takes ownership of the producer and is responsible for its lifetime.

Then, we use Actix Web’s server builder to define the server and its route. A key thing to note here is that this only defines the layout and initialization; we don’t actually use the Kafka producer until a request arrives from a client. The closure passed to HttpServer::new is just an application factory, invoked once per worker thread to build that worker’s App instance.

While handling the route, our handler takes ownership of the payload and also receives the shared kafka_producer reference. This is all it takes to run our application.

Running and Testing the Application

To run the application, execute the command below within the root of the project.

cargo run

Fire a POST request to test publishing of events to our service.
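The original request isn’t shown; assuming the payload structure above and the default bind address, a test request might look like this (host, port, and field names are placeholders):

```shell
# Hypothetical example; adjust host, port, and fields to your setup.
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"event_name": "page_view", "data": {"page": "/home", "user_id": "42"}}'
```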

If everything works, you should see a new event id in the response as plain text.


The approach shown above awaits the delivery future after publishing each event to Kafka. Even though it’s an async operation, there is still some waiting, no matter how good the network is.

There’s another approach we can use to optimize further: a non-blocking producer that doesn’t have to wait on futures.

We’ll leverage the lower-level ThreadedProducer to send events. Internally, this class has its own dedicated thread to buffer and publish events. Creating a producer with this class should give us significantly higher throughput without much change in code. Below are the code changes.

Initialize Producer

API Handler

We can see here that we are not waiting for the Kafka send operation to complete; we just enqueue the message and return a response immediately. Internal synchronization guarantees thread safety, so the producer can be shared without any side effects.

Main Function

Now that we have both approaches, let’s look at some performance numbers.

To measure performance, we’ll use a set of machines within GCP. It’s a simple HTTP test run from a separate remote machine, with a brand-new VM hosting our application on a single instance. We’ll compare both producers and look at the distribution of latencies and throughput.


API server machine: 8-core, 32 GB RAM GCP instance

Kafka: a general Kafka cluster on the same local network

Load test: run from a separate machine within the same network

Futures Producer (Await Operation)

For both producers, we’ll use plow to fire 50 concurrent requests for a duration of one minute. Below is a screenshot of the results.

For this setup, I noticed an initial delay before events were ingested at high throughput: the rate scaled from about 4,000/sec to 45K/sec within a few seconds, with latencies around 6 ms initially. That’s not bad at all, considering we publish every record to Kafka and wait for a response. Let’s look at the other approach now.

Now let’s look at an approach that drastically improves the performance of our service. It may not be the most rigorous approach (I’m still studying the underlying library to find the best one), but it does improve performance significantly.

Note that our goal with this approach was high throughput for our ingestion service, so we used a producer that doesn’t require waiting on a future.

The service reached over 95K events/sec ingested into Kafka, and the latencies show little variance as well. I’m really impressed by this performance, even though this was just a basic setup.

In summary, this was quite a simple implementation, and I’m impressed by the numbers. Next, I’m going to perform some expensive operations over the data payloads and see how these numbers are impacted.

But personally, if I were designing a high-throughput analytics data collection service, I would consider Rust among the candidates, not only for its performance but for its energy efficiency as well 🙂
