Big Data Processing, Apache Spark: Design Patterns

In my previous post, I demonstrated how Spark creates and serializes tasks. In this post, I show how to use this knowledge to build Spark applications that are maintainable and upgradable while avoiding “task not serializable” exceptions. When I participated in a big data project, I needed to program … Read more

Image Processing in Python — Draw Aesthetic Portraits Using Only Nails and a Thread | by Ilias Nahmed | Aug, 2022

Unleash the artist that lives within you. Marilyn Monroe by Georgia Fowler. Right photo by Author. Artists have become more and more creative, so much so that they might have us thinking: “how did they even know they could do that”. One way of drawing portraits that has become pretty famous on Instagram reels and … Read more

Resilient Kafka Consumers With Reactor Kafka

This article presents a recipe for creating resilient Kafka consumers using Reactor Kafka. This approach is one that we’ve developed over time and incorporates the learnings from our experience with running Reactor Kafka, along with all the challenges that come with that. The consumer described in this article provides at-least-once delivery semantics using manual acknowledgments, which … Read more

nullpointerexception – One day all of my programs in Processing written in Java stopped working. How to solve these errors?

I am trying to run programs I coded in Processing. They used to work fine, but one day they all stopped working and started showing the same error messages. The code below is one example. int cellSize = 10; float p0 = 50; float p1 = 25; float p2 = 0; float p3 = 0; … Read more

Retry consuming messages while processing concurrently using reactor kafka

I have a project where I consume messages from an incoming topic, process them, and send them to an outgoing topic. Now consider that the outgoing topic is down because of some infrastructure issue, resulting in an error while sending the message. In this case, I am committing till the last successful message (processed and sent to … Read more

Using Streaming, Pipelining, and Parallelization

Satya Nadella, the CEO of Microsoft, once said, “Every company is now a software company.” Take the online food ordering business, for example: digital ordering and delivery have grown 300% faster than dine-in traffic since 2014. During Covid, online ordering grew 3,868% between February and April in large suburbs in the United States. As a … Read more

Error processing tweet JSON in R function: missing value where TRUE/FALSE needed

I am using a function which takes a raw tweet JSON file as input and outputs the retweet cascades. Here is a part of the function: if (api_version == 2) { parse_tweet <- function(tweet, keep_text = F) { tryCatch({ json_tweet <- jsonlite::fromJSON(tweet) id <- json_tweet$data$id magnitude <-zero_if_null(json_tweet$includes$users$public_metrics$followers_count) user_id <- json_tweet$data$author_id retweet_id <- NA if (keep_text) … Read more

File Processing Using Concurrency With GoLang | by Yair Fernando | May, 2022

Master concurrency in Go. GoLang has incredible support for concurrent programs, and in this article, we’ll see how we can optimize a program that processes a CSV file to send SMS notifications to its users. If you’re new to using GoLang and want to get a better understanding of how concurrency works, I’d recommend reading … Read more

Fast Image Processing in Android With Halide | by Minhaz | May, 2022

I have written about how Halide allows us to write both fast and maintainable code. This one will show its power with Android applications. Photo by Dan Smedley on Unsplash. Halide is an open-source domain-specific language designed to make it easier to write and maintain high-performance image processing or array processing code on modern … Read more