What You Should Know About Table Partition Pruning

Table partitioning is a handy feature supported by several databases, including PostgreSQL, MySQL, Oracle, and YugabyteDB. It is useful when you need to split a large table into smaller independent pieces called partitions. If you’re not familiar with this feature yet, consider the following simple example. Let’s pretend you develop …

Five Tips to Fasten Skewed Joins in Apache Spark

Joins are among the most fundamental transformations in a typical data processing routine. A Join operator makes it possible to correlate, enrich, and filter across two input datasets. The two input datasets are generally classified as a left dataset and a right dataset based on their placement with respect to the Join clause/operator. Fundamentally, …

java – Kafka Consumers under the same group are consuming the same partition after rebalancing

According to the Kafka documentation, Kafka guarantees that a topic-partition is assigned to only one consumer within a group. But I’m observing different behavior in my service. Here are some details: I’m using Kafka 2.8 and spring-kafka 2.2.13. Initially I had one Kafka topic, topic.1, with 5 partitions; this topic was consumed …