Streaming ETL with Apache Kafka in the Healthcare Industry

IT modernization and innovative new technologies change the healthcare industry significantly. This blog series explores how data streaming with Apache Kafka enables real-time data processing and business process automation. Real-world examples show how traditional enterprises and startups increase efficiency, reduce cost, and improve the human experience across the healthcare value chain, including pharma, insurance, providers, …

Five Tips to Fasten Skewed Joins in Apache Spark

Joins are one of the most fundamental transformations in a typical data processing routine. A Join operator makes it possible to correlate, enrich and filter across two input datasets. The two input datasets are generally classified as a left dataset and a right dataset based on their placement with respect to the Join clause/operator. Fundamentally, …
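To make the left/right terminology concrete, here is a minimal plain-Python sketch of an inner join on a shared key. It is not Spark code and not taken from the article: the `hash_join` helper, the `orders` and `users` records, and the `user_id` key are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of dicts on `key` (hypothetical helper, not a Spark API)."""
    # Build phase: index the right dataset by its join key.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    # Probe phase: emit one merged row per (left row, matching right row) pair.
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Hypothetical sample data: `orders` is the left dataset, `users` the right one.
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 2, "item": "pen"},
    {"user_id": 3, "item": "lamp"},
]
users = [
    {"user_id": 1, "name": "Ana"},
    {"user_id": 2, "name": "Bo"},
]

result = hash_join(orders, users, "user_id")
```

In this sketch the orders for users 1 and 2 are enriched with a `name`, while the order for user 3 is filtered out because the right dataset has no matching key.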

python – Getting errors while starting mod_wsgi (Django app) via Apache

Starting up a dockerised Django application as a mod_wsgi app via Apache and getting an endless stream of the errors below.
Errors:
2022-06-02 16:05:12.225137 [notice] [pid 10002] mpm_unix.c(436): [client AH00052: child pid 10992 exit signal Aborted (6)
2022-06-02 16:05:12.225382 [info] [pid 10002] src/server/mod_wsgi.c(7761): [client mod_wsgi (pid=10992): Process ‘some.someapp.com’ has died, deregister and restart it.
2022-06-02 16:05:12.225411 …

java – Apache POI upgrade to 5.0 error for Word document

We are upgrading the Apache POI library from 3.17 to 5.0. When we update the poi 5.2 dependency in pom.xml, we get the error below while creating a document from a template.
Code to create the document:
String relativeUrl = "/report-templates/DocTemplate.docx";
XWPFDocument wordDoc = new XWPFDocument(getClass().getClassLoader().getResourceAsStream(relativeUrl));
wordDoc.enforceUpdateFields();
Error message:
"Handler dispatch failed; nested exception is java.lang.NoSuchMethodError: 'org.apache.xmlbeans.XmlObject[] org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTFootnotesImpl.getXmlObjectArray(javax.xml.namespace.QName, org.apache.xmlbeans.XmlObject[])'"
Dependency …

Apache Kafka stops deleting topic as per retention policy

We run Apache Kafka with several topics and, because of the amount of data, we keep the data for only 4 hours:
log.retention.hours=4
log.retention.check.interval.ms=300000
This works as expected and, thanks to the load, the data does not stay much longer than 4 hours, so that is fine. But now, for the second time …
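As a rough illustration of how these two settings interact, the following Python sketch models time-based retention in a deliberately simplified way. It assumes that a segment becomes eligible for deletion once its newest record is older than the retention window, and that eligibility is only evaluated when a periodic check runs. The `expired_segments` helper, the segment names, and the timestamps are hypothetical, and the broker's real behaviour involves more conditions than this model covers.

```python
RETENTION_MS = 4 * 60 * 60 * 1000   # log.retention.hours=4
CHECK_INTERVAL_MS = 300_000         # log.retention.check.interval.ms=300000


def expired_segments(segments, now_ms, retention_ms=RETENTION_MS):
    """Return names of segments whose newest record is older than the retention window.

    `segments` maps a (hypothetical) segment name to the timestamp in ms
    of the newest record it contains. This is a simplified model only.
    """
    return [
        name
        for name, newest_ts_ms in segments.items()
        if now_ms - newest_ts_ms > retention_ms
    ]


HOUR_MS = 60 * 60 * 1000
now_ms = 10 * HOUR_MS  # arbitrary "current time" for the example

# Hypothetical segments: one last written 5 hours ago, one 3 hours ago.
segments = {
    "segment-a": now_ms - 5 * HOUR_MS,
    "segment-b": now_ms - 3 * HOUR_MS,
}

expired = expired_segments(segments, now_ms)
```

Under this model, a segment that crosses the 4-hour mark right after a check has run would wait up to one more check interval (5 minutes here) before being picked up, which is consistent with data lingering slightly past the configured retention.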

apache zookeeper – Kafka Authentication with SASL_PLAINTEXT fails

Here’s the log:
kafka 16:54:47.56
kafka 16:54:47.57 Welcome to the Bitnami kafka container
kafka 16:54:47.57 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-kafka
kafka 16:54:47.57 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-kafka/issues
kafka 16:54:47.57
kafka 16:54:47.57 INFO ==> ** Starting Kafka setup **
kafka 16:54:47.62 DEBUG ==> Validating settings in KAFKA_* env vars…
kafka 16:54:47.64 WARN …

Apache Kafka Essentials – DZone Refcardz

Apache Kafka runs in distributed clusters, with each cluster node being referred to as a Broker. Kafka Connect integrates Kafka instances on Brokers with producers and consumers (clients that produce and consume event data, respectively). All these components rely on the publish-subscribe durable messaging ecosystem to enable instant exchange of event data between servers, …