Five Tips to Fasten Skewed Joins in Apache Spark

Joins are one of the most fundamental transformations in a typical data processing routine. A Join operator makes it possible to correlate, enrich and filter across two input datasets. The two input datasets are generally classified as a left dataset and a right dataset based on their placement with respect to the Join clause/operator. Fundamentally, …

Gitlab-CI with Docker executor /usr/bin/bash: line 90: git: command not found

I have a local GitLab server and a GitLab CI runner with the Docker executor. I want to use GitLab CI to build my Maven project (as the first stage). Since I use buildnumber-maven-plugin, I added a git service to my gitlab-ci.yml like this:

    image: maven:latest
    services:
      - alpine/git:latest
    # Cache goes here
    cache:
      paths:
        - .m2/repository
        - frontend-app/node_modules/
    …