How to run the first Spark job on a Kubernetes cluster
Apache Spark 2.3 brought initial native support for Kubernetes. With the recent release of Spark 2.4, the integration has been improved and client mode is now supported. Time to go through it step by step and run some simple Spark jobs.…
A jump host, also called a bastion host (on AWS, on GCP) or sometimes an edge node, is a common way to access computing resources that are not otherwise reachable. In this post I’ll explore how to create a Kubernetes pod that acts as such a jump server within the cluster and is then used to create more pods. I’ll need this later when running Spark on Kubernetes, but you can also use it to inspect the cluster’s internal network.…
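To make the idea of a jump pod a bit more concrete, here is a minimal sketch in Python that builds a Pod manifest and prints it as JSON. The pod name `jump-pod`, the image `ubuntu:18.04`, the `role: jump` label and the `sleep infinity` command are my assumptions for illustration, not necessarily what the full post uses.

```python
import json


def jump_pod_manifest(name="jump-pod", image="ubuntu:18.04", namespace="default"):
    """Build a minimal Pod manifest for a long-running jump pod.

    All default values here are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"role": "jump"},  # hypothetical label
        },
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Keep the container alive so we can open a shell in it later.
                    "command": ["sleep", "infinity"],
                }
            ],
            # A jump pod is a one-off helper, so do not restart it automatically.
            "restartPolicy": "Never",
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(jump_pod_manifest(), indent=2))
```

Saved as, say, `jump_pod.py` (a hypothetical file name), the output can be piped into `kubectl apply -f -`, and `kubectl exec -it jump-pod -- bash` then opens a shell inside the cluster. Creating further pods from inside the jump pod would additionally need a service account with the right permissions, which this sketch leaves out.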