Spark on K8s
Kubernetes, which originated at Google, is now widely used as a cluster manager, so our client wanted to try running Spark applications on it.
Requirements
Set up a local Kubernetes cluster to run the Spark examples.
Implementation
Read the Spark Kubernetes docs.
Make sure the Spark version is 2.4.5 or later for this to work seamlessly (the example below uses the 2.4.5 examples jar).
Handy Links
A Makefile is provided that downloads all the required binaries, builds the Spark Docker image, and uses that image to run the example locally.
cd /path/to/spark-streaming-playground/kubernetes/spark/
Install k8s tooling locally, start minikube, initialize helm and deploy a docker registry chart to your minikube:
make
If everything goes well, you should see a message like this:
Registry successfully deployed in minikube. Make sure you add 192.168.99.100:30000 to your insecure registries before continuing. Check https://docs.docker.com/registry/insecure/ for more information on how to do it in your platform.
Simply put, you need to add an entry as follows:
sudo vim /etc/docker/daemon.json # and add the following entry
{
"insecure-registries" : ["192.168.99.100:30000"]
}
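The manual edit above can also be scripted. The following is a minimal sketch, assuming no daemon.json exists yet. The function name `write_insecure_registry_config` is hypothetical and not part of the Makefile. It deliberately refuses to overwrite an existing file, because merging into existing JSON safely needs a proper JSON tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project's Makefile): writes a minimal
# Docker daemon config that lists a single insecure registry.
# Usage: write_insecure_registry_config <config-file> <registry-host:port>
write_insecure_registry_config() {
  config_file="$1"
  registry="$2"

  # Refuse to clobber an existing config; merge by hand in that case.
  if [ -e "$config_file" ]; then
    echo "refusing to overwrite existing $config_file" >&2
    return 1
  fi

  printf '{\n  "insecure-registries": ["%s"]\n}\n' "$registry" > "$config_file"
}

# Example: write_insecure_registry_config ./daemon.json 192.168.99.100:30000
# then copy ./daemon.json to /etc/docker/daemon.json with sudo.
```

Writing to a scratch file first and copying it into place keeps the privileged step to a single `sudo cp`.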
Restart Docker, then check the daemon settings:
sudo systemctl daemon-reload
sudo systemctl restart docker
docker info
You should see the following in the output:
Insecure Registries:
192.168.99.100:30000
127.0.0.0/8
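Rather than reading the `docker info` output by eye, the check can be automated. This is a rough sketch, assuming the output keeps the layout shown above (an "Insecure Registries:" header followed by one registry per line). The function name `has_insecure_registry` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `docker info` output on stdin and succeeds only
# if the given registry appears in the "Insecure Registries:" list.
# Usage: docker info | has_insecure_registry <registry-host:port>
has_insecure_registry() {
  awk -v reg="$1" '
    /Insecure Registries:/ { in_block = 1; next }
    in_block {
      line = $0
      gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding whitespace
      if (line == reg) { found = 1; exit }
      # Registry entries contain no spaces; a "Key: value" line ends the list.
      if (line ~ / /) { in_block = 0 }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Example:
#   docker info | has_insecure_registry 192.168.99.100:30000 && echo "registry is listed"
```

Matching whole lines inside the list avoids false positives from the same address appearing elsewhere in the output.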
Push the Spark images to our private Docker registry:
make docker-push
HINT: if you see “Get https://192.168.99.100:30000/v2/: http: server gave HTTP response to HTTPS client”, go back and check that the registry is listed in your insecure registries.
Once your images are pushed, let’s run a sample Spark job (first in client mode):
$SPARK_HOME/bin/spark-submit \
--master k8s://https://$(minikube ip):8443 \
--deploy-mode client \
--conf spark.kubernetes.container.image=$(./get_image_name.sh spark) \
--class org.apache.spark.examples.SparkPi \
$SPARK_HOME/examples/jars/spark-examples_2.11-2.4.5.jar
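The examples jar name encodes both the Scala version (2.11) and the Spark version (2.4.5), so a different Spark install will not have a file at that exact path. A small pre-flight check can catch this before submitting. This is only a sketch, and the function name `find_examples_jar` is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm that the examples jar passed to
# spark-submit exists under the given Spark home, and print its path.
# Usage: find_examples_jar <spark-home> <scala-version> <spark-version>
find_examples_jar() {
  jar="$1/examples/jars/spark-examples_$2-$3.jar"
  if [ -f "$jar" ]; then
    printf '%s\n' "$jar"
  else
    echo "examples jar not found: $jar" >&2
    return 1
  fi
}

# Example: find_examples_jar "$SPARK_HOME" 2.11 2.4.5
```

If the check fails, listing `$SPARK_HOME/examples/jars/` shows which versions are actually installed.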
Limitations / TODOs
Explore more on the Kubernetes driver options
Explore how to run the example on AWS
References
https://tech.olx.com/running-spark-on-kubernetes-a-fully-functional-example-and-why-it-makes-sense-for-olx-d56b6a61fcbe
https://github.com/olx-global/spark-on-k8s