Docker¶
Setup¶
- https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04
sudo apt install docker.io
Since docker gonna eat up lot of disk space, it is idel to use HDD instead of SSD in case if you happened to ahve one!
Below are the steps to update the root folder
of Docker download path to store the files.
sudo systemctl stop docker
sudo mv /var/lib/docker/ /opt/binaries/
sudo rm -rf /var/lib/docker
sudo ln -s /opt/binaries/docker/ /var/lib/docker
sudo vim /etc/docker/daemon.json\:
{
“data-path”: “/opt/binaries/docker”,
“graph”: “/opt/binaries/docker”
}
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo ls /opt/binaries/docker
mageswarand@IMCHLT276:/opt/binaries/docker$ ls
builder buildkit containers image network overlay2 plugins runtimes swarm tmp trust volumes
It is very common to face network issues with docker, better equip with the basics @ https://pythonspeed.com/articles/docker-connection-refused/
Build our images¶
NER Image
docker build --network host -f docker/api/Dockerfile -t spacy-flask-ner-python:latest .
# start the app
docker run -it spacy-flask-ner-python:latest /bin/bash
docker run -d -p 5000:5000 spacy-flask-ner-python
# on a seprate terminal
curl -i -H "Content-Type: application/json" -X POST -d '{"text":"Ram read a book on Friday 20/11/2019"}' http://127.0.0.1:5000/spacy/api/v0.1/ner
Structured Streaming Image
Build
docker build --network host -f docker/ssp/Dockerfile -t sparkstructuredstreaming-pg:latest .
Run
docker run -v $(pwd):/host/ --hostname=$(hostname) -p 50075:50075 -p 50070:50070 -p 8020:8020 -p 2181:2181 -p 9870:9870 -p 9000:9000 -p 8088:8088 -p 10000:10000 -p 7077:7077 -p 10001:10001 -p 8080:8080 -p 9092:9092 -it sparkstructuredstreaming-pg:latest
Login into bash shell:
# first time
docker run -v $(pwd):/host/ --hostname=$(hostname) -p 50075:50075 -p 50070:50070 -p 8020:8020 -p 2181:2181 -p 9870:9870 -p 9000:9000 -p 8088:8088 -p 10000:10000 -p 7077:7077 -p 10001:10001 -p 8080:8080 -p 9092:9092 -it sparkstructuredstreaming-pg:latest /bin/bash
# to get bash shell from running instance
docker exec -it $(docker ps | grep sparkstructuredstreaming-pg | cut -d' ' -f1) bash
We are mounting current directory as a volume inside the container, so make sure you trigger from repo base directory, so that following steps works.
Misc¶
Common commands
https://www.edureka.co/blog/docker-commands/
start the services
sudo service docker start
or
systemctl start docker
systemctl enable docker
build the image
docker build --network host -f docker/api/Dockerfile -t spacy-flask-ner-python:latest .
run docker in interactive mode
docker run -ti spacy-flask-ner-python /bin/bash
start the app
docker run -d -p 5000:5000 spacy-flask-ner-python
list containers
docker container ls -a
remove/delete Docker images
docker rmi id#
docker images -f dangling=true
docker system prune
docker images purge
stop all the services/containers
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
When there is a change in the python code base, we obviously have to rebuild the docker image, isn’t? Use following steps to do so:
docker container ls
docker stop {id}
docker rm {id}
docker build ...
# for multiple shells for same container
docker exec -it <container> bash
Mount Host Folder
sudo apt-get install virtualbox-guest-x11
sudo mount -t vboxsf /opt/vlab/spark-streaming-playground/ /mnt/dockerfolder
References
https://www.bogotobogo.com/DevOps/DevOps-Kubernetes-1-Running-Kubernetes-Locally-via-Minikube.php
https://blog.adriel.co.nz/2018/01/25/change-docker-data-directory-in-debian-jessie/
https://rominirani.com/docker-tutorial-series-part-7-data-volumes-93073a1b5b72
https://medium.com/rahasak/kafka-and-zookeeper-with-docker-65cff2c2c34f
https://github.com/sameersbn/docker-postgresql
https://github.com/kibatic/docker-single-node-hadoop/
https://github.com/bbonnin/docker-hadoop-3/blob/master/Dockerfile