Pavan Kalyan — October 9, 2021
Beginner Data Engineering Docker Guide

This article was published as a part of the Data Science Blogathon

Introduction

It is not difficult to create a machine learning model that operates on our computers. It is more difficult when you are working with a customer who wants to use the model at scale, that is, a model that can scale and perform on all types of servers all over the world. After you have finished designing your model, it may function smoothly on your laptop or server, but not so well on other platforms, such as when you move it to the production stage or a different server. Many things can go wrong, such as performance issues, the application crashing, or the application not being effectively optimized.

A machine learning model is often developed in a single programming language such as Python, but it will almost certainly need to connect with components written in other languages for data intake, data preparation, the front end, and so on. Docker makes it easier to handle all of these interactions because each microservice can be built in a different language, which allows for scalability and the quick addition or removal of independent services. Reproducibility, portability, ease of deployment, granular updates, a small footprint, and simplicity are all advantages of Docker.

Sometimes it is not the model that is the issue but the requirement to recreate the entire stack. Docker enables you to easily replicate the training and running environment for the machine learning model from any location. Docker allows you to package your code and dependencies into containers that can be transferred to different hosts, regardless of hardware or operating system.

Developers can use Docker to keep track of different versions of a container image, see who built each version and how, and roll back to prior versions. Finally, even if one of the services in your machine learning application is being upgraded, is being repaired, or is down, the rest of the application can continue to run. To update an output message used throughout the application, you do not have to update the whole application and disrupt other services.

Docker Logo
Image 1

Let’s dig in and start investigating Docker.

 

What is Docker?

It is a software platform that makes developing, executing, managing, and distributing applications easier. It accomplishes this by virtualizing the operating system of the computer on which it is installed.

Docker’s first release was launched in 2013.

Docker is written in the Go programming language.

Thanks to its rich set of features, Docker has been widely adopted by leading organizations and universities, such as Visa, PayPal, Cornell University, and Indiana University (to name just a few), to run and manage their applications.

Let us now look at the problem and the solution offered by Docker.

Problem

Let us imagine you want to host three separate Python-based applications on a single server (which could be either a physical or a virtual machine). Each of these programs uses a different version of Python, and the libraries and dependencies vary from application to application.

Hosting all three applications directly on the same machine is difficult, because installing and managing several Python versions and conflicting dependencies side by side on one system is error-prone.

Solution

Let’s see what we could do if we didn’t use Docker to tackle this problem. In this case, we might solve the problem with the help of three physical machines or by using a single physical computer that is powerful enough to host and run three virtual machines.

Both approaches would help us install various versions of Python, and their associated dependencies, on each of these machines.

Regardless of which solution we choose, the costs of purchasing and maintaining the hardware are substantial.

Let’s look at how Docker might be a viable and cost-effective solution to this issue.

To comprehend this, we must first examine its functionality.

Docker Host

Image 2

 

In simple terms, the system with Docker installed and running is referred to as a Docker Host or Host.

Whenever you deploy an application on the host, Docker creates a logical entity to host that application. In Docker terminology, this logical entity is known as a Container or a Docker Container.

A Docker Container does not have a full operating system installed or running in it. However, it does have its own virtual copy of the process table, network interface(s), and file system mount point(s).

These are inherited from the operating system of the host on which the container is running. The kernel of the host’s operating system, on the other hand, is shared by all the containers running on it.

This allows each container on the same host to be isolated from the others. As a result, numerous containers with different application requirements and dependencies can run on the same host, as long as their operating system requirements are the same.
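To make the three-application scenario concrete, here is a minimal sketch. It assumes Docker is already installed and that the official python images from Docker Hub are used. The function name python_version_in and the version tags mentioned below are illustrative choices, not something this article prescribes. The docker run command itself is covered later in this guide.

```shell
# Hypothetical helper: print the Python version inside a throwaway container.
# --rm removes the container again as soon as the command finishes.
python_version_in() {
    docker run --rm "python:$1" python --version
}
```

Calling python_version_in 3.8, python_version_in 3.9, and python_version_in 3.10 one after another should report three different interpreters on the same host, none of which has to be installed on the host itself.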

In other words, rather than virtualizing hardware components, Docker virtualizes the operating system of the host on which it is installed and running.

The next part, which addresses the advantages and downsides of adopting Docker, will help you understand how Docker helps to solve this challenge.

 

Pros and Cons of using Docker

Pros

  • Docker allows numerous applications with different requirements and dependencies to be hosted on the same host, as long as they use the same operating system.
  • Small footprint: containers can be as small as a few megabytes and occupy relatively little disk space, allowing many applications to be hosted on the same host.
  • Lightweight: there is no full operating system installed on a container. As a result, it uses very little memory compared to a virtual machine (which has a complete operating system installed and running on it). It also cuts the boot time to only a few seconds, whereas a virtual machine can take several minutes to start.
  • Lower cost: Docker is less demanding in terms of the hardware necessary to run it.

Cons

  • Applications with different operating system requirements cannot be hosted together on the same Docker Host. Let’s pretend we have four separate programs, three of which require a Linux-based operating system and one of which requires a Windows-based operating system. The three applications that require a Linux-based OS can share a single Docker Host. The application that requires a Windows-based OS must be on a separate Docker Host.

 

Docker Core Components

Docker Engine is one of the core components and is responsible for overall functioning.

It is a client-server application with three main components:

  • Server
  • REST API
  • Client

Components of Docker
Image 3

The Server runs the Docker daemon (dockerd), which is simply a long-running process. It is in charge of creating and managing Docker Images, Containers, Networks, and Volumes.

The REST API defines how applications can talk to the server and tell it what to do.

The Client is a command-line interface that allows users to communicate with Docker by issuing commands.
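The client-server split can be observed directly. The sketch below sends a request to the REST API without going through the docker client. It assumes a Linux host where the daemon listens on its default Unix socket, /var/run/docker.sock, and where curl is installed. The function name engine_version and the DOCKER_SOCKET variable are hypothetical names chosen for this example.

```shell
# Hypothetical helper: ask the Docker daemon for its version over the REST API.
# DOCKER_SOCKET is an illustrative override; the default path is an assumption
# that holds for a standard Linux installation.
engine_version() {
    curl --silent --unix-socket "${DOCKER_SOCKET:-/var/run/docker.sock}" \
        http://localhost/version
}
```

Running engine_version should print a JSON document describing the server. The user running it usually needs permission to access the socket.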

 

Docker Terminologies

Let’s have a look at some of the terms used in the Docker world.

Docker Images and Docker Containers are the two most important items you’ll encounter while working with Docker regularly.

In simple terms, a Docker Image is a template that contains the program and all the dependencies needed to run it on Docker.

A Docker Container, on the other hand, is a logical entity, as previously indicated. In more technical terms, it is a running instance of a Docker Image.

 

Docker Hub

Docker Hub is the official online repository where we can find all of the Docker Images that we can use.

If we like, we can also use Docker Hub to store and distribute our custom images. We could also make them public or private, depending on our needs.

Note: Free users can keep one Docker Image private. More than one requires a paid subscription.

 

Installation

Before we get our hands dirty with Docker, one last thing we need to know is that we need to have it installed.

The official Docker CE installation directions are linked below. These instructions for installing Docker on your PC are straightforward.

Do you wish to skip installation and start practicing Docker? 

If you don’t have the time to install Docker or don’t have enough resources on your PC, don’t panic: there’s a solution to your problem.

Play with Docker, an online playground for Docker, is the best place to start. It enables users to immediately practice Docker commands without the need to install anything on their PC. The best part is that it’s easy to use and completely free.

Docker Commands

It’s finally time to get our hands dirty with Docker commands.

docker create

The docker create command is the first command we’ll look at.

We can use this command to create a new container.

The following is the syntax for this command:

docker create [options] IMAGE [command] [arguments]

Please keep in mind that everything placed in square brackets is optional. This holds for all of the commands presented in this guide.

The following are some examples of how to use this command:

$ docker create fedora
02576e880a2ccbb4ce5c51032ea3b3bb8316e5b626861fc87d28627c810af03

The docker create command in the preceding example creates a new container using the most recent Fedora image.

Before creating the container, Docker checks whether the latest official Fedora image is available on the Docker Host. If the image is not available locally, Docker downloads it from Docker Hub and then creates the container from it. If the Fedora image is already present on the Docker Host, the container is created from that local image.

On successful creation of the container, Docker returns the container ID, as shown in the example above.

Every container is assigned a container ID. When performing operations on the container, such as starting, stopping, restarting, and so on, we refer to it by its container ID.

Let’s look at another example of the docker create command, this time with parameters and command supplied to it.

$ docker create -t -i ubuntu bash
30986b73dc0022dbba81648d9e35e6e866b4356f026e75660460c3474f1ca005

The docker create command in the preceding example builds a container using the Ubuntu image (if the image isn’t available on the Docker Host, it will download the most recent image from the Docker Hub before building the container).

The -t option tells Docker to allocate a terminal to the container, and the -i option keeps its standard input open, so that the user can interact with it. The trailing bash argument tells Docker to run the bash command every time the container starts.

docker ps

The docker ps command is the next one we’ll look at.

We can use the docker ps command to see all the containers currently executing on the Docker Host.

$ docker ps
CONTAINER ID   IMAGE    COMMAND   CREATED          STATUS              PORTS   NAMES
30986b73dc00   ubuntu   "bash"    45 minutes ago   Up About a minute           elated_franklin

It only shows the containers that are running on the Docker Host right now.

To view every container created on this Docker Host, whether it is currently running or not, use the -a option.

$ docker ps -a
CONTAINER ID   IMAGE    COMMAND       CREATED             STATUS          PORTS   NAMES
30986b73dc00   ubuntu   "bash"        About an hour ago   Up 29 minutes           elated_franklin
02576e880a2c   fedora   "/bin/bash"   About an hour ago   Created                 hungry_sinoussi

Let us go through the above output of the docker ps command.

CONTAINER ID: A unique alphanumeric string associated with each container.

IMAGE: The Docker Image used to create the container.

COMMAND: The command that runs inside the container when it starts.

CREATED: The time elapsed since the container was created.

STATUS: The current status of the container.

If the container is running, it displays Up along with the time elapsed (for example, Up About an hour or Up 5 minutes).

If the container has stopped, the status is Exited, followed by the exit status code in round brackets and the time elapsed since it stopped (for example, Exited (0) 2 weeks ago or Exited (137) 10 seconds ago). A container that has been created but never started shows Created, as the fedora container does above.

PORTS: The port mappings defined for the container.

NAMES: In addition to the CONTAINER ID, each container is given a unique name, so a container can be identified either by its container ID or by its name. By default, Docker generates and assigns a unique name to each container. If you wish to choose the name yourself, use the --name option with the docker create or docker run commands.

I hope this helps you better grasp what the docker ps command returns.
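The number in round brackets in the STATUS column is the exit status of the container’s main process, and it follows the usual Unix shell convention: a process terminated by a signal reports 128 plus the signal number. The sketch below reproduces the value 137 without Docker, by letting a plain shell process kill itself with signal 9 (SIGKILL). It only illustrates the convention and is not a description of Docker internals.

```shell
# Start a child shell that kills itself with signal 9 (SIGKILL)
# and capture the exit status that the parent shell sees.
status=0
sh -c 'kill -9 $$' || status=$?

# The parent shell reports 128 + 9 for a process ended by signal 9.
echo "exit status: $status"
echo "signal number: $((status - 128))"
```

On a typical Linux shell this prints an exit status of 137 and a signal number of 9, which is why Exited (137) usually means the container’s process was killed rather than finishing on its own.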

docker start

This command starts one or more stopped containers.

docker start [options] CONTAINER ID/NAME [CONTAINER ID/NAME…]

To start a container, you can specify either the first few unique characters of its container ID or its name.

For example:

$ docker start 30986
$ docker start elated_franklin
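Why a few leading characters are enough can be sketched in plain shell. The example below only illustrates prefix matching and is not Docker’s actual implementation. The function name resolve_prefix is hypothetical, and the three shortened IDs are taken from the example outputs in this article.

```shell
# Shortened container IDs taken from the examples in this article.
container_ids="30986b73dc00
02576e880a2c
30fa018c7268"

# Hypothetical helper: print the single ID that starts with the given prefix.
# It fails when the prefix matches no ID or more than one ID.
resolve_prefix() {
    matches=$(printf '%s\n' "$container_ids" | grep -- "^$1" || true)
    count=$(printf '%s\n' "$matches" | grep -c . || true)
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "$matches"
    else
        echo "prefix '$1' matches $count containers" >&2
        return 1
    fi
}
```

With these IDs, resolve_prefix 309 prints 30986b73dc00, while resolve_prefix 30 fails because two IDs begin with 30. That is why the prefix has to be long enough to be unique.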

docker restart

This command restarts one or more running containers.

docker restart [options] CONTAINER ID/NAME [CONTAINER ID/NAME…]

Similarly, we can restart a container by specifying the first few unique characters of its container ID or its name.

For example:

$ docker restart 30986
$ docker restart elated_franklin

docker stop

This command stops one or more running containers.

docker stop [options] CONTAINER ID/NAME [CONTAINER ID/NAME…]

Its syntax is similar to that of the start command.

You can specify the first few unique characters of the container ID or its name to stop the container.

For example:

$ docker stop 30986
$ docker stop elated_franklin

docker run

It first creates the container and then starts it. In summary, it is a combination of the docker create and start commands.

It has a similar syntax to docker create.

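docker run [options] IMAGE [command] [arguments]

$ docker run -d ubuntu
30fa018c72682d78cf168626b5e6138bb3b3ae23015c5ec4bbcc2a088e67520

In the above example, Docker creates a container from the latest Ubuntu image and starts it. The -d option runs the container in the background and prints its ID. Because no terminal is attached, the shell inside the container exits right away and the container stops immediately, so we do not get a chance to interact with it.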

To interact with the container, we need to pass the -it options to the docker run command.

$ docker run -it ubuntu
root@<container-id>:/#

Type exit in the terminal to leave the container.
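The statement that docker run combines docker create and docker start can be sketched as a small shell function. This is a rough equivalent for the interactive case, not a description of how Docker implements the command internally. The function name create_then_start is hypothetical. The -a and -i options of docker start attach the terminal to the container’s output and input.

```shell
# Hypothetical helper: do in two steps roughly what "docker run -it IMAGE" does.
create_then_start() {
    # Step 1: create the container and remember the ID that Docker prints.
    container_id=$(docker create -it "$@") || return 1
    # Step 2: start it, attaching output (-a) and keeping input open (-i).
    docker start -a -i "$container_id"
}
```

Calling create_then_start ubuntu should open the same kind of shell prompt as docker run -it ubuntu.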

docker rm

We use this command to delete a container.

docker rm [options] CONTAINER ID/NAME [CONTAINER ID/NAME...]

$ docker rm 30fa elated_franklin

In the above example, we instruct Docker to delete two containers with a single command. We specify the ID of the first container and the name of the second.

A container must be in a stopped state before it can be deleted.
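Because a running container cannot be deleted directly, stopping and deleting are often done together. The sketch below chains the two commands covered so far; the function name stop_and_remove is hypothetical. Docker also offers docker rm -f to force the removal of a running container, which is not used here.

```shell
# Hypothetical helper: stop the given containers, then delete them.
# docker rm only runs if docker stop succeeded.
stop_and_remove() {
    docker stop "$@" && docker rm "$@"
}
```

For example, stop_and_remove 30fa elated_franklin would stop both containers from the example above and then delete them.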

docker images

This command lists all the Docker images present on the Docker host.

$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
mysql        latest   7bb2586065cd   38 hours ago   477MB
httpd        latest   5eace252f2f2   38 hours ago   132MB
ubuntu       16.04    9361ce633ff1   2 weeks ago    118MB
ubuntu       trusty   390582d83ead   2 weeks ago    188MB
fedora       latest   d09302f77cfc   2 weeks ago    275MB
ubuntu       latest   94e814e2efa8   2 weeks ago    88.9MB

REPOSITORY: The name of the Docker image.

TAG: Each image is associated with a tag that identifies a version of the image. A tag can be a word, a set of numbers, or a mix of alphanumeric characters.

IMAGE ID: A unique string of alphanumeric characters associated with each image.

CREATED: The time elapsed since the image was created.

SIZE: The size of the image.

docker rmi

This command allows us to remove images from the docker host.

docker rmi [options] IMAGE NAME/ID [IMAGE NAME/ID...]

$ docker rmi mysql

The above command removes the mysql image from the Docker host.

The command below removes the httpd and fedora images from the Docker host.

$ docker rmi httpd fedora

The command below removes the image with ID 94e81 from the Docker host.

$ docker rmi 94e81

The command below removes the ubuntu image with the tag trusty.

$ docker rmi ubuntu:trusty
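The ubuntu:trusty form used above combines a repository name and a tag. When the tag is left out, Docker falls back to the tag latest, which is why a bare name such as mysql refers to mysql:latest. The following sketch shows that splitting rule in plain shell. The function name split_image_ref is hypothetical, and the sketch deliberately ignores more complex references such as registry hosts with a port number or digests.

```shell
# Hypothetical helper: split an image reference into repository and tag.
# The tag defaults to "latest" when the reference contains no colon.
split_image_ref() {
    ref=$1
    case $ref in
        *:*) repo=${ref%%:*}; tag=${ref#*:} ;;
        *)   repo=$ref; tag=latest ;;
    esac
    echo "repository=$repo tag=$tag"
}

split_image_ref ubuntu:trusty   # prints: repository=ubuntu tag=trusty
split_image_ref fedora          # prints: repository=fedora tag=latest
```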

These are some of the basic commands you will come across. There are numerous other commands to explore.

Wind Up

Although containerization has been around for a long time, it has only recently received the attention it deserves. Google, Amazon Web Services (AWS), Intel, and Tesla are just a few of the leading technology companies that have their own specialized container engines. They rely heavily on them to develop, run, administer, and distribute their software.

Docker is an extremely powerful containerization engine, and it has a lot to offer when it comes to building, running, managing and distributing your applications efficiently.

You have now seen Docker at a high level. There is a lot more to learn about Docker, such as:

  • Commands (more powerful commands)
  • Docker Images (building your own custom images)
  • Docker Networking (setting up and configuring networking)
  • Docker Stack (grouping the services required by an application)
  • Docker Compose (a tool for managing and running multiple containers)
  • Docker Swarm (grouping and managing one or more machines on which Docker is running)

If you’ve found this fascinating and want to learn more about it, I recommend enrolling in one of the courses listed below. They were educational and right to the point, in my opinion.

If you are a complete beginner, I recommend enrolling in this course, which has been prepared specifically for you.

If you have some Docker experience and are comfortable with the fundamentals but want to learn more, I recommend enrolling in this course, which focuses on advanced Docker subjects. It is a future-proof skill that is only now gaining traction.

Investing your time and money into studying Docker is not something you will regret.

End Notes

I hope you find this article helpful. Please feel free to share it. Thank you, have a great day.

 

Image Source:

  • Image 1: https://hub.docker.com/
  • Image 2: www.docker.com
  • Image 3: https://docs.docker.com/v17.12/engine/docker-overview/

 

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
