# Packaging & Docker

How do we package software so that it executes in a consistent way on a user's computer? Remember that our ultimate goal is to produce software that users can easily install and use. It's not as simple as just giving users an executable; there are often other files that need to be installed in specific locations in order for your application to run properly.

# Installers

The industry solution for a long time has been to generate an installer: a program that installs your program and all related libraries and assets.

When run, an installer will typically perform tasks like:

  • Create a folder for your executable program and copy it to that location.
  • Install any required dependencies e.g., system libraries that your application might use. These are typically installed to a central location e.g., C:\Windows\System32 on Windows, or /Library on macOS.
  • Install any other user-facing assets, like documentation.
  • Register your program as needed so that it shows up in the Start Menu (Windows) or Applications folder (macOS).

Gradle has built-in support for creating installers that will perform these steps for you. The actual details vary based on the type of application you are building.

# Command-line applications

A console or command-line application is typically deployed as a JAR file containing all classes and dependencies. The Gradle application plugin can generate this JAR file and package it with a script that launches your application.

In Gradle:

Tasks > distribution > distZip or distTar

This will produce a ZIP or TAR file in build > distributions. If you uncompress it, you will find a lib directory of JAR file dependencies, and a bin directory with a script that will execute the program. To install this, you could place the contents into a folder and add the bin directory to your $PATH.
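
Here's a minimal sketch of that workflow from the command line, assuming a project that uses the application plugin and produces a distribution named myapp (the project and path names are placeholders for your own build):

$ ./gradlew distZip

$ unzip -l build/distributions/myapp.zip
# bin/myapp       <- launch script (myapp.bat on Windows)
# lib/myapp.jar   <- your application JAR
# lib/*.jar       <- dependency JARs

# "install" it by unpacking and adding bin to your PATH
$ unzip build/distributions/myapp.zip -d ~/apps
$ export PATH="$HOME/apps/myapp/bin:$PATH"
$ myapp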

# Desktop/Compose applications

GUI requirements are more complex: you need to generate a platform-specific executable that can install libraries and other assets into the locations on the file system appropriate for that operating system. For this reason, you need a more complex installer than we might use for a command-line application.

We use the term installer to refer to software that installs other software. For example, if you purchase and download a Windows application, it will typically be delivered as an MSI file; you execute that, and it will install the actual application in the appropriate location (after prompting you for installation location, showing license terms etc).

Compose Multiplatform includes support for generating platform-specific installers.

In Gradle:

Tasks > compose desktop > packageDistributionForCurrentOS

This task will produce an installer for the platform where you are executing it e.g., a PKG or DMG file on macOS, or an MSI installer on Windows.
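
For example, running the task from the command line (the output locations below are typical for recent versions of the Compose plugin; check your own build output to confirm):

$ ./gradlew packageDistributionForCurrentOS
# installers are written under build/compose/binaries/main/
# e.g., build/compose/binaries/main/dmg/MyApp-1.0.0.dmg on macOS
#       build/compose/binaries/main/msi/MyApp-1.0.0.msi on Windows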

# Services

Services are more complex. A web service is usually bundled into an executable JAR file (with an embedded web server) or a WAR file that is deployed to an external web server. You will likely need a custom solution depending on how you intend to deploy the service.
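
For example, with the Spring Boot Gradle plugin (which the web service later on this page uses), the build produces an executable JAR that you can run directly; the task and output name below are one possible setup, not the only one:

$ ./gradlew bootJar
$ java -jar build/libs/service-docker.jar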

# Where Installers Fail

Installers have limits; they assume that you are installing your software into an environment where it will run successfully. However, your operating environment can affect how software executes:

  • You may have tested on a different version of the operating system than the user has, so your software may work differently on their system.
  • Your application might rely on other software to be installed e.g., sed or a particular version of the bash shell.
  • The runtime environment itself might need to be configured in a specific way for your application to run correctly e.g., with environment variables holding private keys, like AWS_SECRET='KASDJFTG_&JGJMHGF_!@GHHY@', or specific network configurations or other complex runtime details.

It's not uncommon for software to be deployed using an installer, and then fail to run due to one of these concerns; something in the user's environment is interfering with it or preventing it from running properly.

How do we fix this? We specify and control the deployment environment. Ideally, we figure out how to specify the runtime environment so that it's identical to the environment where our software is designed to run properly.

# Virtualization

Let's consider virtualization as one solution.

Virtualization uses software to create an abstraction layer over computer hardware, enabling the division of a single computer's hardware components—such as processors, memory and storage—into multiple virtual machines (VMs). Each VM runs its own operating system (OS) and behaves like an independent computer, even though it is running on just a portion of the actual underlying computer hardware. -- IBM (2024).

We can see an example of this in the diagram below.

[Figure: Deployment models]

Standalone: This is a typical deployment environment, where there is a single operating system running on dedicated hardware. Software is installed directly into the operating system.

  • Applications share resources, which the OS has to allocate and manage.
  • There are security challenges when applications are installed together (i.e., "what if another application is malicious; can it steal my data?").

Virtualization: Multiple virtual machines can be run on the same hardware. Each one is an abstraction of a physical machine, with its own resources and dependencies.

  • Each virtual machine runs a complete OS. This can be resource intensive, since each VM is allocated its own memory, its own CPU cycles, etc.
  • Provides the ability to adjust how physical resources are shared across VMs (e.g. if we had 128 GB of RAM, we could split it among VMs in any way that made sense).
  • Provides isolation of each application into its own OS instance (i.e. improved security).

Container: an isolated environment for running an application.

  • Run applications (not the OS) in isolation; containers run on the same underlying host OS.
  • The abstraction is at a lower level; the host OS schedules CPU and other resources across containers rather than VMs.
  • Containers are just processes that use the host OS to run an image containing the application.
  • Compared to a VM, this is very lightweight and fast to start up.

There are significant advantages to using containers over virtual machines:

  • Containers are significantly smaller than virtual machines, and use fewer hardware resources.
  • You can deploy containers anywhere, on any physical and virtual machines and even on the cloud.
  • Containers are lightweight and easy to start/stop and scale out.

A container represents a fixed, reproducible environment everywhere that you deploy it. It also provides a simple way to abstract/save/version a complete working environment and not just the source code that you use to build your software.

# Docker

Docker is a very popular containerization platform that we'll use to create runtime containers. Installing Docker provides you with the container runtime described above, plus the tools to create and deploy your own containers.

# Installation

Download and install directly from the Docker website, or your favorite package manager. Make sure to install the correct version for your system architecture (I'm looking at you, Apple ARM).

[Figure: Docker install options]

Check that it's installed and available on your path.

$ docker version
Client: Docker Engine - Community
 Version:           27.3.1
 API version:       1.47
 Go version:        go1.23.1
 Git commit:        ce1223035a
 Built:             Fri Sep 20 11:01:47 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.34.3 (170107)
 Engine:
  Version:          27.2.0
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.21.13
  Git commit:       3ab5c7d
  Built:            Tue Aug 27 14:15:41 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.7.20
  GitCommit:        8fc6bcff51318944179630522a095cc9dbf9f353
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

# Concepts

To use Docker, you create a Dockerfile -- a configuration file that describes the runtime environment.

You then use Docker to build that Dockerfile into an image of your application plus its environment. You can think of an image as a template that you can use to create running instances of your application. You can upload Docker images to a registry so that other people can download and use them (not required, but supported as a means of distributing images).

DockerHub is Docker's public registry. It's simple to use Docker tools to upload your final image there, and for other people to pull and run your application through that image. You can think of DockerHub as GitHub for Docker images.

Finally, a container is a running instance of that image.
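
These concepts map directly onto the core Docker commands that we'll use in the steps below (the image and repository names here are placeholders):

$ docker build -t myimage .       # Dockerfile -> image
$ docker run myimage              # image -> running container
$ docker push username/myimage    # image -> registry (e.g., DockerHub)
$ docker pull username/myimage    # registry -> local image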

# How do we use it?

This is the basic workflow for creating a Docker image for your application.

  1. Build an image from your application and a Dockerfile.
  2. Tell Docker to run this image in a container if you wish to run it locally.
  3. Upload the image to a Docker registry, which allows someone else to download and run it on a different system.

A Docker image contains everything that is needed to run an application (see the sketch after this list):

  • a cut-down OS (userland files and libraries; the kernel itself is shared with the host)
  • a runtime environment e.g., the JVM
  • application files
  • third-party libraries
  • environment variables
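
As an illustration, each of these items corresponds to a Dockerfile instruction. The sketch below uses placeholder names (eclipse-temurin:17-jre, app.jar, LOG_LEVEL), but it mirrors the real Dockerfile we'll write in Step 2:

# eclipse-temurin:17-jre provides the cut-down OS plus a JVM runtime
FROM eclipse-temurin:17-jre

# application files and bundled third-party libraries
COPY app.jar /app/

# environment variables visible to the running application
ENV LOG_LEVEL=info

WORKDIR /app
CMD java -jar app.jar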

Let's build a simple application, and then turn it into a Docker image.

# Step 1: Write a program

// Hello.kt
fun main() {
  println("Hello Docker!")
}
$ kotlinc Hello.kt -include-runtime -d Hello.jar
$ java -jar Hello.jar 
Hello Docker!

We now have an application packaged into a JAR file.

# Step 2: Create a Dockerfile

To deploy this JAR file, we need to create a Docker image that contains it. To do this, we'll create a Dockerfile in the same directory as your JAR file above.

Dockerfile

# start with this base image: a Linux distribution with OpenJDK 17 installed
FROM openjdk:17

# copy your Hello.jar file into the /app directory inside the image
# at runtime, the container's filesystem will expose this /app directory
COPY Hello.jar /app/

# set /app as your working directory and `cd` to it
WORKDIR /app

# run the application
CMD java -jar Hello.jar

You can find suitable Docker images on https://hub.docker.com. In this case, we're using openjdk:17 as our base image (a Linux + JDK installation).

We're done creating the Dockerfile. Next we need to use this to create an image.

# Step 3: Create a Docker image

To create the image, run this command in the directory containing the Dockerfile and the JAR file:

$ docker build -t hello-docker .
  • -t assigns a name (and optionally a tag) to the image; the tag defaults to latest (see the example below).
  • hello-docker is the name that will be assigned to our image.
  • . sets the build context to the current directory, making its contents (the Dockerfile and JAR file) available to the build.
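
If you want an explicit version rather than latest, include the tag in the name (hello-docker:1.0 is just an example):

$ docker build -t hello-docker:1.0 .

$ docker images hello-docker
# lists every tag you've built for the hello-docker repository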

To see the image that we've created:

$ docker images
REPOSITORY     TAG       IMAGE ID       CREATED         SIZE
hello-docker   latest    a615e715b56d   7 seconds ago   455MB

To run our image:

$ docker run hello-docker
Hello Docker!

To run a long-running program (i.e., a program that doesn't halt after execution) in the background, use the -d (detached) flag.

$ docker run -d hello-docker-no-halt
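
A detached container keeps running in the background, so a few standard Docker commands are useful for managing it (the container ID is whatever docker ps reports; hello-docker-no-halt is the hypothetical image from above):

$ docker ps                   # list running containers and their IDs
$ docker logs <container-id>  # view the output of a detached container
$ docker stop <container-id>  # stop the container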

# Step 4: Publish your image (optional)

To make this image available to other systems, you can publish it to the Docker Hub, so that it's available to download. See Docker repos documentation for more details.

  1. Create an account on Docker Hub if you haven't already, and log in (docker login).
  2. Create a repository to hold your images.
  3. Tag your local image with your username/repository.
  4. Push your local image to that repository.
$ docker image ls
REPOSITORY     TAG       IMAGE ID       CREATED         SIZE
hello-docker   latest    f81c65fd07d3   3 minutes ago   455MB

$ docker tag f81c65fd07d3 jfavery/cs346

$ docker push jfavery/cs346:latest
The push refers to repository [docker.io/jfavery/cs346]
5f70bf18a086: Pushed
8768f51fa877: Pushed
5667ad7a3f9d: Pushed
6ea5779e9620: Pushed
fb4f3c9f2631: Pushed
12dae9600498: Pushed
latest: digest: sha256:6ddd868abde318f67fa50e372a47d4a04147d29722c4cd2a59c45b97a413ea22 size: 1578

# Pull your image to a new machine

To pull (download) this image to a new machine, use docker pull.

$ docker pull jfavery/cs346
Using default tag: latest
latest: Pulling from jfavery/cs346
0509fae36eb0: Pull complete
6a8d9c230ad7: Pull complete
0dffb0eed171: Pull complete
77de63931da8: Pull complete
dc36babb139f: Pull complete
4f4fb700ef54: Pull complete
Digest: sha256:6ddd868abde318f67fa50e372a47d4a04147d29722c4cd2a59c45b97a413ea22
Status: Downloaded newer image for jfavery/cs346:latest
docker.io/jfavery/cs346:latest

Once it's downloaded, you can run it normally.

$ docker images
REPOSITORY      TAG       IMAGE ID       CREATED          SIZE
jfavery/cs346   latest    f81c65fd07d3   10 minutes ago   455MB

$ docker run jfavery/cs346
Hello Docker!

# Persisting data

When you launch a container, it does the following:

  • Creates a new environment from the image.
  • Initializes the container with its own mutable environment and data.
  • Runs the program (specified by CMD).

This works great, until you stop the container; when you restart it, the environment is recreated, and you lose any previous data!

How do we avoid this problem? You create a volume on the host OS, outside the scope of the container, and then provide the container access to the volume. For example, we can create a data file that will persist after container restarts.

# create a volume on the host
# we attach it at runtime below
$ docker volume create data-storage

# data-storage is the volume we created
# /data is a container directory that maps to the volume
# it gets mounted in our container and visible to the application
$ docker run -v data-storage:/data jfavery/cs346

You then just need to ensure that your application can read/write from this volume (at the location specified in the container).
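
For example, here's a minimal Kotlin sketch of an application that keeps its state under /data, the mount point used in the docker run command above (the file name counter.txt is just an example):

import java.io.File

fun main() {
    // /data is where the data-storage volume is mounted inside the container
    val file = File("/data/counter.txt")

    // read the previous value (if any), increment it, and write it back;
    // the value survives container restarts because it lives on the volume
    val count = (file.takeIf { it.exists() }?.readText()?.trim()?.toIntOrNull() ?: 0) + 1
    file.writeText(count.toString())

    println("This container has been started $count time(s).")
}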

# Web services

Containers are commonly used to deploy server applications, including web services. Services have unique requirements compared to standard applications -- namely the need to manage network requests that originate from outside the container. Docker can handle this, with some additional configuration.

The following example is a service that handles GET and POST requests. The service is designed to listen on port 8080, so we need to ensure that our container maps this port properly from the host environment to the container environment.

# Dockerfile
FROM openjdk:17
VOLUME /tmp
EXPOSE 8080
ARG JAR_FILE=build/libs/service-docker.jar
ADD ${JAR_FILE} app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
  • FROM: the starting image (Linux + JVM)
  • VOLUME: mapping an external volume for file storage
  • EXPOSE: the port that our application will listen on
  • ARG: a build-time argument (JAR_FILE) pointing to our application's JAR file
  • ADD: copy our JAR file into the image as app.jar, which is what will be executed
  • ENTRYPOINT: how to run the JAR file

To build the Docker image:

$ docker build -t docker-service .

[+] Building 1.9s (8/8) FINISHED
 => [internal] load build definition from Dockerfile
 => => transferring dockerfile: 69B
 => [internal] load .dockerignore
 => => transferring context: 2B
 => [internal] load metadata for docker.io/library/openjdk:17
 => [auth] library/openjdk:pull token for registry-1.docker.io
 => [internal] load build context
 => => transferring context: 57.63MB
 => CACHED [1/2] FROM docker.io/library/openjdk:17@sha256:9b448de897d211c9e0ec635a485650aed6e28d4eca1efbc34940560a480b3f1f
 => [2/2] ADD build/libs/service-docker.jar app.jar
 => exporting to image
 => => exporting layers
 => => writing image sha256:2593c79e75b19b36dd2b0ee16fca23753578fb6381fb6d14f5c5e44fc0162bb4
 => => naming to docker.io/library/docker-service

When we run the container, we need to specify that we want to map port 8080 from the outside environment into the container. We can do this using the -p command-line option:

$ docker run -p 8080:8080 docker-service
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v2.7.4)

2023-03-26 16:31:11.453  INFO 1 --- [           main] com.example.demo.DemoApplicationKt       : Starting DemoApplicationKt using Java 17.0.2 on d2a3849df55b with PID 1 (/app.jar started by root in /)
2023-03-26 16:31:11.455  INFO 1 --- [           main] com.example.demo.DemoApplicationKt       : No active profile set, falling back to 1 default profile: "default"

Our web service is now running in a container! We can now access the web service as if it were running locally on port 8080.
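
For example, you can reach the service from the host with curl (the endpoint path here is hypothetical; use whatever routes your service defines):

$ curl http://localhost:8080/
# returns whatever response your service's root endpoint provides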