Docker images within Saagie

This page explains how Saagie uses Docker images to launch jobs and apps, and how Docker images must be built so that they work within Saagie.

This documentation assumes a basic understanding of Docker, including how to create a Docker image. To review Docker basics, refer to the Docker resources section at the end of this page.

1. How Saagie uses Docker images

Saagie jobs and apps are Docker images launched as Kubernetes jobs. Once a Saagie job or app terminates, whether it succeeds or fails, it will not be relaunched.

There are some key differences between Saagie jobs and apps:

  • Jobs work with the command line and packages.

  • Apps expose ports and mount volumes, then share that information with all instances of the app.

1.1. Launching Saagie jobs and apps

When a Saagie job or app is launched, elements are added to the Docker image.

For both jobs and apps, the following are added to the Docker image:

  • Environment variables (previously determined in the project or platform)

  • Configuration files mounted to standard paths (includes datalake configuration files and how tools, such as Kerberos, are configured)
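Because these variables are injected into the container environment, a job's command line or script can read them like any other shell variable. The sketch below is only an illustration: the names MY_DB_HOST and MY_DB_PORT and their values are hypothetical, not variables that Saagie defines, and MY_DB_HOST is exported by hand so the example runs outside Saagie.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical variable; in Saagie it would come from the project or platform settings.
# Exported here only so that this sketch runs outside Saagie.
export MY_DB_HOST="db.example.internal"

# Hypothetical optional variable, deliberately left unset to show the fallback.
unset MY_DB_PORT

# Required variable: abort with a clear message if it is missing or empty.
db_host="${MY_DB_HOST:?MY_DB_HOST is not set}"

# Optional variable: fall back to a default value when it is not defined.
db_port="${MY_DB_PORT:-5432}"

echo "Connecting to ${db_host}:${db_port}"
```

Failing early on a missing required variable tends to make a misconfigured job easier to diagnose than a failure deeper in the script.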

A Saagie job also adds the following to the Docker image:

  • If the chosen technology requires a package (for example, a .zip, .py, or .jar file), the package is mounted in the job.

    • Packages are uploaded when a job is configured.

    • In the command line entered during job configuration, Saagie replaces {file} with the package's absolute path.

    • The package is not altered (for example, it is not decompressed or renamed).

  • The command line is saved in the file main_script and mounted as executable on the path /sandbox/main_script.

    • The ENTRYPOINT of your Docker image must run this file if you use the COMMAND_LINE feature. This is discussed in more detail in the next section.
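The {file} substitution and the main_script file described above can be imitated locally. This is a minimal sketch, not Saagie's actual implementation: the command line, the package path /sandbox/my_job.py, and the use of a temporary directory in place of /sandbox are all assumptions made for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical command line, as a user might enter it when configuring a job.
command_line='python {file} --verbose'

# Hypothetical absolute path of the mounted package (the real path is chosen by Saagie).
package_path='/sandbox/my_job.py'

# Replace every literal {file} placeholder with the package path.
# Quoting the pattern variable makes bash treat it as literal text.
placeholder='{file}'
resolved="${command_line//"$placeholder"/$package_path}"

# Save the resolved command line as an executable main_script.
# A temporary directory stands in for /sandbox in this sketch.
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
printf '%s\n' "$resolved" > "$workdir/main_script"
chmod +x "$workdir/main_script"

cat "$workdir/main_script"
```

Note that only the command line changes; the package itself is left untouched, in line with the behavior described above.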

A Saagie app also adds the following to the Docker image:

  • Mounted directories corresponding to the volume selected when creating the app.

2. Create Saagie-compatible Docker images

For your Docker images to be compatible with Saagie, follow the guidelines below.

2.1. Jobs

When using the COMMAND_LINE feature, build your Docker image from a Dockerfile. The Dockerfile instructions and the entrypoint script that Saagie uses for its own Docker images are shown below.

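Dockerfile:

# Excerpt: the base image (FROM) instruction is not shown here.
# Copy the entrypoint script into the image and make it executable.
COPY entrypoint /entrypoint
RUN chmod 755 /entrypoint

WORKDIR /sandbox

ENTRYPOINT ["bash", "/entrypoint"]

entrypoint:

#!/bin/bash

set -euo pipefail

# Run the user's command line only if Saagie has mounted it.
if test -f /sandbox/main_script; then
  sh /sandbox/main_script
fi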

The command line is saved in the file /sandbox/main_script.

The ENTRYPOINT of the Docker image runs the bash script entrypoint, which checks that the /sandbox/main_script file exists.

If the file exists, the script runs it and, as a result, executes the command line entered by the user.
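To see this behavior without building an image, the same check can be run against a temporary directory. The sketch below is an illustration under assumptions: the function name run_main_script and the temporary directory standing in for /sandbox are hypothetical and not part of Saagie.

```shell
#!/bin/bash
set -euo pipefail

# Same check as the entrypoint script, parameterised on the directory
# so that it can run outside a container (hypothetical helper).
run_main_script() {
  local sandbox_dir="$1"
  if test -f "$sandbox_dir/main_script"; then
    sh "$sandbox_dir/main_script"
  fi
}

# A temporary directory stands in for /sandbox.
sandbox="$(mktemp -d)"

# Case 1: no main_script is present, so nothing is run.
empty_output="$(run_main_script "$sandbox")"

# Case 2: main_script holds a command line, which is then executed.
printf 'echo "hello from the job"\n' > "$sandbox/main_script"
job_output="$(run_main_script "$sandbox")"

echo "without main_script: '${empty_output}'"
echo "with main_script: '${job_output}'"

rm -rf "$sandbox"
```

One consequence of the existence check is that an image started without a mounted main_script exits without doing anything and without reporting an error.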

For more examples of how Saagie uses Docker images, refer to the Saagie repository.

2.2. Apps

There are currently no guidelines for creating Docker images for Saagie apps.

Check out these examples from Saagie’s official repository:

  • Zeppelin: no modification needed for the URL rewrite

  • Jupyter: modify the command line for the URL rewrite

  • RStudio: must use an NGINX server, which then manages URL rewrites

3. Docker resources

3.1. Best practices for Dockerfiles

Refer to Docker's official documentation on how to write a great Dockerfile.

3.2. Test Docker image compatibility

To confirm the compatibility and quality of your Docker images, test the container structure.

3.3. Saagie repository

Refer to official examples of Saagie jobs in Saagie’s repository.