About Jobs

The Jobs page (icon: a 3D pyramid of three squares) lets you access the library of embedded jobs of a project.

What is a job? A job is a computation task performed inside projects on Saagie. Jobs run through a command line and can be launched individually or as part of a data pipeline.

Jobs are listed with some basic information, such as the name (a), technology used (b), job type (c), status (d), and last instance executed (e).

From the job library, you can also run a job (f), upgrade it, duplicate its Current version, move it to another project, go to its last instance, or delete it (g).

A clock icon (h) is displayed for scheduled jobs. Hover over the icon to view information about the next job run.

When you create a job, you select a category and a technology for it. The technologies that are available in each category depend on the technologies that you selected when you created the project.

If the technology you need does not appear in the list of available technologies, it was not selected when the project was created. You will either have to choose another technology or update the project settings to include the technology you need. Note also that the category and technology of a job cannot be changed after the job is created. If you want to change them, you will have to create a new job.

A job category does not impact the job execution; it only helps you organize your jobs. There are three default job categories:

Extraction: For jobs that retrieve data.
Processing: For jobs that process data.
Smart Apps: For jobs that use or expose data.

When you select your technology, it is important to note that the technologies have different requirements. For more information, see the following table:

Table 1. Job requirements by technology
Technology	File type	Default `shell` command
Bash	Any file type (Optional)	`echo "Saagie Bash"`
Generic	Docker image	none
Java/Scala	`.jar`	`java -jar {file} arg1 arg2`
Python	`.py` or `.zip`	`python {file} arg1 arg2`
R	`.r`	`Rscript {file} arg1 arg2`
Spark	`.jar`	`spark-submit --class=Main {file} arg1 arg2`
Sqoop	Any file type (Optional)	`driver=xxx` `host="x.x.x.x"` `port=xxx` `username="xxx"` `password="xxxx"` `database="xxxx"` `table="xxxx"` `hdfsdest=hdfs:///tmp/sqoop_import` `sqoop import --connect jdbc:$driver://$host:$port/$database --username $username --password $password --as-textfile -m 1 --target-dir $hdfsdest --table "$table"`
Talend	`.zip`	`sh {file} arg1 arg2`
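
As an illustration of how Table 1 can be read, the following minimal Python sketch encodes the file types and default commands of the file-based technologies and fills in the `{file}` placeholder. The names `JOB_REQUIREMENTS`, `accepts_file`, and `default_command` are hypothetical and not part of Saagie; the case-insensitive extension check is also an assumption.

```python
from pathlib import PurePath

# Values copied from Table 1. The dictionary layout and all names here are
# illustrative only; they are not a Saagie API.
JOB_REQUIREMENTS = {
    "Java/Scala": ((".jar",), "java -jar {file} arg1 arg2"),
    "Python": ((".py", ".zip"), "python {file} arg1 arg2"),
    "R": ((".r",), "Rscript {file} arg1 arg2"),
    "Spark": ((".jar",), "spark-submit --class=Main {file} arg1 arg2"),
    "Talend": ((".zip",), "sh {file} arg1 arg2"),
}


def accepts_file(technology, file_name):
    """Return True if the file extension matches the technology's file type."""
    extensions, _ = JOB_REQUIREMENTS[technology]
    # Assumption: extensions are compared case-insensitively.
    return PurePath(file_name).suffix.lower() in extensions


def default_command(technology, file_name):
    """Build the default command for a technology by replacing {file}."""
    if not accepts_file(technology, file_name):
        raise ValueError(f"{file_name!r} is not a valid file for {technology}")
    _, template = JOB_REQUIREMENTS[technology]
    return template.format(file=file_name)


if __name__ == "__main__":
    print(default_command("Python", "main.py"))
    print(default_command("Spark", "etl.jar"))
```

Bash, Generic, and Sqoop are left out of the sketch because they do not follow the single-file pattern: Bash and Sqoop accept an optional file of any type, and Generic relies on a Docker image.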

Click a job to access its:

Overview page
Instances page
Versions page

Overview Page

The Overview page (icon: a square divided into several other squares) gives general information on your job.

By default, the page opens when you click a job in the project’s job library.

Screenshot of the "Overview" page of a job.

The first part of the page (1) gives general information on the job, such as category, last instance details, version used, name, and alias.

It allows you to manage your job settings and upgrade your job to create a new version of it. It also allows you to duplicate the Current job version, move the job to another project, or delete it (a).

You can also view the logs of your job.

For embedded jobs, you can choose to view and download Saagie Logs, Pipeline Variable Logs, or Error Logs Only for each.

The second part of the page (2) gives information on job runtime and consumption through a timeline and various graphs.

You can use the timeline to see the execution time of the running and terminated job, along with the different types of status it has gone through. This allows you to determine the performance of your job. If it is not effective enough, you can optimize it accordingly. You can then use the graphs to check your job consumption during and after its execution.

Click on a job timeline step to see its details.
Click the info icon in the timeline caption to learn more about what happens while your instance is queued. You can view event details as soon as the Requested status has started.
Hover over the graph caption to see on which node the job has been executed.
Select a range on the graph or the timeline to zoom in on the selection. Click Reset range to display all details again.

Monitoring the RAM consumption of your job can help you anticipate potential memory issues. A job that consumes more than the available RAM limit goes into an Out Of Memory (OOM) state.

You can define a RAM limit for your job in its settings. If you have not defined a RAM limit, the job will run according to the overall RAM capacity of the node. In both cases, adjust the RAM limit for your node or job to ensure successful execution.
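
To complement the graphs, a Python job can also report its own peak memory so that you can compare it with the RAM limit set in the job settings. The following is a minimal sketch under stated assumptions: it uses the standard `resource` module (Unix only), the function names and the 512 MiB limit are hypothetical, and the value measured for the Python process may differ from the consumption shown by the platform for the whole job.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident memory of the current process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def memory_headroom_mib(limit_mib):
    """Return how many MiB remain between the peak usage and a RAM limit."""
    return limit_mib - peak_rss_mib()


if __name__ == "__main__":
    limit_mib = 512  # hypothetical RAM limit, to be aligned with the job settings
    print(f"Peak memory: {peak_rss_mib():.1f} MiB")
    print(f"Headroom before the limit: {memory_headroom_mib(limit_mib):.1f} MiB")
```

Printing these values at the end of a run writes them to the job logs, which gives a rough indication of how close the job came to the limit.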

For more information on monitoring your platform resources, see About the Monitoring Module.

The third part of the page (3) gives information on the pipelines related to the job and on the technology. It also gives information on the job package.

When you push a job via the Saagie CI/CD process, a link to the source code is added to the package of the job.

Figure 1. Link to the source code generated by CI/CD on a job.

You cannot change this link by hand. Also, if you change your job package via the Saagie platform, the link to the source code is removed as it is no longer relevant. However, it remains accessible from the corresponding version of your job.

Instances Page

The Instances page (icon: three overlapping squares) gives information on your job instances and allows you to keep track of all executed instances.

What is an instance? An instance is a single run of a job or pipeline in a project. The execution information and logs of all instances are saved on your platform.

Each time you run a job, you create a new instance of it. All instances are saved and remain accessible. They are listed on the right side of the page (1). You can see their information by selecting them in the list. You can also delete instances from this list.

By default, the page opens on the last executed instance of the selected job.

Screenshot of the "Instances" page of a job.

The first part of the page (2) gives general information on the job instance, such as the instance number (shown as #001 in the title) and execution details, including the status, start date, author, and end time. It also indicates the job version used.

The second part of the page (3) gives information about the job logs.

For embedded jobs, you can choose to view and download Saagie Logs, Pipeline Variable Logs, or Error Logs Only for each.

The third part of the page (4) gives information on the consumption and runtime of the selected job instance through a timeline and various graphs.

Click on a job timeline step to see its details.
Click the info icon in the timeline caption to learn more about what happens while your instance is queued. You can view event details as soon as the Requested status has started.
Hover over the graph caption to see on which node the job has been executed.
Select a range on the graph or the timeline to zoom in on the selection. Click Reset range to display all details again.

Versions Page

The Versions page (icon: a folder with an arrow pointing up) gives information on the version of your job. It also keeps track of all previous versions.

What is a version? A version is a single iteration of a job, pipeline, or app. Each new upgrade is stored as a version, so you can roll back to previous iterations and keep track of the changes that have been made.

Each time you upgrade a job, you create a new version of it. This version is automatically defined as the Current version.

All versions are saved and remain accessible. They are listed on the right side of the page (1), and you can see the information of a version by selecting it in the list. You can also delete versions from this list.

You can switch back and forth between versions by selecting a version from the list and clicking Rollback to this version (2). The selected version then becomes the new Current version.
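
The upgrade and rollback behavior described above can be pictured with a small Python model. This is only a conceptual sketch: the `Version` and `JobVersions` classes are hypothetical, they do not reflect how Saagie stores versions, and version deletion is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    number: int
    release_note: str


@dataclass
class JobVersions:
    """Hypothetical model of a job's version list and its Current version."""
    versions: list = field(default_factory=list)
    current: int = 0  # number of the Current version; 0 while no version exists

    def upgrade(self, release_note):
        """Store a new version and define it as the Current version."""
        version = Version(len(self.versions) + 1, release_note)
        self.versions.append(version)
        self.current = version.number
        return version

    def rollback(self, number):
        """Make an existing version the Current version again."""
        if not any(v.number == number for v in self.versions):
            raise ValueError(f"version {number} does not exist")
        self.current = number


if __name__ == "__main__":
    job = JobVersions()
    job.upgrade("Initial release")
    job.upgrade("Add retry logic")
    job.rollback(1)  # version 2 is kept, but version 1 is Current again
    print(job.current, len(job.versions))
```

In this model, a rollback only moves the Current marker: every version stays in the list, which matches the fact that you can switch back and forth between versions.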

You can define a version as major to highlight the most stable job version. Select a version from the list and click Set as major version (3). The version appears with the Major version label and sparks in front of its line. Click Unset as a major version to remove the label from a version.

By default, the page opens on the version of the job in use, tagged with the Current badge.

Screenshot of the "Versions" page of a job.

The first part of the page (4) gives general information on the version, such as the release note, creation date and creator, and execution context.

The second part of the page (5) displays:

For embedded jobs, information about the job package.