2023

April 2023

Here are the highlights of new and updated features for this release:

  1. Product Updates (2023.02)
    A new version of Saagie has been released with the following features:

    • New elements have been created to monitor resource consumption of pipelines and your cluster.

    • Pipeline functionality has been enhanced to include more advanced orchestration logic, such as conditions on environment variables and job status.

    • Saagie now supports Google Cloud Platform (GCP).

  2. Saagie’s Technology Repository Updates
    New technologies have been added.

Product Updates (2023.02)

Cluster and Pipeline Resource Monitoring

New resource monitoring elements have been added to monitor resource consumption of your cluster and pipelines.

At the cluster level, you can access the Operations module to see an overview of your cluster. This page displays the number of projects, jobs, pipelines, and apps created on each platform, as well as resource capacity metrics for CPU and RAM for each node in the platform.

In the Overview and Instances pages of pipelines, you can access graphs displaying runtime and resource consumption metrics.

This added focus on resource monitoring gives data engineers and platform administrators complementary views of clusters and pipelines, so they can track performance and better optimize resource usage on their platforms.

Smart Conditions in Pipelines

You can now create new types of conditions to build more relevant pipelines:

  • Conditions based on environment variables

  • Conditions based on job status

These new conditions let you build more advanced orchestration logic into your pipelines.
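
To illustrate the two condition types, here is a minimal Python sketch of how a condition could route a pipeline to one branch or another. It is not Saagie's implementation or syntax: the `Condition` class, the branch names `if_true` and `if_false`, the status strings, the `ROW_COUNT` variable, and the `ingest` job name are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# Hypothetical status values; the actual status names in Saagie may differ.
SUCCEEDED = "SUCCEEDED"
FAILED = "FAILED"


@dataclass(frozen=True)
class Condition:
    """A named predicate that selects one of two pipeline branches."""

    name: str
    predicate: Callable[[Mapping[str, str], Mapping[str, str]], bool]

    def next_branch(self, env: Mapping[str, str], statuses: Mapping[str, str]) -> str:
        # Evaluate the predicate against the environment variables and job statuses.
        return "if_true" if self.predicate(env, statuses) else "if_false"


# Condition based on an environment variable (hypothetical variable name).
enough_rows = Condition(
    name="enough_rows",
    predicate=lambda env, statuses: int(env.get("ROW_COUNT", "0")) >= 1000,
)

# Condition based on the status of a previous job (hypothetical job name).
ingest_ok = Condition(
    name="ingest_ok",
    predicate=lambda env, statuses: statuses.get("ingest") == SUCCEEDED,
)

env = {"ROW_COUNT": "2500"}
statuses = {"ingest": FAILED}
print(enough_rows.next_branch(env, statuses))  # if_true
print(ingest_ok.next_branch(env, statuses))    # if_false
```

In this sketch, both kinds of condition share one interface, so a pipeline runner would not need to know whether a branch depends on a variable or on a job outcome.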

For more information, see About Conditions in Pipelines.

Saagie With Google Cloud Platform (GCP)

Saagie is now available on Google Cloud Platform (GCP).

Saagie’s Technology Repository Updates

The following technologies have been added to the official Saagie technology repository:

  • Embedded and External Job Technologies

Technologies and new contexts:

  • Dataiku DDS (external job)
    • Datasets v11.0: EXPERIMENTAL
    • Scenarios v11.0: EXPERIMENTAL

  • dbt (external job)
    • 1.3: STABLE, RECOMMENDED
    • 1.4: STABLE, RECOMMENDED

  • Google Cloud Data Transfer (external job)
    • Amazon S3 transfer jobs: EXPERIMENTAL
    • GCS transfer jobs: EXPERIMENTAL

  • Google Cloud Dataflow (external job)
    • Clone job: EXPERIMENTAL
    • New job: EXPERIMENTAL

  • Python (embedded job)
    • 3.11: STABLE, RECOMMENDED

Do not forget to synchronize your Saagie repositories to keep them up to date.

January 2023

Here are the highlights of new and updated features for this release:

  1. Product Updates (2023.01)
    A new version of Saagie has been released with the following features:

    • New elements to monitor resource consumption have been created.

    • A new add-on, called Saagie Usage Monitoring, can be deployed as an app inside projects.

    • Pipeline functionality has been enhanced to allow context propagation between jobs in a pipeline.

    • Saagie will now be installed with a ready-to-use example project, whose goal is to provide an intelligent learning pipeline that detects sentiment in movie reviews.

    • Saagie now supports Kubernetes 1.23.x and 1.24.x.

    • The product version naming pattern has changed.

  2. Saagie’s Technology Repository Updates
    New technology versions and external job technologies have been added.

Product Updates (2023.01)

Resource Monitoring

❗ Work in Progress ❗

As node isolation is not fully operational, the section about the Monitoring module is here for information purposes only.

Stay tuned, this feature will be fully available for the upcoming release!

New resource monitoring pages have been added throughout Saagie to monitor resource consumption, from a platform level down to a specific item.

At the platform level, you can access the Monitoring module to see an overview of your platform.

This page displays the number of projects, jobs, pipelines, and apps created on the selected platform, as well as resource capacity metrics for CPU and RAM for each node in the platform.

If node isolation has not been configured for your platform, the Monitoring module will not be fully operational: no resource data will be displayed. However, you can still see the number of projects, jobs, pipelines, and apps.

In the Overview page of jobs and apps, you can access new graphs displaying runtime and resource consumption metrics for the last running instance.

Besides the resource consumption limits that can already be defined for jobs and apps, Saagie’s focus on monitoring will help data engineers and platform administrators quickly identify bottlenecks, debug memory-starved jobs and apps, and better optimize resource usage on the platform.

Saagie Usage Monitoring

The new Saagie Usage Monitoring add-on can be installed on your platforms as an app, to monitor:

  • The number of jobs and apps created, with their high-level metadata.

  • Metrics on the execution time and status of jobs and pipelines.

  • Metrics on the global usage of the storage volume associated with Saagie.

This app, based on Grafana, is available as an app technology in Saagie’s official technology repository and can be installed in any project.

This app requires some configuration to work. Click the information icon to display the README help file directly in Saagie.
As this app is designed to display cross-project metrics, Saagie recommends deploying it in a dedicated administration project.

For more information, see Saagie Usage Monitoring.

Context Propagation Between Jobs in Pipelines

In addition to existing environment variables that are set at the global or project levels, you can now create environment variables inside a pipeline and use them to transfer information between jobs during a pipeline execution.

Jobs can modify these variables dynamically as the pipeline execution progresses, and a table displays the input and output values of the variables for each job.

This feature allows you to build smarter pipelines and paves the way for conditions based on pipeline environment variables.
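
The propagation mechanism described above can be pictured with a short Python sketch. It only models the idea and makes no claim about how Saagie implements it. The `run_pipeline` function, the job functions `extract` and `train`, and the variable names `RUN_MODE`, `ROW_COUNT`, and `MODEL_SIZE` are hypothetical. Each job receives the current pipeline variables and returns the ones it wants to set or change. The runner records the input and output values per job, similar in spirit to the table mentioned above.

```python
from typing import Callable, Dict, List, Tuple

Variables = Dict[str, str]
# A job takes the current variables and returns the ones it sets or changes.
Job = Callable[[Variables], Variables]


def run_pipeline(
    jobs: List[Tuple[str, Job]], initial: Variables
) -> Tuple[Variables, List[dict]]:
    """Run jobs in order, propagating variables and recording input/output per job."""
    current = dict(initial)
    history = []
    for name, job in jobs:
        inputs = dict(current)
        # Pass a copy so a job can only change variables through its return value.
        updates = job(dict(inputs))
        current = {**inputs, **updates}
        history.append({"job": name, "input": inputs, "output": dict(current)})
    return current, history


def extract(variables: Variables) -> Variables:
    # Hypothetical job: publishes the number of rows it extracted.
    return {"ROW_COUNT": "2500"}


def train(variables: Variables) -> Variables:
    # Hypothetical job: reads the value set by the previous job.
    rows = int(variables["ROW_COUNT"])
    return {"MODEL_SIZE": "large" if rows > 1000 else "small"}


final, history = run_pipeline(
    [("extract", extract), ("train", train)],
    {"RUN_MODE": "daily"},
)
for row in history:
    print(row["job"], row["input"], "->", row["output"])
```

In this sketch, `train` sees `ROW_COUNT` only because `extract` ran before it in the same pipeline execution, which is the kind of information transfer between jobs that pipeline environment variables enable.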

For more information, see the Pipeline Overview Page.

Saagie Project Example

Saagie will now be installed with a ready-to-use example project, whose goal is to provide an intelligent learning pipeline that detects sentiment in movie reviews. It is accessible from your platform’s project library.

For more information, see Starting With the Saagie Project Example.

Kubernetes 1.23.x and 1.24.x Support

This new version of the Saagie installer is now also compatible with Kubernetes versions 1.23.x and 1.24.x.

For more information on supported versions of Kubernetes, see System Requirements.

Product Version Naming Convention

To make it clearer which product version you are using, the version name now follows a new naming convention made up of the year and the product version increment for that year. For this version, it is 2023.01.
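
As an illustration of this convention, the following Python sketch formats and parses version names of the form `YYYY.NN`. The function names are hypothetical. The two-digit, zero-padded increment is an assumption inferred from the examples 2023.01 and 2023.02, not a documented rule.

```python
import re
from typing import Tuple

# Assumed pattern: four-digit year, a dot, then a two-digit increment.
_VERSION_RE = re.compile(r"^(\d{4})\.(\d{2})$")


def format_version(year: int, increment: int) -> str:
    """Build a version name such as '2023.01' from a year and an increment."""
    return f"{year:04d}.{increment:02d}"


def parse_version(version: str) -> Tuple[int, int]:
    """Split a version name into (year, increment); raise ValueError if malformed."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a YYYY.NN version name: {version!r}")
    return int(match.group(1)), int(match.group(2))


print(format_version(2023, 1))   # 2023.01
print(parse_version("2023.02"))  # (2023, 2)
```

Because the parsed form is a `(year, increment)` tuple, two versions can be compared directly, for example `parse_version("2023.01") < parse_version("2023.02")`.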

Saagie’s Technology Repository Updates

The following technology versions and external job technologies have been added to the official Saagie technology repository:

  • Embedded and External Job Technologies

  • App Technologies

Embedded and external job technologies and new contexts:

  • Bash (external job)
    • debian11-bullseye: STABLE, RECOMMENDED

  • Java/Scala (embedded job)
    • 17: STABLE, RECOMMENDED

  • Talend (embedded job)
    • Use_Java_17: STABLE, RECOMMENDED

  • GCP Cloud Functions (external job)
    • Default: EXPERIMENTAL

  • GCP Cloud Run (external job)
    • Copy service: EXPERIMENTAL
    • New service: EXPERIMENTAL

App technologies and new contexts:

  • Apache Superset
    • 2.0: EXPERIMENTAL

  • Grafana
    • 9.2: STABLE, RECOMMENDED

  • Metabase
    • 0.44: STABLE

  • MLFlow Server
    • 2.0: STABLE

  • Saagie Usage Monitoring
    • For Saagie 3.x: STABLE

Do not forget to synchronize your Saagie repositories to keep them up to date.