2023
July 2023
Here are the highlights of new and updated features for this release:
- Product Updates (2023.03)
  The 2023.03 version of the Saagie product has been released with the following features:
  - You can now delete job instances and versions.
  - You can now duplicate a job.
  - Default values for CPU and RAM resources have been defined for all technologies, except for external technologies. In addition, these resource capacities are now enabled by default, with the predefined default values, when you create jobs and apps.
- Saagie Python API Documentation
  Read the documentation to use our Python package saagieapi and interact with the Saagie platform in Python.
- A patch has been released to handle ambiguous floating values in the attributes of the technology’s metadata.yaml files.
- Job executions lasting more than 15 minutes now end with an appropriate status, instead of Unknown.
- Pagination has been implemented on the app History page to improve page loading fluidity.
- Saagie Technology Repository Updates
  New technologies have been added and others deprecated.
Product Updates (2023.03)
Deleting a Job Instance
From the job’s Instances page, you can now delete instances and associated logs to streamline the list, improve your user experience, and maintain control over storage.
You can delete a single instance, a selection of instances, or a selection of instances based on status filters.
For more information, see Deleting Job Instances and Job Versions.
Deleting a Job Version
From the job’s Versions page, you can now delete versions to streamline the list and improve your user experience.
You can delete either a single version or a selection of versions.
For more information, see Deleting Job Instances and Job Versions.
Duplicating a Job
From the job library or its Overview page, you can now duplicate the Current version of your job. This saves you from having to start from scratch and improves your productivity.
For more information, see Duplicating a Job Version.
Default Resource Allocation for All Technologies and Contexts
To increase the reliability of job and app execution, better share limited resources with others, and guarantee simultaneous execution, our internal system has been enhanced.
Default values for CPU and RAM resources have been defined for all technologies and contexts in Saagie’s technology catalog, except for external technologies.
These values ensure greater platform stability. You can see the details by clicking the technology in Catalog > Repositories > Saagie.
These values also exist at the technology context level and can override the values defined at the technology level.
You can configure them when creating a job or app, or by modifying the Resources setting of your job or app.
In addition, the catalog schemas have been updated with new optional fields to add default values to the technologies in your custom repositories. If these fields are left blank, the default values will be 1 CPU and 500 MB RAM. For more information, see Type-Specific Attribute Tables.
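The precedence described above can be sketched as follows. This is only an illustration under stated assumptions, not Saagie's implementation: the `Resources` class and the `resolve_resources` function are hypothetical names, and the only values taken from these notes are the 1 CPU and 500 MB RAM fallback and the rule that context-level values can override technology-level ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical container: CPU in cores, RAM in MB. None means 'left blank'."""
    cpu: Optional[float] = None
    ram_mb: Optional[int] = None


# Fallback stated in the release notes for blank fields.
FALLBACK = Resources(cpu=1, ram_mb=500)


def _pick(context_value, technology_value, fallback_value):
    """Return the first value that is set, in order of precedence."""
    if context_value is not None:
        return context_value
    if technology_value is not None:
        return technology_value
    return fallback_value


def resolve_resources(technology: Resources, context: Resources) -> Resources:
    """Context-level values override technology-level ones; blanks fall back."""
    return Resources(
        cpu=_pick(context.cpu, technology.cpu, FALLBACK.cpu),
        ram_mb=_pick(context.ram_mb, technology.ram_mb, FALLBACK.ram_mb),
    )


# A technology defines both values; one context overrides only the RAM.
technology_defaults = Resources(cpu=2, ram_mb=1024)
context_defaults = Resources(ram_mb=2048)
effective = resolve_resources(technology_defaults, context_defaults)
```

In this sketch, `effective` keeps the technology's CPU value and takes the context's RAM value, while a technology and context that both leave the fields blank resolve to the fallback.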
Saagie Python API Documentation
You can use our Python package saagieapi, which implements Python API wrappers to interact with the Saagie platform in Python.
For more information, see Saagie Python API documentation.
Bug Fixes
Handle Ambiguous Floating Values
Each technology has its own metadata.yaml file composed of a variety of attributes requiring different types of values.
The parser is sensitive to float ambiguity when the attribute expects a value of type string.
This has a particular impact on the technology version number.
For example, if you have Python 3.10, it will be read as 3.1 and not 3.10.
To remove this ambiguity in version 2023.03 of the technology catalog, you must:
- Modify your technology’s metadata.yaml file by adding quotation marks to the value of attributes requiring a string value. For example, write id: "3.10" instead of id: 3.10.
- Duplicate the technology context. One of the versions will have the identifier 3.2 and will be marked DEPRECATED. The other version will be identical, but with the identifier 3.20.
This concerns all attributes requiring a string value.
Saagie’s official technology repository will be updated automatically without any action on your part.
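The ambiguity described above can be reproduced with a few lines of plain Python. This is a toy stand-in, not the Saagie parser or a real YAML library: the `read_scalar` function is hypothetical and only mimics the general behavior of reading an unquoted numeric value as a float while keeping a quoted value as a string.

```python
def read_scalar(raw: str):
    """Toy scalar reader (hypothetical): quoted values stay strings,
    unquoted values are read as floats whenever they look numeric."""
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        return raw[1:-1]          # quoted: keep the text exactly as written
    try:
        return float(raw)         # unquoted and numeric: becomes a float
    except ValueError:
        return raw                # unquoted and not numeric: plain string


unquoted = read_scalar('3.10')    # the float 3.1, the trailing zero is lost
quoted = read_scalar('"3.10"')    # the string "3.10", unchanged

# Once read as floats, 3.10 and 3.1 can no longer be told apart.
same_as_3_1 = read_scalar('3.10') == read_scalar('3.1')
```

Here `str(unquoted)` gives `3.1` while `quoted` is still `3.10`, which is why quoting string-typed attributes removes the ambiguity.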
Job Status Unknown
Jobs lasting more than 15 minutes were automatically assigned the Unknown status.
They now end with an appropriate status.
Loading App History
To solve performance issues on the app History page, pagination has been implemented.
Events are loaded progressively rather than all at once, improving page loading time and fluidity.
The timeline display on the app Overview page has been modified accordingly. If your app history contains too many events, only the most recent are displayed. The beginning of the timeline is partly grayed out to indicate that the oldest events cannot be displayed.
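As a rough illustration of the two behaviors above, progressive loading and showing only the most recent events, here is a generic sketch. It does not reflect how the Saagie front end is built: `iter_pages`, `most_recent`, and the page size and limit values are all hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_pages(events: Sequence[str], page_size: int) -> Iterator[List[str]]:
    """Yield events one page at a time instead of loading them all at once."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, len(events), page_size):
        yield list(events[start:start + page_size])


def most_recent(events: Sequence[str], limit: int) -> Tuple[List[str], bool]:
    """Return the last `limit` events and whether older ones were left out."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return list(events[-limit:]), len(events) > limit


history = [f"event-{i}" for i in range(5)]          # oldest first
pages = list(iter_pages(history, page_size=2))      # three pages: 2 + 2 + 1
shown, truncated = most_recent(history, limit=3)    # last three events only
```

The `truncated` flag plays the role of the grayed-out section of the timeline: it signals that older events exist but are not displayed.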
Saagie Technology Repository Updates
The following technologies have been added or deprecated in the Saagie official technology repository:
Technology | New contexts | Deprecated contexts |
---|---|---|
Bash | | - |
Python | - | |

Technology | New contexts |
---|---|
Airbyte | |
VS Code | |

Do not forget to synchronize your Saagie repositories to keep them up to date.
April 2023
Here are the highlights of new and updated features for this release:
- Product Updates (2023.02)
  The 2023.02 version of the Saagie product has been released with the following features:
  - New elements have been created to monitor resource consumption of pipelines and your cluster.
  - Pipeline functionality has been enhanced to include more advanced orchestration logic, such as conditions on environment variables and job status.
  - Saagie now supports Google Cloud Platform (GCP).
- Saagie Technology Repository Updates
  New technology versions have been added.
Product Updates (2023.02)
Cluster and Pipeline Resource Monitoring
New resource monitoring elements have been added to monitor resource consumption of your cluster and pipelines.
At the cluster level, you can access the Operations module to see an overview of your cluster.
This page displays the number of projects, jobs, pipelines, and apps created on each platform, as well as resource capacity metrics for CPU and RAM for each node in the platform.
In the Overview and Instances pages of pipelines, you can access graphs displaying runtime and resource consumption metrics.
This added focus on resource monitoring in Saagie will allow data engineers and platform administrators to have a complementary view of clusters and pipelines to track performance and better optimize resource usage on their platforms.
Smart Conditions in Pipelines
You can now create new types of conditions to build more relevant pipelines:
- Conditions based on environment variables
- Conditions based on job status
These new conditions will allow you to implement advanced intelligence in your pipelines.
For more information, see About Conditions in Pipelines.
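To make the two condition types more concrete, here is a conceptual sketch of how a pipeline could choose its next branch. It is not Saagie's condition syntax or engine: the `EnvVarCondition` and `JobStatusCondition` classes, the `next_branch` function, and the status labels are hypothetical and only model the idea.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnvVarCondition:
    """True when an environment variable has the expected value."""
    name: str
    expected: str

    def evaluate(self, env: Mapping[str, str], statuses: Mapping[str, str]) -> bool:
        return env.get(self.name) == self.expected


@dataclass(frozen=True)
class JobStatusCondition:
    """True when a previous job ended with the expected status."""
    job: str
    expected_status: str

    def evaluate(self, env: Mapping[str, str], statuses: Mapping[str, str]) -> bool:
        return statuses.get(self.job) == self.expected_status


def next_branch(condition, env, statuses, if_true: str, if_false: str) -> str:
    """Pick the job to run next depending on the condition result."""
    return if_true if condition.evaluate(env, statuses) else if_false


env = {"MODE": "full"}                       # hypothetical variable
statuses = {"extract": "FAILED"}             # hypothetical status label

by_env = next_branch(EnvVarCondition("MODE", "full"), env, statuses,
                     if_true="train_full", if_false="train_sample")
by_status = next_branch(JobStatusCondition("extract", "SUCCEEDED"), env, statuses,
                        if_true="transform", if_false="notify_failure")
```

In this example the environment variable condition selects `train_full`, and the job status condition routes the pipeline to `notify_failure` because the `extract` job did not succeed.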
Saagie With Google Cloud Platform (GCP)
Saagie is now available on Google Cloud Platform (GCP).
Saagie Technology Repository Updates
The following technologies have been added to the official Saagie technology repository:
Technology | New contexts |
---|---|
Dataiku DDS | |
dbt | |
Google Cloud Data Transfer | |
Google Cloud Dataflow | |
Python | |

Do not forget to synchronize your Saagie repositories to keep them up to date.
January 2023
Here are the highlights of new and updated features for this release:
- Product Updates (2023.01)
  The 2023.01 version of the Saagie product has been released with the following features:
  - New elements to monitor resource consumption have been created.
  - A new add-on, called Saagie Usage Monitoring, can be deployed as an app inside projects.
  - Pipeline functionality has been enhanced to allow context propagation between jobs in a pipeline.
  - Saagie will now be installed with a ready-to-use example project, whose goal is to provide an intelligent learning pipeline able to detect sentiment in movie reviews.
  - Saagie now supports Kubernetes 1.23.x and 1.24.x.
  - The product version naming pattern has changed.
- Saagie Technology Repository Updates
  New technology versions and external job technologies have been added.
Product Updates (2023.01)
Resource Monitoring
New resource monitoring pages have been added throughout Saagie to monitor resource consumption, from a platform level down to a specific item.
At the platform level, you can access the Monitoring module to see an overview of your platform.
This page displays the number of projects, jobs, pipelines, and apps created on the selected platform, as well as resource capacity metrics for CPU and RAM for each node in the platform.
If node isolation has not been configured for your platform, the |
In the Overview page of jobs and apps, you can access new graphs displaying runtime and resource consumption metrics for the last running instance.
Besides the resource consumption limits that can already be defined for jobs and apps, Saagie’s focus on monitoring will help data engineers and platform administrators quickly identify bottlenecks, debug memory-starved jobs and apps, and better optimize resource usage on the platform.
Saagie Usage Monitoring
The new Saagie Usage Monitoring add-on can be installed on your platforms as an app, to monitor:
- The number of jobs and apps created, with their high-level metadata.
- Metrics on the execution time and status of jobs and pipelines.
- Metrics on the global usage of the storage volume associated with Saagie.
This app, based on Grafana, is available as an app technology in Saagie’s official technology repository and can be installed in any project.
This app requires some configuration to work.
Click the information icon |
As this app is designed to display cross-project metrics, Saagie recommends deploying it in a dedicated administration project.
For more information, see Saagie Usage Monitoring.
Context Propagation Between Jobs in Pipelines
In addition to existing environment variables that are set at the global or project levels, you can now create environment variables inside a pipeline and use them to transfer information between jobs during a pipeline execution.
These variables can be dynamically modified by jobs as the pipeline execution progresses, and a table displays the input and output values of the variables for each job.
This feature allows you to build smarter pipelines and paves the way for conditions based on pipeline environment variables.
For more information, see the Pipeline Overview Page.
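The propagation mechanism can be pictured with a small simulation. This sketch does not use any Saagie API: `run_pipeline`, the job functions, and the variable names are hypothetical, and each job is modeled as a function that reads the current pipeline variables and returns the ones it sets or changes.

```python
from typing import Callable, Dict, List, Tuple

Variables = Dict[str, str]
Job = Callable[[Variables], Variables]


def run_pipeline(jobs: List[Tuple[str, Job]],
                 initial: Variables) -> Tuple[Variables, List[dict]]:
    """Run jobs in order, passing pipeline variables from one job to the next.

    Returns the final variables and a per-job table of input/output values.
    """
    variables = dict(initial)
    table = []
    for name, job in jobs:
        inputs = dict(variables)             # snapshot seen by this job
        updates = job(dict(inputs))          # the job returns what it changes
        variables.update(updates)
        table.append({"job": name, "input": inputs, "output": dict(variables)})
    return variables, table


def extract(variables: Variables) -> Variables:
    # Hypothetical job: publishes a value for the jobs that follow.
    return {"ROW_COUNT": "42"}


def report(variables: Variables) -> Variables:
    # Hypothetical job: reads the value set by the previous job.
    return {"SUMMARY": f"{variables['ROW_COUNT']} rows"}


final, table = run_pipeline([("extract", extract), ("report", report)],
                            {"SOURCE": "reviews"})
```

The `table` list mirrors the idea of the per-job table mentioned above: for `report`, the input already contains `ROW_COUNT`, which was set by `extract` earlier in the same execution.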
Saagie Project Example
Saagie will now be installed with a ready-to-use example project, whose goal is to provide an intelligent learning pipeline able to detect sentiment in movie reviews. It is accessible from your platform’s project library.
For more information, see Starting With the Saagie Project Example.
Kubernetes 1.23.x and 1.24.x Support
This new version of the Saagie installer is now also compatible with Kubernetes versions 1.23.x and 1.24.x.
For more information on supported versions of Kubernetes, see System Requirements.
Product Version Naming Convention
To make it clearer which product version you are using, versions now follow a new naming convention made up of the year and the product version increment for that year.
For this version, it is 2023.01.
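The convention can be expressed in a couple of helper functions. These are hypothetical helpers, not part of any Saagie package, and the two-digit zero padding of the increment is an assumption inferred from the version numbers used in these notes (2023.01, 2023.02, 2023.03).

```python
from typing import Tuple


def format_version(year: int, increment: int) -> str:
    """Build a version label from the year and the increment for that year."""
    return f"{year}.{increment:02d}"


def parse_version(version: str) -> Tuple[int, int]:
    """Split a version label into (year, increment), treating it as text."""
    year, increment = version.split(".")
    return int(year), int(increment)


first_release = format_version(2023, 1)
year, increment = parse_version("2023.03")
```

With these helpers, `first_release` is `2023.01` and `2023.03` parses to the year 2023 and the increment 3.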
Saagie Technology Repository Updates
The following technologies have been added to the official Saagie technology repository:
Technology | New contexts |
---|---|
Bash | |
Java/Scala | |
Talend | |
GCP Cloud Functions | |
GCP Cloud Run | |

Technology | New contexts |
---|---|
Apache Superset | |
Grafana | |
Metabase | |
MLFlow Server | |
Saagie Usage Monitoring | For Saagie |

Do not forget to synchronize your Saagie repositories to keep them up to date.