Saagie Usage Monitoring Default Dashboards

Saagie Usage Monitoring (SUM) comes with default dashboards. A dashboard in Grafana is a set of one or more panels arranged into one or more rows. Dashboards and panels let you display your data in visual form.

When installing SUM, you must set several environment variables, including the MONITORING_OPT environment variable. This variable determines the default configuration of your dashboards and accepts one of the following values:

  • SAAGIE: Choose this value if you want to monitor only Saagie jobs, apps, and pipelines. This is the default value.

  • SAAGIE_AND_DATALAKE: Choose this value if you want to monitor Saagie jobs, apps, and pipelines, as well as your HDFS data lake.

  • SAAGIE_AND_S3: Choose this value if you want to monitor Saagie jobs, apps, and pipelines, as well as your S3 buckets.

If you choose the SAAGIE option, you will have the following three dashboards:

If you choose the SAAGIE_AND_DATALAKE option, you will have the dashboards included with the SAAGIE option, and additional dashboards with information about your HDFS storage space.

If you choose the SAAGIE_AND_S3 option, you will have the dashboards included with the SAAGIE option, and an additional dashboard with information about your S3 storage space.
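The relationship between the three values and the dashboards they enable can be sketched in a few lines of Python. This is only an illustration: the function name dashboard_groups and the group labels are hypothetical and are not part of SUM, and falling back to SAAGIE when the variable is unset is an assumption based on SAAGIE being described as the default value.

```python
import os

# Hypothetical labels for the dashboard groups described above; not SUM identifiers.
GROUPS_BY_OPTION = {
    "SAAGIE": ("saagie",),
    "SAAGIE_AND_DATALAKE": ("saagie", "datalake"),
    "SAAGIE_AND_S3": ("saagie", "s3"),
}


def dashboard_groups(env=None):
    """Return the dashboard groups enabled by MONITORING_OPT."""
    env = os.environ if env is None else env
    # Assumption: an unset variable behaves like the default value SAAGIE.
    option = env.get("MONITORING_OPT", "SAAGIE")
    if option not in GROUPS_BY_OPTION:
        raise ValueError(f"Unsupported MONITORING_OPT value: {option!r}")
    return GROUPS_BY_OPTION[option]
```

Passing a plain dictionary instead of the real environment, for example dashboard_groups({"MONITORING_OPT": "SAAGIE_AND_S3"}), makes it easy to check a value before installing.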

Dashboards are accessible from the Dashboards tab. Click the burger menu, then Dashboards.

[Screenshot: SUM dashboards]

Saagie – Job Count

This page gives you information about all the jobs and apps on your Saagie platforms.

[Screenshot: Saagie – Job Count dashboard]

You can see information on the total number of jobs and apps on your Saagie platforms, their distribution and evolution in number per project, and details for each app and job.

Click a legend item in a graph to display only the corresponding information. To view several items at once, hold Ctrl and click the desired legend items. To reset the graph, click a legend item twice.

Saagie – Next Scheduling

This page gives you information on the capacity planning of your jobs and pipelines.

[Screenshot: Saagie – Next Scheduling dashboard]

You can view information about the jobs and pipelines that have been completed, as well as the schedule for the upcoming jobs and pipelines.

Runtimes for upcoming jobs and pipelines are calculated on the basis of the average duration for previous jobs and pipelines, up to one month in advance. Note that the planning of upcoming jobs and pipelines is only displayed for pipelines and jobs already scheduled in Saagie. Schedules for items planned with an external scheduler are not displayed here.
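The idea of estimating an upcoming runtime from the average of previous durations can be sketched as follows. This is a minimal illustration, not SUM's actual calculation: the function names are hypothetical, durations are assumed to be in seconds, and one month is approximated as 30 days.

```python
from datetime import datetime, timedelta
from statistics import mean


def estimate_end(scheduled_start, past_durations_s):
    """Estimate when a scheduled run ends from the mean of past durations (seconds)."""
    if not past_durations_s:
        return None  # no history, so no estimate
    return scheduled_start + timedelta(seconds=mean(past_durations_s))


def within_horizon(scheduled_start, now, days=30):
    """Tell whether a run is planned at most `days` ahead of `now`."""
    return now <= scheduled_start <= now + timedelta(days=days)
```

For a job scheduled at 08:00 whose previous runs took 60, 120, and 180 seconds, estimate_end returns 08:02 on the same day.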

Use the filters to display one or more specific projects, only jobs or pipelines, and only scheduled jobs and pipelines.

Hover over an element in the graphs to display a tooltip with more details. Click the element to keep the tooltip open. Click the element again or anywhere else in the graph to close it.

Saagie – Pipelines & Jobs Evolution

This page gives you information about the execution of your jobs and pipelines.

[Screenshot: Saagie – Pipelines & Jobs Evolution dashboard]

You can see how job statuses evolve over time, the execution time of jobs and pipelines, and the average execution time and success rate of each job.

  • Hover over an element in the graphs to display a tooltip with more details.

  • Click a legend item in a graph to display only the corresponding information. To view several items at once, hold Ctrl and click the desired legend items. To reset the graph, click a legend item twice.
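To make the two per-job metrics concrete, here is a small Python sketch that computes an average execution time and a success rate from a list of run records. The record layout, the function name job_stats, and the status label "SUCCEEDED" are assumptions made for the example; they are not taken from SUM.

```python
from collections import defaultdict


def job_stats(runs):
    """Compute average duration and success rate per job.

    `runs` is an iterable of (job_name, status, duration_seconds) tuples.
    The status string "SUCCEEDED" is an assumed label for a successful run.
    """
    grouped = defaultdict(list)
    for job, status, duration in runs:
        grouped[job].append((status, duration))

    stats = {}
    for job, entries in grouped.items():
        durations = [d for _, d in entries]
        successes = sum(1 for s, _ in entries if s == "SUCCEEDED")
        stats[job] = {
            "avg_duration_s": sum(durations) / len(durations),
            "success_rate": successes / len(entries),
        }
    return stats
```

A job with one successful 10-second run and one failed 30-second run gets an average duration of 20 seconds and a success rate of 0.5.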

Saagie – Jobs & Pipelines Execution

This page gives you information on failed and killed jobs and pipelines.

[Screenshot: Saagie – Jobs & Pipelines Execution dashboard]

This dashboard consists of two sections, each with two graphs. There is a section for jobs and another for pipelines. Each section includes a graph showing a timeline of failed and killed jobs or pipelines, and another graph showing the overall timeline of jobs or pipelines.

Click or hover over an element in the graphs to display a tooltip with more details about a job or pipeline. You can see the job or pipeline name, date, failed or killed instance, status, and runtime.

In the Jobs Failed/Killed Timeline and Pipelines Failed/Killed Timeline graphs, clicking a job or pipeline instance displays a detailed tooltip. In this tooltip, click open external Job or open external Pipeline (1) to open the instance of the job or pipeline in Saagie.

Red dots are for failed jobs or pipelines and yellow dots are for killed jobs or pipelines.

S3 – Global Usage

This page gives you information about the storage space used on your S3 storage.

[Screenshot: S3 – Global Usage dashboard]

You can see information about the total space used by all your S3 buckets, their total number of objects and the size of each bucket. The size of your S3 buckets is presented through a table and a graph.
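The totals shown on this dashboard amount to simple aggregations over per-bucket figures. The sketch below illustrates them in Python on plain data. The input layout and the function name summarize_buckets are hypothetical, and sorting buckets by size is a presentation choice of this example rather than documented SUM behavior.

```python
def summarize_buckets(buckets):
    """Summarize S3 usage from {bucket_name: (size_bytes, object_count)}."""
    total_size = sum(size for size, _ in buckets.values())
    total_objects = sum(count for _, count in buckets.values())
    # Largest buckets first; the ordering is a choice made for this sketch.
    by_size = sorted(buckets.items(), key=lambda item: item[1][0], reverse=True)
    return {
        "total_size_bytes": total_size,
        "total_objects": total_objects,
        "bucket_sizes": [(name, size) for name, (size, _) in by_size],
    }
```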

Data Lake – Disk Space Explorer

This page gives you information about the storage space used on your HDFS, in particular by the first-level directories.

[Screenshot: Data Lake – Disk Space Explorer dashboard]

You can see the size of each first-level directory in the Folder size dashboard, and the number of files per first-level directory in the Number of files dashboard.

Hover over an element in the graphs to display a tooltip with more details.
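The notion of per-directory size and file count at the first level can be illustrated with a short Python sketch that groups file paths by their top-level directory. The input format and the function name first_level_usage are assumptions for the example; how SUM collects these figures from HDFS is not described here.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def first_level_usage(files):
    """Aggregate size and file count per first-level directory.

    `files` is an iterable of (absolute_path, size_bytes) pairs.
    """
    usage = defaultdict(lambda: {"size_bytes": 0, "file_count": 0})
    for path, size in files:
        parts = PurePosixPath(path).parts  # e.g. ('/', 'data', 'raw', 'a.csv')
        if len(parts) < 3:
            continue  # file sits directly under the root, not in a directory
        top = "/" + parts[1]
        usage[top]["size_bytes"] += size
        usage[top]["file_count"] += 1
    return dict(usage)
```

Files nested at any depth are attributed to their first-level directory, so /data/raw/a.csv and /data/b.csv both count toward /data.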

Data Lake – Global Usage

This page gives you information about the overall storage space you have on your HDFS.

[Screenshot: Data Lake – Global Usage dashboard]

This page allows you to see the history of the size of the directories, and the number of files. You can also see information about the average file size, and the storage space used compared to the overall capacity.
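As a rough illustration of the last two figures, the Python sketch below derives an average file size and a used-to-capacity ratio from three numbers. The function name global_usage is hypothetical, and the sketch assumes that the used space equals the sum of file sizes, which ignores HDFS replication.

```python
def global_usage(used_bytes, capacity_bytes, file_count):
    """Return average file size and the share of capacity in use."""
    # Guard against division by zero for an empty or unconfigured data lake.
    avg_file_size = used_bytes / file_count if file_count else 0.0
    used_ratio = used_bytes / capacity_bytes if capacity_bytes else 0.0
    return {"avg_file_size_bytes": avg_file_size, "used_ratio": used_ratio}
```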
