About the Monitoring Module
By default, the Platform Overview page opens when you click the Monitoring module.
The page gives you an overview of your platform’s node consumption and reservation.
It is composed of a two-column table describing the consumption of CPU and RAM resources.
This is the entry point for monitoring the overall capacity of your platform and the health status of the associated nodes. Based on this information, you can adjust the limits and requests accordingly.
To access monitoring details properly, your administrator must have configured Saagie to isolate the workload between platforms. This is done by specializing your nodes per platform. If workload isolation has not been configured, the Monitoring module will not be fully operational, because all your cluster nodes will be displayed on your platform without distinction between platforms. For more information, see Node Isolation.
Here is an example of Saagie installed with two platforms, in two installations: one with workload isolation and one without.
With workload isolation: if you have four nodes in your cluster, the Platform Overview page shows only the nodes that are dedicated to the selected platform. A node that is not labeled is not considered for Saagie runs and does not appear.
Demo1 has one specialized node.
Demo2 has two specialized nodes.
Without workload isolation: if you have four nodes in your cluster, the Platform Overview page shows all the cluster nodes on each platform without distinction between platforms, even if your platform uses no resources on those nodes.
Demo1 displays all the cluster nodes without distinction between platforms.
Demo2 displays all the cluster nodes without distinction between platforms.
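The difference between the two installations can be illustrated with a small Python sketch. It is only a model of the behavior described above, not Saagie's implementation: the `Node` class, its `platform` field, the node names, and the `visible_nodes` function are all hypothetical, and the fourth node is assumed to be unlabeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    # Hypothetical stand-in for the platform a node is dedicated to.
    # None means the node is not labeled.
    platform: Optional[str] = None


def visible_nodes(nodes: List[Node], platform: str, isolation_enabled: bool) -> List[str]:
    """Return the names of the nodes a platform's overview page would list."""
    if not isolation_enabled:
        # Without isolation, every cluster node is shown on every platform.
        return [n.name for n in nodes]
    # With isolation, only nodes dedicated to the platform are shown;
    # unlabeled nodes never match and therefore never appear.
    return [n.name for n in nodes if n.platform == platform]


# Assumed four-node cluster matching the example: one node for Demo1,
# two nodes for Demo2, and one unlabeled node.
cluster = [
    Node("node-1", "Demo1"),
    Node("node-2", "Demo2"),
    Node("node-3", "Demo2"),
    Node("node-4"),
]

print(visible_nodes(cluster, "Demo1", isolation_enabled=True))   # ['node-1']
print(visible_nodes(cluster, "Demo2", isolation_enabled=True))   # ['node-2', 'node-3']
print(visible_nodes(cluster, "Demo1", isolation_enabled=False))  # all four nodes
```

In this model, the unlabeled node is listed only when isolation is disabled, which mirrors why the Monitoring module is not fully operational without node specialization.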
App, Job, and Pipeline Resource Consumption Graphs
The Overview and Instances pages for apps, jobs, and pipelines also include graphs that allow you to track consumption as the job, app, or pipeline runs on the node.
These graphs can help you quickly identify bottlenecks, debug jobs and apps that run out of memory, and optimize resource usage on your platform.
You can click the graph line to display tooltips with more information, and you can zoom in on a specific period.