Site Map

Introduction
What’s New?
  2023
  2022
  2021
  2020
Getting Started Guide
  Getting Familiar with the Interface
  Managing Your User Profile
  Starting With the Saagie Project Example
Data Team Guide
  Projects Module
    About the Projects Module
    Projects
      Managing Your Projects
      Project Settings
    Jobs
      About Jobs
      Managing Your Jobs
      Job Settings
    Pipelines
      About Pipelines
      About Conditions in Pipelines
      Managing Your Pipelines
      Pipeline Settings
    Apps
      About Apps
      Managing Your Apps
      App Settings
    Storage
      About Storage
    Managing Environment Variables
    Managing Docker Credentials
  Monitoring Module
    About the Monitoring Module
  Operations Module
    About the Operations Module
  Catalog Module
    About the Technology Catalog
    Repositories
      About Repositories
      Managing Your Repositories
      Repository Settings
    Technologies of the Official Saagie Repository
  Security Module
    About the Security Module
    Managing Your User Accounts
    Managing Your Groups
  Add-On
    Saagie Usage Monitoring
      About Saagie Usage Monitoring
      Saagie Usage Monitoring Default Dashboards
      Customizing Your App
Developer Guide
  Software Development Kit (SDK)
    Creating and Managing Technologies
      Creating the Metadata Files and the Zip Archive
      Creating a Repository
    References
      About Docker Images Within Saagie
  Application Programming Interface (API)
    Saagie GraphQL API
    Saagie Python API
Administrator Guide
  About Saagie Architecture
    How Does Saagie Work?
    Interactions Between Saagie and External Components
    Prometheus
    Conf’o’rama
  Installing and Setting Up Saagie
    System Requirements
    Creating and Configuring Your Kubernetes Cluster
      Using Amazon Elastic Kubernetes Service (EKS)
      Using Microsoft Azure Kubernetes Service (AKS)
      Using Google Kubernetes Engine (GKE)
      Using Another Service Platform
    Downloading and Configuring Saagie
    Running the Saagie Installer
    Creating a Domain Name System (DNS) Entry
    Deploying and Updating Your SSL Certificate
    Enabling LDAP Authentication
  Upgrading Saagie
  Maintaining Your Platform
    Starting and Stopping Saagie
    Adding a Node in Your Saagie Cluster
    Changing the Data Lake URL in Saagie
    Monitoring Technologies
  Monitoring Logs
    Audit Logs
    Operational Logs
How-To Guides
  Apache Sqoop
    Import Data From a MySQL, Oracle, PostgreSQL, or SQL Server Database
    Import Data From Other Relational Database Management System (RDBMS)
  Apache Spark
    Configure Spark Resources
    Read and Write Files or Tables With PySpark
      Read and Write Files From Amazon S3 Bucket With PySpark
      Read and Write Files From HDFS With PySpark
      Read and Write Tables From Hive With PySpark
    Read and Write Files or Tables With Spark Scala
      Read and Write Files From HDFS With Spark Scala
      Read and Write Tables From Hive With Spark Scala
    Package Your Spark Scala Code With the Assembly Plugin
    Integrate Spark Streaming With Kafka
  Airbyte
    Airbyte Prerequisites
    Use Airbyte As a Saagie App
    Use Airbyte As an API
  HDFS
    Manually Upload Large Files to HDFS
    Read and Write Files From HDFS, WebHDFS, and HTTPFS With HDFS
  Java/Scala
    Read and Write Files or Tables With Java/Scala
      Read and Write Files From HDFS With Java/Scala
      Read and Write Tables From Hive With Java/Scala
      Read and Write Tables From Impala With Java/Scala
      Read and Write Files From MongoDB With Java/Scala
    Use Java/Scala on Impala With High Availability
  Kerberos
    Adapt Your Job to Connect to Kerberos
  Notebooks
    Create RStudio User Accounts
    Configure Git on RStudio
    Use Generative AI in Jupyter Notebook
  Python
    Read and Write Files or Tables With Python
      Read and Write Files From Amazon S3 Bucket With Python
      Read and Write Files From Elasticsearch With Python
      Read and Write Files From HDFS With Python
      Read and Write Tables From Hive With Python
      Read and Write Tables From Impala With Python
      Read and Write Files From MongoDB With Python
      Read and Write Tables From MySQL With Python
      Read and Write Tables From PostgreSQL With Python
    Use Python on HDFS or Impala With High Availability
  R
    Use R With a Proxy
    Update R Packages
    Use External R Packages
    Create Hive Dynamic Tables
    Read and Write Files or Tables With R
      Read and Write Files From Amazon S3 Bucket With R
      Read and Write Files From Elasticsearch With R
      Read and Write Files From HDFS With R
      Read and Write Tables From Hive With R
      Read and Write Tables From Impala With R
      Read and Write Files From MongoDB With R
    Use R on HDFS With High Availability
    Use R on Impala With High Availability
  Talend
    Read and Write Files or Tables With Talend
      Read and Write Files From Amazon S3 Bucket With Talend
      Read, Write, and List Files From HDFS With Talend
      Read and Write Tables From Hive With Talend
      Read and Write Tables From Impala With Talend
      Read Files From MongoDB With Talend
    Copy a File From HDFS to the Local Computer
    Produce and Consume a Message in the Kafka Cluster
    Use Talend on HDFS With High Availability
Frequently Asked Questions (FAQ)
  What are the basic Kubernetes commands I need to know to use Saagie?
  How do I connect to a container?
  How can I see the component logs?
  How can I increase the storage size of a Persistent Volume Claim (PVC)?
  How do I restore a component?
  How can I contact Saagie support?
  How do I request a feature?
Glossary
Site Map