Read Files From MongoDB With Talend

How to read files from MongoDB with Talend.

Before you begin:

To follow this procedure, you need a Repository that contains the Saagie platform information. To set this up, create a context group and define the required contexts and variables.

In this case, define the following context variables:

Name                     Type     Value
IP_MongoDB               String   Your MongoDB IP address
Port_MongoDB             String   Your MongoDB port
BDD_MongoDB              String   Your MongoDB database
New_Collection_MongoDB   String   Your MongoDB collection
User_MongoDB             String   The MongoDB user authentication name
Password_MongoDB         String   The MongoDB user account password

Once your context group is created and stored in the Repository, you can apply the Repository context variables to your jobs.
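
Outside the Studio, it can help to think of the context group as a plain set of key/value pairs. The following is a minimal Java sketch, not code generated by Talend: the class name ContextCheck, the placeholder values, and the idea of holding the context as properties-style text are assumptions made for illustration. It loads the six variables listed above and reports any that are missing or empty.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class ContextCheck {
    // The six context variable names defined in this procedure.
    static final String[] REQUIRED = {
        "IP_MongoDB", "Port_MongoDB", "BDD_MongoDB",
        "New_Collection_MongoDB", "User_MongoDB", "Password_MongoDB"
    };

    // Parses properties-style text (one key=value pair per line).
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the required variable names that are absent or blank.
    static List<String> missing(Properties props) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = props.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder values only (hypothetical); replace them with your own settings.
        String sample = String.join("\n",
            "IP_MongoDB=192.0.2.10",
            "Port_MongoDB=27017",
            "BDD_MongoDB=mydb",
            "New_Collection_MongoDB=mycollection",
            "User_MongoDB=myuser",
            "Password_MongoDB=");
        System.out.println("Missing or empty: " + missing(load(sample)));
    }
}
```

Running a check like this before the job starts makes a forgotten variable visible early, rather than as a connection failure at run time.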

For more information, see the Using contexts and variables section of the Talend Studio User Guide.

This tutorial walks through an example job that counts the lines read from a MongoDB collection.
  1. Create a new job in Talend.

  2. Add the following components:

    • tMongoDBConnection, to create a connection to a MongoDB database and reuse that connection in other components.

    • tMongoDBInput, to extract data from the MongoDB database collection and send the data to the following component.

    • tAggregateRow, to receive a flow and aggregate it based on one or more columns.

    • tLogRow, to display the result.

  3. Link these components as follows:

    [Figure: mongodb read file with talend]

  4. Double-click each component and configure its settings as follows:

    • tMongoDBConnection

    • tMongoDBInput

    • tAggregateRow

    • tLogRow

    For tMongoDBConnection, in the Basic settings tab:

    1. From the DB Version list, select your database version.

    2. In the Server and Port fields, enter the IP address and listening port of the database server.

      These fields are available only when the Use replica set address option is not selected.
    3. In the Database field, enter the name of the database.

    4. Select the Required authentication option to enable the database authentication.

    5. In the Authentication mechanism list, select the authentication mechanism best suited to the MongoDB version you are using.

      The NEGOTIATE mechanism is recommended if you are not using Kerberos.
    6. In the Username and Password fields, enter the database user authentication data.

      These fields are available only when the Required authentication option is selected.
    For more information, you can refer to Talend’s documentation on the tMongoDBConnection component.
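
The connection settings above (server, port, database, username, password) correspond to the parts of a standard MongoDB connection string. The following Java sketch assembles such a string from those values. It is an illustration only: the class MongoUriSketch and its methods are hypothetical, this guide does not state how Talend builds its connection internally, and the authentication mechanism is left out. It assumes Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class MongoUriSketch {
    // Percent-encodes a value for the user-info part of the URI.
    static String encode(String value) {
        // URLEncoder targets HTML forms and writes spaces as '+', so convert them to %20.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Builds mongodb://[user:password@]host:port/database.
    static String buildUri(String host, String port, String database,
                           String user, String password) {
        StringBuilder uri = new StringBuilder("mongodb://");
        // Credentials are only added when authentication is required.
        if (user != null && !user.isEmpty()) {
            uri.append(encode(user)).append(':').append(encode(password)).append('@');
        }
        uri.append(host).append(':').append(port).append('/').append(database);
        return uri.toString();
    }

    public static void main(String[] args) {
        // Placeholder values (hypothetical) standing in for the context variables.
        System.out.println(buildUri("192.0.2.10", "27017", "mydb", "myuser", "p@ss word"));
    }
}
```

Encoding the username and password matters because characters such as @ or : would otherwise be read as URI delimiters.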

    For tMongoDBInput, in the Basic settings tab:

    1. Select the Use an existing connection option.

    2. From the Component List list, select the connection component tMongoDBConnection to reuse the connection details already defined.

    3. In the Database field, enter the name of the database.

    4. Click Edit schema to make changes to the schema, which defines the number of columns to be processed and passed on to the next component.

    5. In the Query field, you can enter your query.

    For more information, you can refer to Talend’s documentation on the tMongoDBInput component.
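
As an illustration of what a query can look like, the Java sketch below builds two MongoDB find filters as strings. It assumes that the Query field accepts a Java string expression containing a MongoDB filter document; the class QuerySketch, its helper methods, and the field name status are hypothetical and do not come from this guide.

```java
class QuerySketch {
    // An empty filter document matches every document in the collection.
    static final String ALL_DOCUMENTS = "{}";

    // Builds an equality filter such as {"status": "active"}.
    static String equalsFilter(String field, String value) {
        return "{\"" + escape(field) + "\": \"" + escape(value) + "\"}";
    }

    // Escapes backslashes and double quotes so the text stays inside its JSON string.
    // Control characters are not handled in this sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        System.out.println(ALL_DOCUMENTS);
        System.out.println(equalsFilter("status", "active"));
    }
}
```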

    For tAggregateRow, in the Basic settings tab:

    1. Click Edit schema to make changes to the schema, which defines the number of columns to be processed and passed on to the next component. Add the sum column as output.

    2. In the Operations field, select the type of operation along with the value to use for the calculation, and the output field.

    For more information, you can refer to Talend’s documentation on the tAggregateRow component.
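
To make the aggregation step more concrete, here is a minimal Java sketch of a group-by with a count, which is similar in spirit to an aggregation operation on an incoming flow. It is not the code Talend generates: the class AggregateSketch, the choice of a count operation, and the sample city values are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggregateSketch {
    // Groups the incoming values by key and counts the rows in each group.
    // A TreeMap keeps the groups sorted by key for stable output.
    static Map<String, Long> countByKey(List<String> keys) {
        return keys.stream()
                .collect(Collectors.groupingBy(k -> k, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for one column read from MongoDB.
        List<String> cities = Arrays.asList("Paris", "Rouen", "Paris", "Lyon", "Paris");
        System.out.println(countByKey(cities));
        System.out.println("Total rows: " + cities.size());
    }
}
```

With the sample rows above, the grouped result holds one entry per distinct value, and the counts add up to the total number of input rows.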
    For more information, you can refer to Talend’s documentation on the tLogRow component.
  5. Run the job.