Lookup activity output into a CSV file in ADLS

Bhajanthri Bala Sujay 0 Reputation points
2024-04-26T16:14:09.8366667+00:00

I'm using a Lookup activity to read all table names from an Oracle database (browsed via Oracle SQL Developer) for a particular schema.

The query is `SELECT TABLE_NAME FROM all_tables WHERE owner = 'Schema_Name'`

Here the schema name is dynamic, taking its value from a variable.

Now, my question is: how can I store all the table names in a CSV file in ADLS for each particular schema?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

Sort by: Most helpful
  1. Abhishek Tripathi 0 Reputation points
    2024-04-27T03:58:04.4933333+00:00

    GPT Answer-

    Certainly! To achieve this, you can follow these steps using Azure Data Factory:

    Lookup Activity to Retrieve Table Names:

    • Create a Lookup activity in your pipeline that executes the query `SELECT TABLE_NAME FROM all_tables WHERE owner = 'Schema_Name'`.
    • The output of this Lookup activity will be a list of table names.

    Set Variable Activity:

    • After the Lookup activity, use a Set Variable activity to store the output of the lookup (i.e., the list of table names) in a pipeline variable.
    • Set the variable value to `@{activity('Lookup1').output.value}` (replace `'Lookup1'` with the actual name of your Lookup activity).

    Create a CSV File:

    • In your Azure Data Lake Storage (ADLS) Gen2, create a dummy CSV file (e.g., `table_names.csv`) with a single column (e.g., `TableName`).

    Copy Data Activity:

    • Use a Copy Data activity to write the table names from the pipeline variable into the CSV file.
    • Configure the source dataset to use the pipeline variable (the list of table names).
    • Set the sink dataset to point to the ADLS Gen2 location where you want to store the CSV file (`table_names.csv`).

    Mapping and Schema:

    • In the Copy Data activity, map the source column (table names) to the sink column (`TableName` in the CSV file).
    • Ensure that the schema of the CSV file matches the data type of the table names.

    Execute the Pipeline:

    • Trigger your pipeline to execute. The Copy Data activity will write the table names to the specified CSV file in ADLS Gen2.
      

    Remember to replace placeholders like 'Schema_Name' and 'Lookup1', and adjust the dataset configurations according to your specific setup. This approach allows you to dynamically retrieve table names and store them in a CSV file for each schema in ADLS Gen2.
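    The Lookup activity returns a fixed JSON shape: a `count` plus a `value` array of row objects, which is what `@{activity('Lookup1').output.value}` refers to. The sketch below is a minimal Python illustration of what the Copy step effectively produces from that output; the two table names are hypothetical sample data, not something the source specifies.

    ```python
    import csv
    import io

    # Hypothetical output of the Lookup activity, i.e. @{activity('Lookup1').output}.
    # Each element of "value" is one row returned by the query.
    lookup_output = {
        "count": 2,
        "value": [
            {"TABLE_NAME": "EMPLOYEES"},
            {"TABLE_NAME": "DEPARTMENTS"},
        ],
    }

    # Flatten the rows into the single-column CSV the Copy activity writes.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["TableName"])  # header column in table_names.csv
    for row in lookup_output["value"]:
        writer.writerow([row["TABLE_NAME"]])

    print(buffer.getvalue())
    ```

    Note that by default a Lookup activity returns only the first row; the "First row only" option must be unchecked to get the full `value` array.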


  2. phemanth 5,840 Reputation points Microsoft Vendor
    2024-04-29T06:55:20.51+00:00

    @Bhajanthri Bala Sujay Thanks for reaching out to Microsoft Q&A

    Here's how you can achieve your goal of storing table names in a CSV file per schema in ADLS using Azure Data Factory (ADF) with a Lookup activity:

    1. ADF Pipeline Setup:

    • Create a new ADF pipeline.
    • Add a Lookup activity to the pipeline.

    2. Lookup Activity Configuration:

    • Source Dataset: Configure a dataset pointing to your Oracle database.
    • Connection: Set up a connection linked service for your Oracle database.
    • Schema: Leave the schema blank as you'll dynamically define it in the query.
    • Table: Set the table name to all_tables.

    3. Lookup Activity Query:

    This is where the magic happens:

    SELECT TABLE_NAME
    FROM all_tables
    WHERE owner = '@{variables('SchemaName')}'
    
    • The schema name comes from the pipeline variable SchemaName, which holds the dynamic schema name. ADF variables are accessed using the @{variables('VariableName')} syntax.
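    To make the substitution concrete, here is a small Python sketch of what the query resolves to once ADF replaces `@{variables('SchemaName')}`; the schema name `SALES` is a hypothetical example value:

    ```python
    # Hypothetical value of the SchemaName pipeline variable.
    schema_name = "SALES"

    # What the Lookup query looks like after ADF substitutes @{variables('SchemaName')}.
    query = f"SELECT TABLE_NAME FROM all_tables WHERE owner = '{schema_name}'"
    print(query)
    ```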

    4. Sink Dataset:

    • Configure a sink dataset pointing to your ADLS storage location.
    • Format: Set the format to delimited text (CSV).
    • File name: Here's how to dynamically create a unique filename per schema:
    @concat('schema_', toLower(variables('SchemaName')), '.csv')
    

    This expression constructs a filename with the prefix "schema_", followed by the schema name converted to lowercase with toLower (optional, for consistency), and ending with the ".csv" extension. Note that inside a single @-expression you call variables('SchemaName') directly rather than nesting the @{} interpolation syntax.
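    A Python equivalent of that filename expression, assuming a hypothetical schema name of `HR`:

    ```python
    # Hypothetical value of the SchemaName pipeline variable.
    schema_name = "HR"

    # Mirrors @concat('schema_', toLower(variables('SchemaName')), '.csv')
    file_name = "schema_" + schema_name.lower() + ".csv"
    print(file_name)  # → schema_hr.csv
    ```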

    5. Pipeline Run:

    • Set the SchemaName variable value before running the pipeline. You can do this through ADF parameters or trigger configuration.
    • Run the pipeline. The Lookup activity will retrieve the table names for the specified schema, and a Copy activity using the sink dataset above will write them to a separate CSV file in ADLS named after the schema.
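    Taken together, running this once per schema behaves like the loop below (in ADF you would typically wrap the Lookup and Copy in a ForEach over a schema list). This is a hedged Python sketch: `fetch_table_names` stands in for the Lookup query and `write_csv` for the sink write; both are hypothetical stand-ins, not ADF APIs, and the schema/table data is invented for illustration.

    ```python
    def fetch_table_names(schema: str) -> list[str]:
        # Stand-in for: SELECT TABLE_NAME FROM all_tables WHERE owner = '<schema>'
        sample = {"HR": ["EMPLOYEES", "JOBS"], "SALES": ["ORDERS"]}
        return sample.get(schema, [])

    def write_csv(schema: str, tables: list[str]) -> tuple[str, str]:
        # Stand-in for the ADLS sink write; returns the file name and its content.
        file_name = "schema_" + schema.lower() + ".csv"
        lines = ["TableName"] + tables
        return file_name, "\n".join(lines) + "\n"

    # One CSV per schema, as a ForEach over the schema list would produce.
    results = {}
    for schema in ["HR", "SALES"]:
        name, content = write_csv(schema, fetch_table_names(schema))
        results[name] = content
    ```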

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.