SSIS integration to run a simple clustering task

yosafat saragih 21 Reputation points
2021-02-19T03:49:02.06+00:00

Dear SSIS fellows,

I am currently working on an SSIS pipeline for automated feature engineering that also includes a simple AI step to generate the class label. In the pipeline, we need to cluster the data on a few simple features (8 columns pulled directly from the database), use the resulting cluster as the class, and join it with the output of our SSIS data cleaning process. The source and destination databases are PostgreSQL and are not in an Azure environment. Since I mainly work in Python for the AI part, I'm very new to the C# environment of SSIS. Is it possible to use a Cython module to do the clustering, or is there another option for running a simple clustering process in SSIS?

SQL Server Integration Services
A Microsoft platform for building enterprise-level data integration and data transformations solutions.

Accepted answer
  1. Yitzhak Khabinsky 25,106 Reputation points
    2021-02-22T03:09:49.497+00:00

    @yosafat saragih ,

    You can safely ignore Mona's answer. It pertains to SQL Server and SSIS multi-server failover clustering for high availability, not to data clustering.

    You can definitely use an SSIS Script Task with any C# libraries you need.
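For orientation, the clustering step itself is small. Below is a minimal k-means sketch in plain Python (the asker's main language), assuming each row is a list of 8 numeric features already pulled from the database. The names `kmeans`, `nearest`, and `squared_distance`, the default `max_iter`, and the fixed `seed` are illustrative assumptions, not part of SSIS or of any library mentioned in this thread. The same logic could be ported to C# inside a Script Task, or replaced by a library implementation.

```python
import random
from typing import List, Sequence, Tuple

Row = Sequence[float]


def squared_distance(a: Row, b: Row) -> float:
    """Squared Euclidean distance between two feature rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(row: Row, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to the given row."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(row, centroids[i]))


def kmeans(rows: List[Row], k: int, max_iter: int = 100,
           seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd's k-means; returns (label per row, centroids)."""
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")

    # Assumption: start from k randomly chosen rows; the fixed seed
    # makes repeated runs reproducible.
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [-1] * len(rows)

    for _ in range(max_iter):
        # Assignment step: attach every row to its closest centroid.
        new_labels = [nearest(r, centroids) for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]

    return labels, centroids
```

Calling `labels, centroids = kmeans(rows, k=3)` yields one integer label per input row, which could then be joined back to the cleaned data as the class column. Scaling the 8 features to comparable ranges beforehand is usually worthwhile, since k-means is distance-based.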


2 additional answers

  1. Monalv-MSFT 5,896 Reputation points
    2021-02-19T07:46:45.243+00:00

    Hi @yosafat saragih ,

    Clustering Integration Services is not recommended because the Integration Services service is not a clustered or cluster-aware service, and does not support failover from one cluster node to another. Therefore, in a clustered environment, Integration Services should be installed and started as a stand-alone service on each node in the cluster.

    Please refer to "Integration Services (SSIS) in a Cluster" and "Clustering SSIS".

    Best regards,
    Mona



  2. yosafat saragih 21 Reputation points
    2021-02-22T02:27:02.037+00:00

    What if I put the process in a Script Task? I have read that C# has several ML libraries, such as Accord.NET and ML.NET. The data clustering is a single task that does not need an additional server (in this case, I run SSIS in my local environment, which is connected to an external server).
