Set up a connector to archive webpage data

Note

Microsoft 365 compliance is now called Microsoft Purview and the solutions within the compliance area have been rebranded. For more information about Microsoft Purview, see the blog announcement.

Use a Veritas connector in the Microsoft Purview compliance portal to import and archive data from webpages to user mailboxes in your Microsoft 365 organization. Veritas provides a Webpage Capture connector that captures specific webpages (and any links on those pages) in a specific website or an entire domain. The connector converts the webpage content to a PDF, PNG, or custom file format, attaches the converted files to an email message, and then imports those email items to user mailboxes in Microsoft 365.

After webpage content is stored in user mailboxes, you can apply Microsoft Purview features such as Litigation Hold, eDiscovery, and retention policies and retention labels. Using a Webpage Capture connector to import and archive data in Microsoft 365 can help your organization stay compliant with government and regulatory policies.

Overview of archiving webpage data

The following overview explains the process of using a connector to archive webpage content in Microsoft 365.

Archiving workflow for webpage data.

  1. Your organization works with the webpage source to set up and configure a Webpage Capture site.

  2. Once every 24 hours, items from the webpage source are copied to the Veritas Merge1 site. The connector also converts the content of each webpage and attaches it to an email message.

  3. The Webpage Capture connector that you create in the compliance portal connects to the Veritas Merge1 site every day and transfers the webpage items to a secure Azure Storage location in the Microsoft cloud.

  4. The connector imports the converted webpage items to the mailboxes of specific users by using the value of the Email property, as described for automatic user mapping in Step 3. Every webpage item contains this property, which is populated with the email addresses provided when you configure the Webpage Capture connector in Step 2. A subfolder named Webpage Capture is created in the Inbox folder of each user mailbox, and the webpage items are imported to that folder.
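To make the conversion step in the workflow above more concrete, the following Python sketch wraps a converted webpage file as an attachment on an email message. It is only an illustration built on the standard library: the function name `wrap_capture_as_email`, the subject line, and the default file name are hypothetical and do not describe how the Veritas connector actually builds its items.

```python
from email.message import EmailMessage


def wrap_capture_as_email(url, converted_bytes, recipients, filename="capture.pdf"):
    """Attach a converted webpage (assumed here to be a PDF) to an email message.

    All names and header choices are illustrative assumptions, not the
    connector's real item format.
    """
    msg = EmailMessage()
    msg["Subject"] = f"Webpage capture: {url}"
    # The recipients stand in for the addresses configured for the connector.
    msg["To"] = ", ".join(recipients)
    msg.set_content(f"Archived capture of {url}")
    # Binary attachments need an explicit MIME type.
    msg.add_attachment(
        converted_bytes,
        maintype="application",
        subtype="pdf",
        filename=filename,
    )
    return msg


if __name__ == "__main__":
    message = wrap_capture_as_email(
        "https://example.com", b"%PDF-1.4 placeholder", ["user@contoso.com"]
    )
    for part in message.iter_attachments():
        print(part.get_filename(), part.get_content_type())
```

The point of the sketch is only that each archived item is an ordinary email message carrying the converted page as an attachment, which is why mailbox features can be applied to it after import.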

Before you begin

  • Create a Veritas Merge1 account for Microsoft connectors. To create this account, contact Veritas Customer Support. You will sign in to this account when you create the connector in Step 1.

  • You need to work with Veritas support to set up a custom file format to convert the webpage items to. For more information, see the Merge1 Third-Party Connectors User Guide.

  • The user who creates the Webpage Capture connector in Step 1 (and completes it in Step 3) must be assigned the Data Connector Admin role. This role is required to add connectors on the Data connectors page in the compliance portal. This role is added by default to multiple role groups. For a list of these role groups, see the "Roles in the security and compliance centers" section in Permissions in the Security & Compliance Center. Alternatively, an admin in your organization can create a custom role group, assign the Data Connector Admin role, and then add the appropriate users as members. For instructions, see the "Create a custom role group" section in Permissions in the Microsoft Purview compliance portal.

  • This Veritas data connector is in public preview in GCC environments in the Microsoft 365 US Government cloud. Third-party applications and services might involve storing, transmitting, and processing your organization's customer data on third-party systems that are outside of the Microsoft 365 infrastructure and therefore are not covered by the Microsoft Purview and data protection commitments. Microsoft makes no representation that use of this product to connect to third-party applications implies that those third-party applications are FedRAMP compliant.

Step 1: Set up the Webpage Capture connector

The first step is to access the Data connectors page in the compliance portal and create a connector for webpage source data.

  1. Go to https://compliance.microsoft.com and then click Data connectors > Webpage Capture.

  2. On the Webpage Capture product description page, click Add connector.

  3. On the Terms of service page, click Accept.

  4. Enter a unique name that identifies the connector, and then click Next.

  5. Sign in to your Merge1 account to configure the connector.

Step 2: Configure the Webpage Capture connector on the Veritas Merge1 site

The second step is to configure the Webpage Capture connector on the Veritas Merge1 site. For information about how to configure the Webpage Capture connector, see the Merge1 Third-Party Connectors User Guide.

After you click Save & Finish, the User mapping page in the connector wizard in the compliance portal is displayed.

Step 3: Map users and complete the connector setup

To map users and complete the connector setup in the compliance portal, follow the steps below:

  1. On the Map Webpage Capture users to Microsoft 365 users page, enable automatic user mapping. The Webpage Capture items include a property called Email, which contains email addresses for users in your organization. If the connector can associate this address with a Microsoft 365 user, the items are imported to that user's mailbox.

  2. Click Next, review your settings, and go to the Data connectors page to see the progress of the import process for the new connector.
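The automatic user mapping described in this step can be pictured with a short Python sketch. It assumes each webpage item is a dictionary with a hypothetical `url` key and an `Email` key holding a list of addresses, and that matching is a case-insensitive comparison against known Microsoft 365 addresses. The real matching rules and the handling of unmatched items are not documented here, so treat this as a mental model rather than the connector's logic.

```python
def map_items_to_mailboxes(items, m365_addresses):
    """Group item URLs by the mailbox whose address appears in the Email property.

    Returns (routed, unmatched). Both the item shape and the matching rule
    are assumptions made for illustration.
    """
    known = {address.strip().lower() for address in m365_addresses}
    routed = {}
    unmatched = []
    for item in items:
        matched = False
        for address in item.get("Email", []):
            key = address.strip().lower()
            if key in known:
                routed.setdefault(key, []).append(item["url"])
                matched = True
        if not matched:
            # No address on this item corresponds to a known user.
            unmatched.append(item["url"])
    return routed, unmatched


if __name__ == "__main__":
    sample_items = [
        {"url": "https://example.com/a", "Email": ["Alex@contoso.com"]},
        {"url": "https://example.com/b", "Email": ["nobody@fabrikam.com"]},
    ]
    print(map_items_to_mailboxes(sample_items, ["alex@contoso.com"]))
```

Running a check like this against the addresses you plan to enter on the Merge1 site can help confirm that every configured address belongs to a Microsoft 365 user before you enable the connector.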

Step 4: Monitor the Webpage Capture connector

After you create the Webpage Capture connector, you can view the connector status in the compliance portal.

  1. Go to https://compliance.microsoft.com and click Data connectors in the left navigation pane.

  2. Click the Connectors tab and then select the Webpage Capture connector to display the flyout page. This page contains the properties and information about the connector.

  3. Under Connector status with source, click the Download log link to open (or save) the status log for the connector. This log contains information about the data that's been imported to the Microsoft cloud. For more information, see View admin logs for data connectors.

Known issues

  • At this time, we don't support importing attachments or items that are larger than 10 MB. Support for larger items will be available at a later date.
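Because of this limit, it can be useful to check converted files before they reach the connector. The Python sketch below separates files at or under the limit from those over it. It assumes that 10 MB means 10 × 1024 × 1024 bytes, which the source does not specify, and the function name and input shape (a mapping of file name to size in bytes) are hypothetical.

```python
# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_ITEM_BYTES = 10 * 1024 * 1024


def split_by_size_limit(sizes_by_name, limit=MAX_ITEM_BYTES):
    """Split a {file name: size in bytes} mapping into (within_limit, oversized).

    Only items strictly larger than the limit are treated as unsupported.
    """
    within_limit = {}
    oversized = {}
    for name, size in sizes_by_name.items():
        if size > limit:
            oversized[name] = size
        else:
            within_limit[name] = size
    return within_limit, oversized


if __name__ == "__main__":
    ok, too_big = split_by_size_limit(
        {"home.pdf": 250_000, "catalog.pdf": 25 * 1024 * 1024}
    )
    print("within limit:", sorted(ok))
    print("oversized:", sorted(too_big))
```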