Use smart URL refresh with a project

Custom question answering lets you refresh a source by fetching the latest content from its URL and updating the corresponding project with one click. The service ingests content from the URL and creates, merges, or deletes question-and-answer pairs in the project accordingly.

This feature supports scenarios where the content at the source URL changes frequently, such as a product FAQ page that's updated often. The service refreshes the source and updates the project to the latest content while retaining any manual edits you made previously.

Note

This feature is only applicable to URL sources, and they must be refreshed individually, not in bulk.

Important

This feature is only available in the 2021-10-01 version of the Language API.

How it works

If you have a project with a URL source that has changed, you can trigger a smart URL refresh to keep your project up to date. The service scans the URL for updated content and generates QnA pairs. It adds any new QnA pairs to your project and deletes pairs that have disappeared from the source, except for pairs you've edited manually. In some situations, it also merges old and new QnA pairs. Each of these behaviors is described under Smart refresh behavior.

Important

Because smart URL refresh can involve deleting old content from your project, you may want to create a backup of your project before you do any refresh operations.

You can trigger a URL refresh in Language Studio by opening your project, selecting the source in the Manage sources list, and selecting Refresh URL.

[Screenshot: Language Studio with the Refresh URL button highlighted.]

You can also trigger a refresh programmatically using the REST API. See the Update Sources reference documentation for parameters and a sample request.
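As a rough illustration of the programmatic route, the following Python sketch builds (but does not send) a refresh request for a single URL source. The route, the `Ocp-Apim-Subscription-Key` header, and the body fields (`op`, `displayName`, `sourceKind`, `sourceUri`, `source`, `refresh`) are assumptions based on the general shape of the Update Sources operation, and the function name `build_refresh_request` is hypothetical. Check each of them against the reference documentation before relying on this.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# The only API version in which this feature is available, per this article.
API_VERSION = "2021-10-01"


def build_refresh_request(endpoint, key, project_name, display_name, source_uri):
    """Build (without sending) a PATCH request that asks for one URL source to be refreshed.

    Assumption: the route and field names below mirror the Update Sources
    operation; verify them in the REST reference before use.
    """
    url = (
        f"{endpoint.rstrip('/')}/language/query-knowledgebases/projects/"
        f"{quote(project_name, safe='')}/sources?"
        f"{urlencode({'api-version': API_VERSION})}"
    )
    # One operation per source: sources are refreshed individually, not in bulk.
    body = [
        {
            "op": "replace",
            "value": {
                "displayName": display_name,
                "sourceKind": "url",
                "sourceUri": source_uri,
                "source": source_uri,  # assumed: the identifier of a URL source is its URL
                "refresh": True,       # assumed flag that requests the refresh
            },
        }
    ]
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately makes it easy to inspect the URL and body first, which is useful given that a refresh can delete content from the project.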

Smart refresh behavior

When you refresh content using this feature, the QnA pairs in the project may be updated in the following ways:

Delete old pair

If the content of the URL is updated so that an existing QnA pair from the old content of the URL is no longer found in the source, that pair is deleted from the refreshed project. For example, if a QnA pair Q1A1 existed in the old project, but after refreshing, there's no A1 answer generated by the newly refreshed source, then the pair Q1A1 is considered outdated and is dropped from the project altogether.

However, if the old QnA pairs have been manually edited in the authoring portal, they won't be deleted.

Add new pair

If the content of the URL is updated so that a new QnA pair exists that didn't exist in the old project, it's added to the project. For example, if the service finds that a new answer A2 can be generated, the QnA pair Q2A2 is inserted into the project.
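The delete and add rules can be modeled with a small Python sketch. This is an illustrative simulation rather than the service's implementation: the `QnaPair` class, its `manually_edited` flag, and the function name are hypothetical, and pairs are matched by exact answer text, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class QnaPair:
    question: str
    answer: str
    manually_edited: bool = False  # hypothetical flag for pairs edited by hand


def refresh_delete_and_add(old_pairs, new_pairs):
    """Apply only the delete and add rules; merging is handled separately."""
    old_answers = {pair.answer for pair in old_pairs}
    new_answers = {pair.answer for pair in new_pairs}

    # Delete rule: drop an old pair whose answer is no longer generated,
    # unless the pair was manually edited.
    kept = [
        pair for pair in old_pairs
        if pair.answer in new_answers or pair.manually_edited
    ]
    # Add rule: insert pairs whose answer did not exist before the refresh.
    added = [pair for pair in new_pairs if pair.answer not in old_answers]
    return kept + added
```

For instance, with old pairs Q1A1 and a manually edited Q4A4, and a refreshed source that yields only Q2A2, the function drops Q1A1, keeps Q4A4, and adds Q2A2.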

Merge pairs

If the answer of a new QnA pair matches the answer of an old QnA pair, the two pairs are merged. The new pair's question is added as an alternate question to the old QnA pair. For example, suppose Q3A3 exists in the old source. When you refresh the source, a new QnA pair Q3'A3 is introduced. In that case, the two QnA pairs are merged: Q3' is added to Q3 as an alternate question.

If the old QnA pair has a metadata value, that value is retained in the newly merged pair.

If the old QnA pair has follow-up prompts associated with it, then the following scenarios may arise:

  • If the prompt attached to the old pair is from the source being refreshed, then it's deleted, and the prompt of the new pair (if any exists) is appended to the newly merged QnA pair.
  • If the prompt attached to the old pair is from a different source, then it's maintained as-is and the prompt from the new question (if any exists) is appended to the newly merged QnA pair.
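The merge rules above, including the two prompt scenarios, can be sketched in Python as follows. This is a simplified model under stated assumptions, not the service's code: the `Prompt` and `QnaPair` classes, the `source` field used to track where a prompt came from, and the `merge_pair` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Prompt:
    text: str
    source: str  # hypothetical: the source the prompt originated from


@dataclass
class QnaPair:
    question: str
    answer: str
    alternate_questions: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    prompts: list = field(default_factory=list)


def merge_pair(old, new, refreshed_source):
    """Merge a new pair into an old pair that has the same answer."""
    # The new question becomes an alternate question of the old pair.
    alternates = list(old.alternate_questions)
    if new.question != old.question and new.question not in alternates:
        alternates.append(new.question)

    # Old prompts from the refreshed source are deleted;
    # old prompts from any other source are kept as-is.
    kept_prompts = [p for p in old.prompts if p.source != refreshed_source]

    return QnaPair(
        question=old.question,
        answer=old.answer,
        alternate_questions=alternates,
        metadata=dict(old.metadata),               # metadata of the old pair is retained
        prompts=kept_prompts + list(new.prompts),  # new prompts are appended
    )
```

Tracking a source per prompt is the design choice that makes both bullet points fall out of a single filter: a prompt survives the merge exactly when it did not come from the source being refreshed.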

Merge example

See the following example of a merge operation with differing questions and prompts:

| Source iteration | Question | Answer | Prompts |
|---|---|---|---|
| old | "What is the new HR policy?" | "You may have to choose among the following options:" | P1, P2 |
| new | "What is the new payroll policy?" | "You may have to choose among the following options:" | P3, P4 |

The prompts P1 and P2 come from the original source and are different from prompts P3 and P4 of the new QnA pair. Both pairs have the same answer, "You may have to choose among the following options:", but lead to different prompts. Because P1 and P2 come from the source being refreshed, they're deleted and replaced by the new pair's prompts. In this case, the resulting QnA pair would look like this:

| Question | Answer | Prompts |
|---|---|---|
| "What is the new HR policy?" (alternate question: "What is the new payroll policy?") | "You may have to choose among the following options:" | P3, P4 |

Duplicate answers scenario

When the original source has two or more QnA pairs with the same answer (for example, Q1A1 and Q2A1), the merge behavior is more complex.

If these two QnA pairs have individual prompts attached to them (for example, Q1A1+P1 and Q2A1+P2), and the refreshed source content has a new QnA pair generated with the same answer A1 and a new prompt P3 (Q1'A1+P3), then the new question is added as an alternate question to each of the original pairs (as described above). However, all of the originally attached prompts are overwritten by the new prompt, so the final pair set looks like this:

| Question | Answer | Prompts |
|---|---|---|
| Q1 (alternate question: Q1') | A1 | P3 |
| Q2 (alternate question: Q1') | A1 | P3 |
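The duplicate-answer case can be modeled with a short Python sketch. It assumes, as in the scenario above, that all original prompts came from the source being refreshed, so they are all overwritten. The dictionary layout and the function name `merge_duplicate_answers` are hypothetical and chosen only for illustration.

```python
def merge_duplicate_answers(old_pairs, new_pair):
    """Merge one new pair into every old pair that shares its answer.

    Each pair is a dict with the keys "question", "answer",
    "alternates" (list of str), and "prompts" (list of str).
    """
    result = []
    for old in old_pairs:
        if old["answer"] != new_pair["answer"]:
            result.append(old)  # unrelated pairs are left untouched
            continue
        alternates = list(old["alternates"])
        if new_pair["question"] not in alternates:
            alternates.append(new_pair["question"])
        result.append(
            {
                "question": old["question"],
                "answer": old["answer"],
                "alternates": alternates,
                # Assumption: the old prompts came from the refreshed source,
                # so the new pair's prompts overwrite them.
                "prompts": list(new_pair["prompts"]),
            }
        )
    return result
```

Applied to Q1A1+P1 and Q2A1+P2 with the new pair Q1'A1+P3, the function returns two pairs that both list Q1' as an alternate question and carry only P3.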

Next steps