Search and query an enterprise knowledge base by using Azure OpenAI or Azure AI Search

Azure Blob Storage
Azure Cache for Redis
Azure AI Search
Azure AI services
Azure AI Document Intelligence

This article describes how to use Azure OpenAI Service or Azure AI Search (formerly Azure Cognitive Search) to search documents in your enterprise data and retrieve results to provide a ChatGPT-style question-and-answer experience. This solution describes two approaches:

  • Embeddings approach: Use the Azure OpenAI embedding model to create vectorized data. Vector search is a feature that significantly increases the semantic relevance of search results.

  • Azure AI Search approach: Use Azure AI Search to search and retrieve relevant text data based on a user query. This service supports full-text search, semantic search, vector search, and hybrid search.

Note

In Azure AI Search, the semantic search and vector search features are currently in public preview.

Architecture: Embedding approach

Diagram that shows the embeddings approach. Download a Visio file of this architecture.

Dataflow

Documents to be ingested can come from various sources, like files on an FTP server, email attachments, or web application attachments. These documents can be ingested to Azure Blob Storage via services like Azure Logic Apps, Azure Functions, or Azure Data Factory. Data Factory is optimal for transferring bulk data.

Embedding creation:

  1. Documents are ingested into Blob Storage, and an Azure function is triggered to extract text from them.

  2. If documents are in a non-English language and translation is required, an Azure function can call Azure Translator to perform the translation.

  3. If the documents are PDFs or images, an Azure function can call Azure AI Document Intelligence to extract the text. If the document is an Excel, CSV, Word, or text file, Python code can be used to extract the text.

  4. The extracted text is then chunked appropriately, and an Azure OpenAI embedding model is used to convert each chunk to embeddings.

  5. These embeddings are persisted to the vector database. This solution uses the Enterprise tier of Azure Cache for Redis, but any vector database can be used.
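Steps 4 and 5 can be sketched as follows. This is a minimal illustration, not the solution's implementation: the fixed-size overlapping chunker is one simple strategy among many, and the embedding call assumes the `openai` Python package (v1+) with a placeholder endpoint, key, and an embedding deployment named `text-embedding-ada-002`.

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split extracted text into overlapping fixed-size chunks (a simple illustrative strategy)."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks

def embed_chunks(chunks: list[str]) -> list[list[float]]:
    """Convert each chunk to an embedding vector via Azure OpenAI."""
    from openai import AzureOpenAI

    # Endpoint, key, and deployment name below are placeholders.
    client = AzureOpenAI(
        azure_endpoint="https://<your-resource>.openai.azure.com",
        api_key="<your-key>",
        api_version="2024-02-01",
    )
    response = client.embeddings.create(
        model="text-embedding-ada-002",  # your embedding deployment name
        input=chunks,
    )
    return [item.embedding for item in response.data]
```

In practice, chunk boundaries should respect sentence or paragraph breaks so that each chunk is semantically coherent; the overlap helps preserve context that straddles a boundary.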

Query and retrieval:

  1. The user sends a query via a user application.

  2. The Azure OpenAI embedding model is used to convert the query into vector embeddings.

  3. A vector similarity search that uses this query vector in the vector database returns the top k matching content. The matching content to be retrieved can be limited by a threshold on a similarity measure, like cosine similarity.

  4. The top k retrieved content and the system prompt are sent to the Azure OpenAI language model, like GPT-3.5 Turbo or GPT-4.

  5. The search results are presented as the answer to the search query that was initiated by the user, or the search results can be used as the grounding data for a multi-turn conversation scenario.
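The ranking logic in steps 2 and 3 can be illustrated with a brute-force sketch. A vector database such as the Enterprise tier of Azure Cache for Redis does this at scale with FLAT or HNSW indexes; the code below only shows cosine-similarity scoring and top-k selection over an in-memory store, with illustrative names of my own choosing.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product of the vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query: list[float], store: dict[str, list[float]], k: int = 3,
          threshold: float = 0.0) -> list[tuple[str, float]]:
    """Return up to k (chunk_id, score) pairs whose similarity meets the threshold."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The text of the top-k chunks, together with the system prompt, is what gets sent to the language model in step 4.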

Architecture: Azure AI Search pull approach

Diagram that shows the pull approach. Download a Visio file of this architecture.

Index creation:

  1. Azure AI Search is used to create a search index of the documents in Blob Storage. Azure AI Search supports Blob Storage as a data source, so the pull model can crawl the content. This capability is implemented via indexers.

    Note

    Azure AI Search supports other data sources for indexing when using the pull model. Documents can also be indexed from multiple data sources and consolidated into a single index.

  2. If certain scenarios require translation of documents, Azure Translator can be used. It's available as a built-in skill.

  3. If the documents are nonsearchable, like scanned PDFs or images, AI can be applied by using built-in or custom skills as skillsets in Azure AI Search. Applying AI over content that isn't full-text searchable is called AI enrichment. Depending on the requirement, Azure AI Document Intelligence can be used as a custom skill to extract text from PDFs or images via document analysis models, prebuilt models, or custom extraction models.

    If AI enrichment is a requirement, the pull model (indexers) must be used to load the index.

    To enable vector search, add vector fields to the index schema and load them with vector data, which is then indexed. The vector data can be generated via Azure OpenAI embeddings.
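An index definition with a vector field might look like the following sketch, shaped for the Create Index REST API (`PUT https://<service>.search.windows.net/indexes/<name>`). The index and field names, the dimension count, and the exact schema shape (the 2023-11-01 API version here) are assumptions; check the API version you target.

```python
# Sketch of an Azure AI Search index definition with a vector field.
# Names and values are placeholders, not part of the reference architecture.
index_definition = {
    "name": "enterprise-docs",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",   # vector data as a float array
            "searchable": True,
            "dimensions": 1536,                 # text-embedding-ada-002 output size
            "vectorSearchProfile": "vector-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [{"name": "hnsw-config", "kind": "hnsw"}],
        "profiles": [{"name": "vector-profile", "algorithm": "hnsw-config"}],
    },
}
```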

Query and retrieval:

  1. A user sends a query via a user application.

  2. The query is passed to Azure AI Search via the Search Documents REST API. The query type can be simple, which is optimal for full-text search, or full, which supports advanced query constructs like regular expressions, fuzzy and wildcard search, and proximity search. If the query type is set to semantic, a semantic search is performed on the documents, and the relevant content is retrieved. Azure AI Search also supports vector search and hybrid search, which require the user query to be converted to vector embeddings.

  3. The retrieved content and the system prompt are sent to the Azure OpenAI language model, like GPT-3.5 Turbo or GPT-4.

  4. The search results are presented as the answer to the search query that was initiated by the user, or the search results can be used as the grounding data for a multi-turn conversation scenario.
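Two request bodies for the Search Documents REST API (`POST https://<service>.search.windows.net/indexes/<index>/docs/search`) might look like the sketches below. The semantic configuration name, field names, and query text are placeholders, and the `vectorQueries` shape follows the 2023-11-01 API version; earlier preview versions use a different shape.

```python
# Semantic search: reranks full-text results by semantic relevance.
semantic_query = {
    "search": "What is the claim filing deadline?",
    "queryType": "semantic",
    "semanticConfiguration": "default",
    "top": 5,
}

# Hybrid search: a full-text query plus a vector query over the same index.
hybrid_query = {
    "search": "What is the claim filing deadline?",
    "vectorQueries": [
        {
            "kind": "vector",
            "vector": [0.0] * 1536,  # replace with the embedded user query
            "fields": "contentVector",
            "k": 5,
        }
    ],
}
```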

Architecture: Azure AI Search push approach

If the data source isn't supported, you can use the push model to upload the data to Azure AI Search.

Diagram that shows the push approach. Download a Visio file of this architecture.

Index creation:

  1. If the document to be ingested must be translated, Azure Translator can be used.
  2. If the document is in a nonsearchable format, like a PDF or image, Azure AI Document Intelligence can be used to extract text.
  3. The extracted text can be vectorized via Azure OpenAI embeddings, and the data can be pushed to an Azure AI Search index via a REST API or an Azure SDK.
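The push step can be sketched with the `azure-search-documents` Python SDK. The endpoint, key, index name, and document fields below are placeholders, and the documents must match whatever schema the target index defines.

```python
# Sketch of the push model: the client sends pre-vectorized documents to the index.
documents = [
    {
        "@search.action": "mergeOrUpload",
        "id": "doc-1-chunk-0",
        "content": "Extracted (and, if needed, translated) text for this chunk...",
        "contentVector": [0.0] * 1536,  # embedding generated via Azure OpenAI
    },
]

def push_documents(docs: list[dict]) -> None:
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(
        endpoint="https://<your-service>.search.windows.net",  # placeholder
        index_name="enterprise-docs",                          # placeholder
        credential=AzureKeyCredential("<your-key>"),
    )
    client.upload_documents(documents=docs)
```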

Query and retrieval:

The query and retrieval steps in this approach are the same as in the pull approach earlier in this article.

Components

  • Azure OpenAI provides REST API access to Azure OpenAI's language models, including the GPT-3, Codex, and embeddings model series, for content generation, summarization, semantic search, and natural language-to-code translation. Access the service by using a REST API, the Python SDK, or the web-based interface in Azure OpenAI Studio.

  • Azure AI Document Intelligence is an Azure AI service. It offers document analysis capabilities to extract printed and handwritten text, tables, and key-value pairs. Azure AI Document Intelligence provides prebuilt models that can extract data from invoices, documents, receipts, ID cards, and business cards. You can also use it to train and deploy custom models by using a custom template form model or a custom neural document model.

  • Document Intelligence Studio provides a UI for exploring Azure AI Document Intelligence features and models, and for building, tagging, training, and deploying custom models.

  • Azure AI Search is a cloud service that provides infrastructure, APIs, and tools for searching. Use Azure AI Search to build search experiences over private disparate content in web, mobile, and enterprise applications.

  • Blob Storage is the object storage solution for raw files in this scenario. Blob Storage supports libraries for various languages, such as .NET, Node.js, and Python. Applications can access files in Blob Storage via HTTP or HTTPS. Blob Storage has hot, cool, and archive access tiers to support cost optimization for storing large amounts of data.

  • The Enterprise tier of Azure Cache for Redis provides managed Redis Enterprise modules, like RediSearch, RedisBloom, RedisTimeSeries, and RedisJSON. Vector fields allow vector similarity search, which supports real-time vector indexing via the brute-force (FLAT) and Hierarchical Navigable Small World (HNSW) algorithms, real-time vector updates, and k-nearest neighbor search. Azure Cache for Redis brings a critical low-latency and high-throughput data storage solution to modern applications.

Alternatives

Depending on your scenario, you can add the following workflows.

Scenario details

Organizations that handle huge volumes of documents, largely unstructured data in formats like PDF, Excel, CSV, Word, PowerPoint, and images, face a significant challenge processing scanned and handwritten documents and forms from their customers. Manual processing of these documents is increasingly time-consuming, error-prone, and resource-intensive.

These documents and forms contain critical information, such as personal details, medical history, and damage assessment reports, which must be accurately extracted and processed.

Organizations often already have their own knowledge base of information, which can be used for answering questions with the most appropriate answer. You can use the services and pipelines described in these solutions to create a source for search mechanisms of documents.

Potential use cases

This solution provides value to organizations in industries like pharmaceuticals and financial services. It applies to any company that has a large number of documents with embedded information. This AI-powered end-to-end search solution can be used to extract meaningful information from the documents based on the user query to provide a ChatGPT-style question-and-answer experience.

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal authors:

  • Dixit Arora | Senior Customer Engineer, ISV DN CoE
  • Jyotsna Ravi | Principal Customer Engineer, ISV DN CoE


Next steps