Scoring comparison among vector search services

mangospice 0 Reputation points
2023-08-22T08:01:54.7166667+00:00

I would like to know the accuracy/scores of results when using various vector search services on Azure. I am particularly interested in Azure Cognitive Search, Azure Cache for Redis, Azure Cosmos DB, and Azure PostgreSQL. Simply speaking, if I create an AI based app FAQ service using a vector search, which one of the Azure services would work best? Besides the accuracy of the result, I would appreciate any comments on what makes each service unique. Thank you!


1 answer

  1. Kyle Teegarden 1 Reputation point Microsoft Employee
    2023-08-22T23:40:04.6533333+00:00

    Hi @mangospice, this is an excellent question! But there likely isn't an easy answer. First of all, the embeddings model will be the most important factor in the quality of your results. There are many different models you can use, but the text-embedding-ada-002 (Version 2) model is the best one currently available through Azure OpenAI.

    In terms of the service you use to store and compare the embeddings vectors, they offer similar capabilities. I'm most familiar with Redis, so you'll have to forgive any errors on my part, but here are some things to consider:

    • Index type. An exact K-Nearest Neighbor (KNN) search over a FLAT index compares the query against every stored vector, which makes the results more precise but also makes searches slower. It looks like Redis and Postgres support this method. Each service also supports an Approximate Nearest Neighbor (ANN) method that trades some precision for speed. Redis and Azure Cognitive Search use Hierarchical Navigable Small World (HNSW), while Cosmos DB and Postgres use IVFFlat.
    • Comparison method. It looks like all four services support cosine, Euclidean, and inner product methods.
    • Hybrid Search. Often, you'll want to filter your results by document metadata or other characteristics in addition to vector similarity. I believe you can do this on any of the services, although the feature set will vary. Redis and Azure Cognitive Search have extremely rich functionality that can be used for hybrid searches.
    • Maturity. Vector search in Redis is GA and has been around for years. Vector capabilities are now GA in Postgres and Cosmos DB, while vector search is still in preview on Azure Cognitive Search.
    • Performance. I don't have any benchmarks here, but performance will likely vary between the services.
    • Cost. Redis and Postgres bill on a per-instance, per-hour basis, while Cosmos DB has multiple billing methods based on consumption. Azure Cognitive Search bills based on scale units.
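
    As a rough illustration of the index type, comparison method, and hybrid search points above, here is a minimal sketch in plain Python of an exact (FLAT) KNN search with a metadata filter. It doesn't call any of the four services' APIs. The names `exact_knn`, `FAQ_DOCS`, and the `category` field are hypothetical, and the three-dimensional vectors are made up for illustration (real text-embedding-ada-002 embeddings have 1536 dimensions).

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Inner product is a similarity, so negate it to sort ascending."""
    return -sum(x * y for x, y in zip(a, b))


METRICS = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "inner_product": negative_inner_product,
}


def exact_knn(query, docs, k=3, metric="cosine", category=None):
    """Brute-force KNN: score every candidate, then keep the k closest.

    `category` is a hypothetical metadata filter applied before scoring,
    which is the simplest form of the hybrid search described above.
    """
    distance = METRICS[metric]
    candidates = [d for d in docs if category is None or d["category"] == category]
    scored = [(d["id"], distance(query, d["vector"])) for d in candidates]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Made-up FAQ entries with toy 3-dimensional "embeddings" (assumption).
FAQ_DOCS = [
    {"id": "reset-password", "category": "account", "vector": [0.9, 0.1, 0.0]},
    {"id": "change-email", "category": "account", "vector": [0.7, 0.3, 0.1]},
    {"id": "refund-policy", "category": "billing", "vector": [0.1, 0.9, 0.2]},
    {"id": "update-card", "category": "billing", "vector": [0.2, 0.8, 0.1]},
]

query = [0.8, 0.2, 0.0]

# Unfiltered search over all documents.
print([doc_id for doc_id, _ in exact_knn(query, FAQ_DOCS, k=2)])
# ['reset-password', 'change-email']

# Same query, restricted to billing documents by the metadata filter.
print([doc_id for doc_id, _ in exact_knn(query, FAQ_DOCS, k=2, category="billing")])
# ['update-card', 'refund-policy']
```

    An ANN index such as HNSW or IVFFlat replaces the full scan in `exact_knn` with a search over a subset of candidates, which is why it is faster but can miss some of the true nearest neighbors.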

    Ultimately, these are four excellent services, and it probably comes down to your specific use case and what you're already familiar with. For example, if you're already using Azure Cosmos DB for your app, it would probably make sense to just use the vector capabilities there. I think Redis is a good choice because it has rich hybrid search functionality and it's extremely fast due to running in-memory. Plus, since vector capabilities have been around in Redis for a relatively long time, there are plenty of tutorials and documentation already online. The downside to Redis is that it can be more expensive than the other options because this feature is only available on the Enterprise tier.

    We'll have a full tutorial up soon on our docs page on generating and searching through embeddings using Azure Cache for Redis. In the meantime, here are some helpful resources:
