
Azure Cognitive Search - frequently asked questions (FAQ)

Find answers to commonly asked questions about concepts, code, and scenarios related to Azure Cognitive Search.


How is Azure Cognitive Search different from full text search in my DBMS?

Azure Cognitive Search supports multiple data sources, linguistic analysis for many languages, custom analysis for interesting and unusual data inputs, search rank controls through scoring profiles, and user-experience features such as typeahead, hit highlighting, and faceted navigation. It also includes other features, such as synonyms and rich query syntax, but those are generally not differentiating features.

What is the difference between Azure Cognitive Search and Elasticsearch?

When comparing search technologies, customers frequently ask for specifics on how Azure Cognitive Search compares with Elasticsearch. Customers who choose Azure Cognitive Search over Elasticsearch for their search application projects typically do so because we've made a key task easier or because they need the built-in integration with other Microsoft technologies:

  • Azure Cognitive Search is a fully managed cloud service with a 99.9% service level agreement (SLA) when provisioned with sufficient redundancy (two replicas for read access, three replicas for read-write).
  • Microsoft's natural language processors offer leading-edge linguistic analysis.
  • Azure Cognitive Search indexers can crawl a variety of Azure data sources for initial and incremental indexing.
  • If you need rapid response to fluctuations in query or indexing volumes, you can use slider controls in the Azure portal or run a PowerShell script, without having to manage shards directly.
  • Scoring and tuning features provide the means for influencing search rank scores beyond what the search engine alone can provide.

Can I pause the Azure Cognitive Search service and stop billing?

You cannot pause the service. Computational and storage resources are allocated for your exclusive use when the service is created. It's not possible to release and reclaim those resources on demand.

Indexing Operations

Move, back up, and restore indexes or index snapshots?

During the development phase, you may want to move your index between search services. For example, you may use a Basic or Free pricing tier to develop your index, and then move it to the Standard or a higher tier for production use.

Or, you may want to back up an index snapshot to files that can be used to restore it later.

You can do all these things with the index-backup-restore sample code in this Azure Cognitive Search .NET sample repo.

You can also get an index definition at any time using the Azure Cognitive Search REST API.
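
As a minimal sketch of retrieving an index definition over REST, the Python snippet below only builds the request and does not send it. The service name `my-demo-service`, the index name `hotels`, the key placeholder, and the `2020-06-30` API version are all assumptions chosen for illustration; substitute the values for your own service.

```python
from urllib.parse import quote
from urllib.request import Request


def build_get_index_request(service_name, index_name, api_key,
                            api_version="2020-06-30"):
    """Build (but do not send) a GET request for one index definition.

    All argument values are supplied by the caller; the defaults here are
    illustrative assumptions, not values taken from this FAQ.
    """
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{quote(index_name, safe='')}"
        f"?api-version={api_version}"
    )
    # The api-key header authenticates the call to the search service.
    headers = {"api-key": api_key, "Accept": "application/json"}
    return Request(url, headers=headers, method="GET")


# Hypothetical service and index names, for illustration only.
request = build_get_index_request("my-demo-service", "hotels", "<admin-api-key>")
print(request.get_method(), request.full_url)
```

To actually fetch the definition, pass the request to `urllib.request.urlopen` and save the JSON response body; that file can later serve as the starting point for recreating the index.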

There is currently no built-in index extraction, snapshot, or backup-restore feature in the Azure portal. However, we are considering adding backup and restore functionality in a future release. If you want to show your support for this feature, cast a vote on User Voice.

Can I restore my index or service once it is deleted?

No. If you delete an Azure Cognitive Search index or service, it cannot be recovered. When you delete an Azure Cognitive Search service, all indexes in the service are deleted permanently. If you delete an Azure resource group that contains one or more Azure Cognitive Search services, all of those services are deleted permanently.

Resources such as indexes, indexers, data sources, and skillsets can only be restored by recreating them from code.

To recreate an index, you must re-index data from external sources. For this reason, we recommend that you retain a master copy or backup of the original data in another data store, such as Azure SQL Database or Cosmos DB.

As an alternative, you can use the index-backup-restore sample code in this Azure Cognitive Search .NET sample repo to back up an index definition and index snapshot to a series of JSON files. Later, you can use the tool and files to restore the index, if needed.

Can I index from SQL database replicas? (Applies to Azure SQL Database indexers)

There are no restrictions on the use of primary or secondary replicas as a data source when building an index from scratch. However, refreshing an index with incremental updates (based on changed records) requires the primary replica. This requirement comes from SQL Database, which guarantees change tracking on primary replicas only. If you try using secondary replicas for an index refresh workload, there is no guarantee that you get all of the data.

Search Operations

Can I search across multiple indexes?

No, this operation is not supported. Search is always scoped to a single index.

Can I restrict search index access by user identity?

You can implement security filters with the search.in() filter. The filter composes well with identity management services like Azure Active Directory (AAD) to trim search results based on defined user group membership.
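
A security filter of this kind can be sketched as a small string-building helper. The snippet below is a hedged illustration, not a prescribed implementation: the collection field name `group_ids` and the group identifiers are hypothetical, and the helper assumes each document stores the IDs of the groups allowed to see it in a filterable string collection.

```python
def build_security_filter(group_field, user_group_ids):
    """Return an OData filter that keeps documents whose group collection
    contains at least one of the caller's group IDs.

    group_field is a hypothetical filterable Collection(Edm.String) field.
    """
    for group_id in user_group_ids:
        # Reject characters that would break the delimited list or the
        # quoted OData string literal.
        if "," in group_id or "'" in group_id:
            raise ValueError(f"unsupported character in group id: {group_id!r}")
    joined = ",".join(user_group_ids)
    # The third argument makes the comma the only delimiter.
    return f"{group_field}/any(g: search.in(g, '{joined}', ','))"


# Group IDs would normally come from the identity provider (for example AAD).
print(build_security_filter("group_ids", ["group_id1", "group_id2"]))
```

The resulting string is passed as the filter parameter of a query, so results are trimmed on the service side before they reach the caller.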

Why are there zero matches on terms I know to be valid?

The most common cause is not knowing that each query type supports different search behaviors and levels of linguistic analysis. Full text search, which is the predominant workload, includes a language analysis phase that breaks terms down to their root forms. This aspect of query parsing casts a broader net over possible matches, because the tokenized term matches a greater number of variants.

Wildcard, fuzzy, and regex queries, however, aren't analyzed like regular term or phrase queries, and can lead to poor recall if the query does not match the analyzed form of the word in the search index. For more information on query parsing and analysis, see query architecture.
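
The effect can be illustrated with a toy model. The sketch below is not how the service is implemented: `toy_analyze` is a made-up stand-in for a language analyzer that lowercases text and strips a few English suffixes, and the index is a plain Python set. It only shows why an analyzed term query can match while a wildcard query on the unanalyzed form does not.

```python
def toy_analyze(text):
    """Hypothetical analyzer: lowercase, split on whitespace, strip suffixes."""
    tokens = []
    for word in text.lower().split():
        for suffix in ("ing", "es", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


# At indexing time, the document text is analyzed before it is stored.
index_terms = set(toy_analyze("Guided Tours"))  # {"guided", "tour"}


def term_query(query):
    """Regular term query: the query text is analyzed like the document."""
    return any(token in index_terms for token in toy_analyze(query))


def prefix_query(query):
    """Prefix query such as 'tours*': the prefix is lowercased, not analyzed."""
    prefix = query.rstrip("*").lower()
    return any(term.startswith(prefix) for term in index_terms)


print(term_query("tours"))     # analyzed to "tour", which is in the index
print(prefix_query("tours*"))  # no indexed term starts with "tours"
print(prefix_query("tour*"))   # matches the analyzed form "tour"
```

In this toy model, the term query for "tours" succeeds, while the prefix query "tours*" finds nothing because the index holds only the root form.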

My wildcard searches are slow.

Most wildcard search queries, such as prefix, fuzzy, and regex, are rewritten internally with matching terms in the search index. This extra step of scanning the search index adds latency. Further, broad search queries, such as a*, that are likely to be rewritten with many terms can be very slow. For performant wildcard searches, consider defining a custom analyzer.
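
To see why broad prefixes are costly, consider a toy model of the rewrite step. The sketch below is an illustration under simplifying assumptions (a small sorted term list standing in for the index's term dictionary), not the service's actual algorithm: it expands a prefix into every indexed term that starts with it.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix):
    """Return every term in a sorted term list that starts with prefix."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    return matches


# Hypothetical term dictionary for illustration.
terms = sorted(["apple", "apply", "avocado", "banana",
                "tour", "tourmaline", "tours", "zebra"])

print(expand_prefix(terms, "tour"))  # narrow prefix, few rewritten terms
print(expand_prefix(terms, "a"))     # broad prefix, many rewritten terms
```

The shorter the prefix, the more terms the query is rewritten into, and each rewritten term must then be looked up and matched, which is where the extra latency comes from.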

Why is the search rank a constant or equal score of 1.0 for every hit?

By default, search results are scored based on the statistical properties of matching terms, and ordered high to low in the result set. However, some query types (wildcard, prefix, regex) always contribute a constant score to the overall document score. This behavior is by design. Azure Cognitive Search imposes a constant score to allow matches found through query expansion to be included in the results, without affecting the ranking.

For example, suppose an input of "tour*" in a wildcard search produces matches on "tours", "tourettes", and "tourmaline". Given the nature of these results, there is no way to reasonably infer which terms are more valuable than others. For this reason, we ignore term frequencies when scoring results in queries of types wildcard, prefix, and regex. Search results based on a partial input are given a constant score to avoid bias towards potentially unexpected matches.

Design patterns

Most customers choose dedicated fields over a collection when it comes to supporting different locales (languages) in the same index. Locale-specific fields make it possible to assign an appropriate analyzer, for example, assigning the Microsoft French Analyzer to a field containing French strings. It also simplifies filtering. If you know a query is initiated on a fr-fr page, you could limit search results to this field. Or, create a scoring profile to give the field more relative weight. Azure Cognitive Search supports over 50 language analyzers to choose from.
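
The dedicated-field pattern might look like the following sketch. The field names (`description_en`, `description_fr`), the locale mapping, and the helper are hypothetical; the analyzer names `en.microsoft` and `fr.microsoft` are assumed here to be the identifiers for the Microsoft English and French analyzers.

```python
# Hypothetical index fields: one searchable field per locale, each with
# its own language analyzer.
fields = [
    {"name": "id", "type": "Edm.String", "key": True},
    {"name": "description_en", "type": "Edm.String",
     "searchable": True, "analyzer": "en.microsoft"},
    {"name": "description_fr", "type": "Edm.String",
     "searchable": True, "analyzer": "fr.microsoft"},
]

# Assumed mapping from page locale to the field that holds that language.
LOCALE_TO_FIELD = {"en-us": "description_en", "fr-fr": "description_fr"}


def build_query(search_text, page_locale):
    """Build a search request body scoped to the field for the page locale."""
    field = LOCALE_TO_FIELD[page_locale.lower()]
    return {"search": search_text, "searchFields": field}


print(build_query("chambre avec vue", "fr-fr"))
```

Scoping the query with `searchFields` means only the French field, analyzed with the French analyzer, is searched for a query that originates on a fr-fr page.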

Next steps

Is your question about a missing feature or functionality? Request the feature on the User Voice web site.

See also

Stack Overflow: Azure Cognitive Search
How full text search works in Azure Cognitive Search
What is Azure Cognitive Search?