Scene understanding

Scene understanding provides Mixed Reality developers with a structured, high-level environment representation designed to make developing environmentally aware applications intuitive. It does this by combining the power of existing mixed reality runtimes, such as the highly accurate but less structured spatial mapping, with new AI-driven runtimes. By combining these technologies, Scene understanding generates representations of 3D environments that are similar to those you may have used in frameworks such as Unity or ARKit/ARCore. The Scene understanding entry point is the Scene Observer, which your application calls to compute a new scene. Today, the technology can generate three distinct but related object categories:

  • Simplified watertight environment meshes that infer the planar room structure without clutter
  • Plane regions for placement, which we call Quads
  • A snapshot of the spatial mapping mesh that aligns with the Quads/watertight data that we surface

Spatial mapping mesh, labeled planar surfaces, watertight mesh

This document provides a scenario overview and clarifies the relationship between Scene understanding and Spatial mapping.

Developing with Scene Understanding

This article only introduces the Scene Understanding runtime and concepts. If you're looking for documentation on how to develop with Scene Understanding, you may be interested in the following articles:

Scene Understanding SDK overview

You can download the Scene Understanding Sample app from the sample GitHub site:

Scene Understanding Sample

If you don't have a device and want sample scenes to try Scene Understanding out, there are scenes in the sample asset folder:

Scene Understanding Sample Scenes

SDK

If you're looking for specific details on developing with Scene Understanding, see the Scene Understanding SDK overview documentation.

Sample

Device support

Feature | HoloLens (1st gen) | HoloLens 2 | Immersive headsets
Scene understanding | | ✔️ |

Common usage scenarios

Common spatial mapping usage scenarios: placement, occlusion, physics, and navigation.


Many of the core scenarios for environmentally aware applications can be addressed by both Spatial mapping and Scene understanding. These core scenarios include placement, occlusion, physics, and so on. The core difference between the two is a tradeoff: Spatial mapping offers maximal accuracy and minimal latency, while Scene understanding offers structure and simplicity. If your application requires the lowest possible latency and only needs access to mesh triangles, use Spatial mapping directly. If you're doing higher-level processing, consider switching to the Scene understanding model, as it should provide you with a superset of functionality. You'll still have access to the most complete and accurate spatial mapping data possible, because Scene understanding provides a snapshot of the spatial mapping mesh as part of its representation.

The following sections revisit the core spatial mapping scenarios in the context of the new Scene understanding SDK.

Placement

Scene understanding provides new constructs designed to simplify placement scenarios. A scene can compute primitives called SceneQuads, which describe flat surfaces on which holograms can be placed. SceneQuads are designed around placement: each one describes a 2D surface and provides an API for placement on that surface. Previously, when using the triangle mesh for placement, one had to scan all areas of the quad and do hole filling and post-processing to identify good locations for object placement. This isn't always necessary with Quads, because the Scene understanding runtime infers which quad areas weren't scanned and invalidates areas that aren't part of the surface.
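To make the idea of placing an object on a quad with invalidated regions concrete, here is a minimal Python sketch. It is not the Scene understanding SDK: the `Quad` class, its `valid` mask, and `find_placement` are hypothetical stand-ins, and the brute-force search is only one plausible way to pick a spot closest to the quad's center.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Quad:
    """Hypothetical stand-in for a placement quad (not an SDK type)."""
    width: float              # extent along u, in meters
    height: float             # extent along v, in meters
    valid: List[List[bool]]   # valid[row][col]; True where the cell is part of the surface


def find_placement(quad: Quad, obj_w: float, obj_h: float) -> Optional[Tuple[float, float]]:
    """Return the (u, v) center of the valid footprint closest to the quad center, or None."""
    rows, cols = len(quad.valid), len(quad.valid[0])
    cell_w, cell_h = quad.width / cols, quad.height / rows
    # Footprint in whole cells, rounded up so the object never overhangs an invalid cell.
    need_c = max(1, math.ceil(obj_w / cell_w))
    need_r = max(1, math.ceil(obj_h / cell_h))

    best, best_d2 = None, float("inf")
    for r in range(rows - need_r + 1):
        for c in range(cols - need_c + 1):
            window_ok = all(
                quad.valid[rr][cc]
                for rr in range(r, r + need_r)
                for cc in range(c, c + need_c)
            )
            if not window_ok:
                continue
            u = (c + need_c / 2) * cell_w
            v = (r + need_r / 2) * cell_h
            d2 = (u - quad.width / 2) ** 2 + (v - quad.height / 2) ** 2
            if d2 < best_d2:  # strict comparison: the first candidate wins ties
                best, best_d2 = (u, v), d2
    return best
```

With a triangle mesh, the application would have to build something like the `valid` mask itself; with Quads, the runtime supplies the equivalent information.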

Image #1 - SceneQuads with inference disabled, capturing placement areas for scanned regions.

Image #2 - Quads with inference enabled; placement is no longer limited to scanned areas.


If your application intends to place 2D or 3D holograms on rigid structures of your environment, the simplicity and convenience of SceneQuads for placement is preferable to computing this information from the spatial mapping mesh. For more information on this topic, see the Scene understanding SDK reference.

Note: For legacy placement code that depends on the spatial mapping mesh, the spatial mapping mesh can be computed along with SceneQuads by enabling the EnableWorldMesh setting. If the Scene understanding API doesn't satisfy your application's latency requirements, we recommend you continue to use the Spatial mapping API.

Occlusion

Spatial mapping occlusion remains the lowest-latency way to capture the real-time state of the environment. Though this may be useful for occlusion in highly dynamic scenes, you may wish to consider Scene understanding for occlusion for several reasons. If you use the spatial mapping mesh generated by Scene understanding, you can request data from spatial mapping that wouldn't be stored in the local cache and isn't available from the perception APIs. Using Spatial mapping for occlusion alongside watertight meshes provides extra value, specifically completion of unscanned room structure.

If your requirements can tolerate the increased latency of Scene understanding, consider using the Scene understanding watertight mesh and the spatial mapping mesh in unison with planar representations. This provides a "best of both worlds" scenario where simplified watertight occlusion is combined with finer nonplanar geometry, giving the most realistic occlusion maps possible.

Physics

Scene understanding generates watertight meshes that decompose space with semantics, specifically to address many of the limitations that spatial mapping meshes impose on physics. Watertight structures ensure physics ray casts always hit, and semantic decomposition allows for simpler generation of nav meshes for indoor navigation. As described in the section on occlusion, creating a scene with EnableSceneObjectMeshes and EnableWorldMesh will produce the most physically complete mesh possible. The watertight property of the environment mesh prevents hit tests from failing to hit surfaces. The mesh data ensures that physics interacts with all objects in the scene and not just the room structure.

Planar meshes decomposed by semantic class are ideal constructs for navigation and path planning, easing many of the issues described in the Spatial mapping navigation overview. The SceneMesh objects computed in the scene are decomposed by surface type, ensuring that nav-mesh generation is limited to surfaces that can be walked on. Because the floor structures are simple, dynamic nav-mesh generation in 3D engines such as Unity is attainable, depending on real-time requirements.
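As an illustration of restricting nav-mesh input to walkable surfaces, the following Python sketch filters scene objects by surface type before handing their triangles to a nav-mesh builder. `SurfaceKind`, `SceneObject`, and `collect_walkable` are hypothetical names introduced for this example; they are assumptions and do not reproduce the SDK's types.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, List, Set, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[Vertex, Vertex, Vertex]


class SurfaceKind(Enum):
    """Hypothetical surface categories, used only for this sketch."""
    FLOOR = auto()
    WALL = auto()
    CEILING = auto()
    PLATFORM = auto()


@dataclass
class SceneObject:
    """Hypothetical scene object: a semantic label plus its mesh triangles."""
    kind: SurfaceKind
    triangles: List[Triangle]


def collect_walkable(
    objects: Iterable[SceneObject],
    walkable: Set[SurfaceKind] = frozenset({SurfaceKind.FLOOR}),
) -> List[Triangle]:
    """Gather triangles only from objects whose kind is in `walkable`."""
    result: List[Triangle] = []
    for obj in objects:
        if obj.kind in walkable:
            result.extend(obj.triangles)
    return result
```

Because the filtering is a label check rather than a geometric analysis of raw triangles, it is cheap enough to rerun whenever a new scene is computed.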

Generating accurate nav meshes currently still requires post-processing: applications must project occluders onto the floor to ensure that navigation doesn't pass through clutter, tables, and so on. The most accurate way to accomplish this is to project the world mesh data, which is provided if the scene is computed with the EnableWorldMesh flag.
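The projection step can be sketched as follows. This is a simplified illustration, not SDK code: it assumes a y-up coordinate system, a floor at a known height, and a uniform grid over the floor, and it projects only mesh vertices (a fuller version would rasterize whole triangles). The function name and all parameters are hypothetical.

```python
import math
from typing import Iterable, Set, Tuple

Vertex = Tuple[float, float, float]  # (x, y, z), with y pointing up (assumption)


def blocked_cells(
    vertices: Iterable[Vertex],
    floor_y: float,
    agent_height: float,
    cell_size: float,
    clearance: float = 0.05,
) -> Set[Tuple[int, int]]:
    """Project world-mesh vertices onto a floor grid and return the blocked cells.

    A vertex blocks its cell when it lies above the floor by more than
    `clearance` (so floor noise is ignored) and below `agent_height`
    (so ceilings and high shelves are ignored).
    """
    blocked: Set[Tuple[int, int]] = set()
    for x, y, z in vertices:
        if floor_y + clearance < y < floor_y + agent_height:
            blocked.add((math.floor(x / cell_size), math.floor(z / cell_size)))
    return blocked
```

The resulting cell set can then be subtracted from the walkable floor area before path planning.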

Visualization

While spatial mapping visualization can be used for real-time feedback of the environment, there are many scenarios where the simplicity of planar and watertight objects provides better performance or visual quality. Shadow projection and grounding techniques that are described using spatial mapping may be more pleasing if projected on the planar surfaces provided by Quads or the planar watertight mesh. This is especially true for environments and scenarios where thorough pre-scanning isn't practical, because the scene will be inferred, and the completed environment and planar assumptions will minimize artifacts.

Additionally, the total number of surfaces returned by Spatial mapping is limited by the internal spatial cache, while Scene understanding's version of the spatial mapping mesh can access spatial mapping data that isn't cached. Because of this, Scene understanding is better suited to capturing mesh representations of larger spaces (for example, larger than a single room) for visualization or further mesh processing. The world mesh returned with EnableWorldMesh has a consistent level of detail throughout, which may yield a more pleasing visualization if rendered as wireframe.

See also