Understanding performance for mixed reality

This article introduces why performance matters for your mixed reality app. User experience degrades sharply when an application doesn't run at its target frame rate: holograms appear unstable and head tracking of the environment becomes inaccurate. Treat performance as a first-class feature of mixed reality development, not as a polish task.

The target frame rate for each platform is listed below.

Platform                           Target frame rate
HoloLens                           60 FPS
Windows Mixed Reality Ultra PCs    90 FPS
Windows Mixed Reality PCs          60 FPS
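
To make these targets concrete, it can help to convert each frame rate into the time available per frame. The following is a minimal sketch in plain C#; the class and method names are illustrative and not part of any Unity or Windows Mixed Reality API.

```csharp
using System;

public static class FrameBudget
{
    // Time available to produce one frame, in milliseconds, at a given target frame rate.
    public static double Milliseconds(double targetFps)
    {
        if (targetFps <= 0)
            throw new ArgumentOutOfRangeException(nameof(targetFps), "Frame rate must be positive.");
        return 1000.0 / targetFps;
    }
}
```

FrameBudget.Milliseconds(60) is roughly 16.67 ms and FrameBudget.Milliseconds(90) is roughly 11.11 ms. All CPU and GPU work for a frame has to fit inside that window.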

The framework below outlines best practices for hitting target frame rates. We recommend reading the performance recommendations for Unity article for tips on measuring and improving frame rate in the Unity environment.

Understanding performance bottlenecks

If your app has an underperforming frame rate, the first step is to analyze and understand where your application is computationally intensive. Two primary processors are responsible for the work to render your scene: the CPU and the GPU, each handling different aspects of your mixed reality app. The three key places where bottlenecks may occur are:

  1. App Thread - CPU - Responsible for your app logic, including processing input, animations, physics, and other app logic.
  2. Render Thread - CPU to GPU - Responsible for submitting your draw calls to the GPU. When your app wants to render an object such as a cube or model, this thread sends a request to the GPU to do the operations.
  3. GPU - Most commonly handles the graphics pipeline of your application to transform 3D data (models, textures, and so on) into pixels. It ultimately produces a 2D image to submit to your device's screen.

Figure: Lifetime of a frame

Generally, HoloLens applications are GPU bound, but not always. Use the tools and techniques below to understand where your particular app is bottlenecked.

How to analyze your application

Many tools can help you understand the performance profile and potential bottlenecks of your mixed reality application.

Below are some common tools to help you gather deep profiling information for your application:

How to profile in any environment

One way to determine whether your app is GPU bound or CPU bound is to lower the resolution of the render target output. Reducing the number of pixels to calculate reduces the GPU load. The device renders to a smaller texture, then up-samples it to display the final image.

After lowering the rendering resolution, if:

  1. Application frame rate increases, then you're likely GPU bound
  2. Application frame rate is unchanged, then you're likely CPU bound
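
The decision rule above can be written down as a small helper. This is a sketch under stated assumptions: the two frame rates are averages you measured yourself before and after lowering the resolution, and `relativeTolerance` is a hypothetical noise threshold (5% by default) below which a change counts as "unchanged". The type and method names are illustrative.

```csharp
using System;

public enum Bottleneck { LikelyGpuBound, LikelyCpuBound }

public static class BottleneckCheck
{
    // Compares average frame rates measured before and after lowering the render resolution.
    // relativeTolerance is a hypothetical noise threshold: gains at or below it count as "unchanged".
    public static Bottleneck Classify(double fpsBefore, double fpsAfter, double relativeTolerance = 0.05)
    {
        if (fpsBefore <= 0)
            throw new ArgumentOutOfRangeException(nameof(fpsBefore), "Frame rate must be positive.");

        double relativeGain = (fpsAfter - fpsBefore) / fpsBefore;
        return relativeGain > relativeTolerance
            ? Bottleneck.LikelyGpuBound
            : Bottleneck.LikelyCpuBound;
    }
}
```

For example, going from 45 FPS to 58 FPS is a gain of about 29% and is classified as likely GPU bound, while going from 45 FPS to 46 FPS stays within the tolerance and is classified as likely CPU bound.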

Note

Unity lets you modify the render target resolution of your application at runtime through the XRSettings.renderViewportScale property. The final image presented on the device has a fixed resolution. The platform samples the lower-resolution output to build a higher-resolution image for rendering on displays.

UnityEngine.XR.XRSettings.renderViewportScale = 0.7f;

How to improve your application

CPU performance recommendations

Generally, most CPU work in a mixed reality application involves running the "simulation" of the scene and processing your application logic. The following areas are the usual targets for optimization:

  • Animations
  • Physics
  • Memory allocations
  • Complex algorithms (for example, inverse kinematics and path-finding)

GPU performance recommendations

Understanding bandwidth vs. fill rate

When rendering a frame on the GPU, an application is bound by either memory bandwidth or fill rate.

  • Memory bandwidth is the rate of reads and writes the GPU can do from memory.
    • To identify bandwidth limitations, reduce texture quality and check whether the frame rate improves.
    • In Unity, change Texture Quality in Edit > Project Settings > Quality Settings.
  • Fill rate refers to the pixels that can be drawn per second by the GPU.
    • To identify fill rate limitations, lower the display resolution and check whether the frame rate improves.
    • In Unity, use the XRSettings.renderViewportScale property.
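
To get a feel for how much fill rate work a lower resolution removes, the sketch below estimates the pixel count after scaling. It assumes the scale factor is applied uniformly to both width and height, so the pixel count falls with the square of the scale; the class name and the resolution used in the example are illustrative only.

```csharp
using System;

public static class ViewportPixels
{
    // Number of pixels rendered when a uniform scale is applied to both width and height.
    // Assumption: the scale shrinks each axis, so the pixel count falls with the square of the scale.
    public static long Scaled(int width, int height, double scale)
    {
        if (scale <= 0 || scale > 1)
            throw new ArgumentOutOfRangeException(nameof(scale), "Scale must be in (0, 1].");

        long scaledWidth = (long)Math.Round(width * scale);
        long scaledHeight = (long)Math.Round(height * scale);
        return scaledWidth * scaledHeight;
    }
}
```

For a 1280 x 720 target, a scale of 0.7 gives 896 x 504 = 451,584 pixels, which is 49% of the original 921,600. Under this assumption, a modest reduction in scale removes about half of the per-pixel work.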

Memory bandwidth optimizations generally involve either:

  1. Lowering texture resolutions
  2. Using fewer textures (normals, specular, and so on)

Fill rate optimizations focus on reducing the number of operations that need to be computed for a final rendered pixel, including:

  1. Number of objects to render/process
  2. Number of operations per shader
  3. Number of GPU stages to final result (geometry shaders, post-processing effects, and so on)
  4. Number of pixels to render (display resolution)

Reduce polygon count

Higher polygon counts result in more operations for the GPU, so reducing the number of polygons in your scene reduces the render time. Other factors also make shading the geometry expensive, but polygon count is the simplest metric for estimating how much work it will take to render a scene.

Limit overdraw

High overdraw occurs when multiple objects are rendered but not shown on screen because they're hidden by an occluding object. Imagine looking at a wall that has objects behind it. All of the geometry would be processed for rendering, but only the opaque wall needs to be rendered, so the work spent on the hidden objects is unnecessary.

Shaders

Shaders are small programs that run on the GPU and do two important steps in rendering:

  1. Determining which vertices should be drawn and where they are in screen space (the Vertex shader)
    • The Vertex shader is executed per vertex for every mesh.
  2. Determining the color of each pixel (the Pixel shader)
    • The Pixel shader is executed for each pixel that the geometry covers in the target render texture.

Typically, shaders do many transformations and lighting calculations. Although complex lighting models, shadows, and other operations can generate fantastic results, they also come with a price. Reducing the number of operations computed in shaders can greatly reduce the work the GPU needs to do per frame.

Shader coding recommendations
  • Use bilinear filtering whenever possible
  • Rearrange expressions to use MAD (multiply-add) intrinsics, which do a multiply and an add at the same time
  • Precalculate as much as possible on the CPU and pass the results as constants to the material
  • Favor moving operations from the pixel shader to the vertex shader
    • Generally, the number of vertices is much smaller than the number of pixels (720p is 921,600 pixels, 1080p is 2,073,600 pixels, and so on)
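
The pixel counts in the last bullet come straight from width times height, and comparing them with a vertex count shows why moving work to the vertex shader can pay off. The sketch below is a rough estimate only: it counts one pixel shader invocation per pixel and ignores overdraw, and the vertex count is a hypothetical figure supplied by the caller. The class and method names are illustrative.

```csharp
public static class ShaderInvocations
{
    // Lower bound on pixel shader invocations per frame: one per pixel, ignoring overdraw.
    public static long PixelsPerFrame(int width, int height) => (long)width * height;

    // How many times more often a per-pixel operation runs than a per-vertex one.
    // vertexCount is a hypothetical scene total chosen by the caller.
    public static double PixelToVertexRatio(int width, int height, long vertexCount)
        => (double)PixelsPerFrame(width, height) / vertexCount;
}
```

PixelsPerFrame(1280, 720) is 921,600 and PixelsPerFrame(1920, 1080) is 2,073,600. With a hypothetical scene of 100,000 vertices at 720p, an operation in the pixel shader runs about 9.2 times as often as the same operation in the vertex shader.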

Remove GPU stages

Post-processing effects can be expensive and increase the fill rate cost of your application. This includes anti-aliasing techniques like MSAA. On HoloLens, we recommend avoiding these techniques as well as additional shader stages such as geometry, hull, and compute shaders.

Memory recommendations

Excessive memory allocation and deallocation operations can result in inconsistent performance, frozen frames, and other detrimental behavior. It's especially important to understand memory considerations when developing in Unity, because memory management is controlled by the garbage collector.

Object pooling

Object pooling is a popular technique for reducing the cost of continuous allocations and deallocations of objects. It works by allocating a large pool of identical objects up front and reusing inactive, available instances from that pool instead of constantly spawning and destroying objects over time. Object pools are a good fit for reusable components that have variable lifetimes during an app.
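
The pattern described above can be sketched as a small generic class. This is a minimal illustration in plain C#, not a Unity API: `SimplePool<T>`, `Get`, `Release`, and `InactiveCount` are names chosen for this example, and the pool is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Minimal generic pool: instances are created up front and reused instead of being
// allocated and released repeatedly. Names here are illustrative, not a Unity API.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> _inactive = new Stack<T>();
    private readonly Func<T> _factory;

    public SimplePool(Func<T> factory, int initialSize)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
        for (int i = 0; i < initialSize; i++)
            _inactive.Push(_factory());
    }

    // Number of instances currently available for reuse.
    public int InactiveCount => _inactive.Count;

    // Hands out an inactive instance, creating a new one only if the pool is empty.
    public T Get() => _inactive.Count > 0 ? _inactive.Pop() : _factory();

    // Returns an instance to the pool so a later Get() can reuse it.
    public void Release(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        _inactive.Push(item);
    }
}
```

A caller creates the pool once, for example `new SimplePool<Projectile>(() => new Projectile(), 32)` with a hypothetical `Projectile` class, then calls Get() when an object is needed and Release() when it is finished. In steady state no new allocations occur, which keeps garbage collector pressure low.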

See also