Performance recommendations for Unity

This article builds on the discussion outlined in performance recommendations for mixed reality but focuses on learnings specific to the Unity engine environment.

It is also highly advisable that developers review the recommended environment settings for Unity article. That article covers some of the most important scene configurations for building performant Mixed Reality apps. Some of these recommended settings are highlighted below as well.

How to profile with Unity

Unity provides the built-in Unity Profiler, which is a great resource for gathering performance insights for your particular app. Although you can run the profiler in the editor, those metrics do not represent the true runtime environment, so the results should be used cautiously. For the most accurate and actionable insights, profile your application remotely while it runs on the device. Unity's Frame Debugger is also a powerful and insightful tool.

Unity provides great documentation for:

  1. How to connect the Unity profiler to UWP applications remotely
  2. How to effectively diagnose performance problems with the Unity Profiler


With the Unity Profiler connected and the GPU profiler added (see Add Profiler in the top right corner), the middle of the profiler shows how much time is being spent on the CPU and GPU respectively. This gives the developer a quick approximation of whether the application is CPU bound or GPU bound.

Unity CPU vs GPU

CPU performance recommendations

The content below covers more in-depth performance practices, especially targeted for Unity & C# development.

Cache references

It is best practice to cache references to all relevant components and GameObjects at initialization. Repeated function calls such as GetComponent<T>() are significantly more expensive than the memory cost of storing a reference. This also applies to the very regularly used Camera.main. Camera.main actually just uses FindGameObjectsWithTag() underneath, which expensively searches your scene graph for a camera object with the "MainCamera" tag.

using UnityEngine;
using System.Collections;

public class ExampleClass : MonoBehaviour
{
    // References cached once at initialization
    private Camera cam;
    private CustomComponent comp;

    void Start()
    {
        cam = Camera.main;
        comp = GetComponent<CustomComponent>();
    }

    void Update()
    {
        // Good
        this.transform.position = cam.transform.position + cam.transform.forward * 10.0f;

        // Bad
        this.transform.position = Camera.main.transform.position + Camera.main.transform.forward * 10.0f;

        // Good (DoSomething() is a hypothetical method on CustomComponent)
        comp.DoSomething();

        // Bad
        GetComponent<CustomComponent>().DoSomething();
    }
}

Avoid GetComponent(string)
When using GetComponent(), there are a handful of different overloads. Always use the Type-based implementations and never the string-based searching overload. Searching by string in your scene is significantly more costly than searching by Type.
(Good) Component GetComponent(Type type)
(Good) T GetComponent<T>()
(Bad) Component GetComponent(string)

Avoid expensive operations

  1. Avoid use of LINQ

    Although LINQ can be very clean and easy to read and write, it generally requires much more computation and particularly more memory allocation than writing the algorithm out manually.

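    // Example code: LINQ calls like these allocate iterator objects behind the scenes
    using System.Collections.Generic;
    using System.Linq;

    List<int> data = new List<int>();
    bool anyAboveTen = data.Any(x => x > 10);
    var result = from x in data
                 where x > 10
                 select x;
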
  2. Common Unity APIs

    Certain Unity APIs, although useful, can be very expensive to execute. Most of these involve searching your entire scene graph for some matching list of GameObjects. These operations can generally be avoided by caching references or implementing a manager component for the GameObjects in question to track the references at runtime.



SendMessage() and BroadcastMessage() should be eliminated at all costs. These functions can be on the order of 1000x slower than direct function calls.
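
The manager idea described above could be sketched roughly as follows. Hologram and HologramRegistry are hypothetical names invented for this illustration, not Unity APIs: each instance registers itself when enabled, so other code can walk a cached list and call methods directly instead of searching the scene or using SendMessage().

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical registry: holds a reference to every enabled Hologram so that
// callers never need to search the scene graph for them.
public static class HologramRegistry
{
    private static readonly List<Hologram> active = new List<Hologram>();

    public static IReadOnlyList<Hologram> Active => active;

    public static void Register(Hologram hologram) => active.Add(hologram);

    public static void Unregister(Hologram hologram) => active.Remove(hologram);
}

// Hypothetical component that keeps the registry up to date.
// In a real project this class would live in its own Hologram.cs file.
public class Hologram : MonoBehaviour
{
    private void OnEnable()
    {
        HologramRegistry.Register(this);
    }

    private void OnDisable()
    {
        HologramRegistry.Unregister(this);
    }

    public void Highlight(bool on)
    {
        // Application-specific behavior would go here.
    }
}
```

A caller can then loop over HologramRegistry.Active with a plain for loop and invoke Highlight(true) on each entry as a direct function call.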

  3. Beware of boxing

    Boxing is a core concept of the C# language and runtime. It is the process of wrapping value-typed variables such as char, int, bool, etc. into reference-typed variables. When a value-typed variable is "boxed", it is wrapped inside of a System.Object which is stored on the managed heap. Thus, memory is allocated and eventually when disposed must be processed by the garbage collector. These allocations and deallocations incur a performance cost and in many scenarios are unnecessary or can be easily replaced by a less expensive alternative.
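
As a rough illustration of two of the items above, the sketch below replaces a LINQ Any() query with a manual loop and contrasts a boxing call with a generic, non-boxing one. HotPathExamples and its method names are hypothetical, chosen only for this example.

```csharp
using System.Collections.Generic;

public static class HotPathExamples
{
    // Manual equivalent of data.Any(x => x > threshold).
    // Indexing the List<int> directly avoids allocating an enumerator or iterator.
    public static bool AnyAbove(List<int> data, int threshold)
    {
        for (int i = 0; i < data.Count; i++)
        {
            if (data[i] > threshold)
            {
                return true;
            }
        }
        return false;
    }

    // Boxing: passing an int to an object-typed parameter wraps it in a heap object.
    public static string DescribeBoxed(object value)
    {
        return value.ToString();
    }

    // No boxing for types such as int that override ToString():
    // the generic parameter keeps the argument as a value type.
    public static string Describe<T>(T value) where T : struct
    {
        return value.ToString();
    }
}
```

Calling HotPathExamples.DescribeBoxed(42) allocates a boxed copy of the integer on the managed heap, whereas HotPathExamples.Describe(42) does not.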

Repeating code paths

Any repeating Unity callback functions (e.g. Update) that are executed many times per second and/or frame should be written very carefully. Any expensive operations here will have a huge and consistent impact on performance.

  1. Empty callback functions

    Although the code below may seem innocent to leave in your application, especially since every Unity script auto-initializes with this code block, these empty callbacks can actually become very expensive. Unity operates back and forth over an unmanaged/managed code boundary, between UnityEngine code and your application code. Context switching over this bridge is fairly expensive even if there is nothing to execute. This becomes especially problematic if your app has hundreds of GameObjects with components that have empty repeating Unity callbacks.

    void Update()
    {
    }


Update() is the most common manifestation of this performance issue, but other repeating Unity callbacks such as the following can be equally bad if not worse: FixedUpdate(), LateUpdate(), OnPostRender(), OnPreRender(), OnRenderImage(), etc.

  2. Operations to favor running once per frame

    The following Unity APIs are common operations for many Holographic Apps. Although not always possible, the results from these functions can very commonly be computed once and re-used across the application for a given frame.

    a) Generally it is good practice to have a dedicated Singleton class or service handle your gaze Raycast into the scene and then re-use this result in all other scene components, instead of having each component make repeated and essentially identical Raycast operations. Of course, some applications may require raycasts from different origins or against different LayerMasks.


    b) Avoid GetComponent() operations in repeated Unity callbacks like Update() by caching references in Start() or Awake().


    c) It is good practice to instantiate all objects, if possible, at initialization and use object pooling to recycle and re-use GameObjects throughout the runtime of your application.

  3. Avoid interfaces and virtual constructs

    Invoking functions through interfaces rather than direct objects, or calling virtual functions, can often be much more expensive than using direct constructs or direct function calls. If the virtual function or interface is unnecessary, it should be removed. However, the performance hit of these approaches is generally worth the trade-off if using them simplifies development collaboration, code readability, and code maintainability.

  4. Avoid passing structs by value

    Unlike classes, structs are value types, and when passed directly to a function, their contents are copied into a newly created instance. This copy adds CPU cost as well as additional memory on the stack. For small structs, the effect is usually minimal and thus acceptable. However, for functions invoked repeatedly every frame, as well as functions taking large structs, modify the function definition to pass by reference if possible.
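
A small sketch of passing a struct by reference, assuming a hypothetical 64-byte BigData struct and a compiler that supports the `in` modifier (C# 7.2 or later; on older compilers `ref` can be used for the read-only case as well):

```csharp
// Hypothetical large struct: eight doubles, 64 bytes in total.
public struct BigData
{
    public double A, B, C, D, E, F, G, H;
}

public static class StructPassing
{
    // By value: all 64 bytes are copied on every call.
    public static double SumByValue(BigData data)
    {
        return data.A + data.B + data.C + data.D + data.E + data.F + data.G + data.H;
    }

    // By read-only reference: only a reference is passed, and the callee cannot modify the struct.
    public static double SumByReference(in BigData data)
    {
        return data.A + data.B + data.C + data.D + data.E + data.F + data.G + data.H;
    }

    // By mutable reference: the caller's instance is updated in place, again without a copy.
    public static void Scale(ref BigData data, double factor)
    {
        data.A *= factor;
        data.B *= factor;
        data.C *= factor;
        data.D *= factor;
        data.E *= factor;
        data.F *= factor;
        data.G *= factor;
        data.H *= factor;
    }
}
```

At the call site this looks like StructPassing.SumByReference(in data) or StructPassing.Scale(ref data, 2.0). Whether the saving is measurable depends on the struct size and call frequency, so it is worth confirming with the profiler.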


  1. Physics

    a) Generally, the easiest way to improve physics performance is to limit the amount of time spent on Physics or the number of iterations per second. Of course, this will reduce simulation accuracy. See TimeManager in Unity

    b) The types of colliders in Unity have widely different performance characteristics. The order below lists colliders from most performant to least performant, left to right. It is most important to avoid Mesh Colliders, which are substantially more expensive than the primitive colliders.

     Sphere < Capsule < Box <<< Mesh (Convex) < Mesh (non-Convex)

    See Unity Physics Best Practices for more info

  2. Animations

    Disable idle animations by disabling the Animator component (disabling the game object won't have the same effect). Avoid design patterns where an animator sits in a loop setting a value to the same thing. There is considerable overhead for this technique, with no effect on the application.

  3. Complex algorithms

    If your application is using complex algorithms such as inverse kinematics, path finding, etc., look for a simpler approach or adjust the relevant settings to improve their performance.
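
For the physics item above, the TimeManager values can also be set from script. The following is a sketch only: PhysicsBudget is a hypothetical component, and the numbers are illustrative assumptions that would need tuning per application.

```csharp
using UnityEngine;

// Hypothetical bootstrap component; attach it to one object in the first scene.
public class PhysicsBudget : MonoBehaviour
{
    // Assumed values for illustration only.
    [SerializeField] private float fixedStepSeconds = 1.0f / 30.0f;
    [SerializeField] private float maximumFrameSeconds = 0.05f;

    private void Awake()
    {
        // Run fewer physics steps per second (Unity's default fixed timestep is 0.02 s).
        Time.fixedDeltaTime = fixedStepSeconds;

        // Cap the frame time Unity will try to simulate, which bounds the
        // number of physics steps that can run within a single frame.
        Time.maximumDeltaTime = maximumFrameSeconds;
    }
}
```

A larger fixed timestep trades simulation accuracy for CPU time, as noted above, so fast-moving objects should be checked for missed collisions after changing it.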

CPU-to-GPU performance recommendations

Generally, CPU-to-GPU performance comes down to the draw calls submitted to the graphics card. To improve performance, draw calls need to be strategically a) reduced or b) restructured for optimal results. Since draw calls themselves are resource-intensive, reducing them will reduce the overall work required. Further, state changes between draw calls require costly validation and translation steps in the graphics driver, so restructuring your application's draw calls to limit state changes (e.g. different materials) can boost performance.

Unity has a great article that gives an overview and dives into batching draw calls for their platform.

Single pass instanced rendering

Single Pass Instanced Rendering in Unity allows the draw calls for each eye to be reduced to one instanced draw call. Due to cache coherency between the two draw calls, there is also some performance improvement on the GPU.

To enable this feature in your Unity Project

  1. Open Player XR Settings (go to Edit > Project Settings > Player > XR Settings)
  2. Select Single Pass Instanced from the Stereo Rendering Method drop-down menu (Virtual Reality Supported checkbox must be checked)

Read the following articles from Unity for details with this rendering approach.


One common issue with Single Pass Instanced Rendering occurs if developers already have existing custom shaders not written for instancing. After enabling this feature, developers may notice some GameObjects only render in one eye. This is because the associated custom shaders do not have the appropriate properties for instancing.

See Single Pass Stereo Rendering for HoloLens from Unity for how to address this problem

Static batching

Unity is able to batch many static objects to reduce draw calls to the GPU. Static Batching works for most Renderer objects in Unity that 1) share the same material and 2) are all marked as Static (Select an object in Unity and click the checkbox in the top right of the inspector). GameObjects marked as Static cannot be moved throughout your application's runtime. Thus, static batching can be difficult to leverage on HoloLens where virtually every object needs to be placed, moved, scaled, etc. For immersive headsets, static batching can dramatically reduce draw calls and thus improve performance.

Read Static Batching under Draw Call Batching in Unity for more details.

Dynamic batching

Since it is problematic to mark objects as Static for HoloLens development, dynamic batching can be a great tool to compensate for this lacking feature. Of course, it can also be useful on immersive headsets. Dynamic batching in Unity can be difficult to enable, though, because GameObjects must a) share the same Material and b) meet a long list of other criteria.

Read Dynamic Batching under Draw Call Batching in Unity for the full list. Most commonly, GameObjects are ineligible for dynamic batching because their mesh data exceeds the limit of 300 vertices.

Other techniques

Batching can only occur if multiple GameObjects are able to share the same material. Typically this will be blocked by the need for GameObjects to have a unique texture for their respective Material. It is common to combine Textures into one big Texture, a method known as Texture Atlasing.

Further, it is generally preferable to combine meshes into one GameObject where possible and reasonable. Each Renderer in Unity will have its own associated draw call(s), whereas a combined mesh is submitted under a single Renderer.


Modifying properties of Renderer.material at runtime will create a copy of the Material and thus potentially break batching. Use Renderer.sharedMaterial to modify shared material properties across GameObjects.
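
The difference between the two properties can be sketched as below. TintExample and its method names are hypothetical; Renderer.material and Renderer.sharedMaterial are the Unity properties named above.

```csharp
using UnityEngine;

public class TintExample : MonoBehaviour
{
    private Renderer cachedRenderer;

    private void Start()
    {
        cachedRenderer = GetComponent<Renderer>();
    }

    // Accessing .material creates a per-object copy of the Material on first use,
    // so this object may no longer batch with the others.
    public void TintThisObjectOnly(Color color)
    {
        cachedRenderer.material.color = color;
    }

    // Edits the Material shared by every Renderer that uses it; no copy is made.
    public void TintAllObjectsSharingMaterial(Color color)
    {
        cachedRenderer.sharedMaterial.color = color;
    }
}
```

Note that the second method changes the appearance of every object using that Material, and in the editor such changes may persist in the material asset, so it suits properties that are meant to be shared.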

GPU performance recommendations

Learn more about optimizing graphics rendering in Unity

Reduce poly count

Polygon count is usually reduced by either

  1. Removing objects from a scene
  2. Asset decimation which reduces the number of polygons for a given mesh
  3. Implementing a Level of Detail (LOD) System into your application which renders far away objects with lower-polygon version of the same geometry

Limit overdraw

In Unity, one can display overdraw for their scene, by toggling the draw mode menu in the top left corner of the Scene view and selecting Overdraw.

Generally, overdraw can be mitigated by culling objects ahead of time before they are sent to the GPU. Unity provides details on implementing Occlusion Culling for their engine.

Understanding shaders in Unity

An easy approximation to compare shaders in performance is to identify the average number of operations each executes at runtime. This can be done fairly easily in Unity.

  1. Select your shader asset or select a material, then in top right corner of the inspector window, select the gear icon and then "Select Shader"

    Select shader in Unity

  2. With the shader asset selected, click the "Compile and show code" button under the inspector window

    Compile Shader Code in Unity

  3. After compiling, look for the statistics section in the results with the number of different operations for both the vertex and pixel shader (Note: pixel shaders are often also called fragment shaders)

    Unity Standard Shader Operations

Unity Standard shader alternatives

Instead of using a physically based rendering (PBR) or other high-quality shader, look at utilizing a more performant and cheaper shader. Mixed Reality Toolkit provides a standard shader that has been optimized for mixed reality projects.

Unity also provides an unlit, vertex lit, diffuse, and other simplified shader options that are significantly faster compared to the Unity Standard shader. See Usage and Performance of Built-in Shaders for more detailed information.

Memory recommendations

Excessive memory allocation & deallocation operations can have adverse effects on your holographic application resulting in inconsistent performance, frozen frames, and other detrimental behavior. It is especially important to understand memory considerations when developing in Unity since memory management is controlled by the garbage collector.

Garbage collection

Holographic apps will lose processing time to the garbage collector (GC) when the GC is activated to analyze objects that are no longer in scope during execution and whose memory needs to be released so it can be made available for re-use. Constant allocations and de-allocations will generally require the garbage collector to run more frequently, thus hurting performance and user experience.

Unity has provided an excellent page that explains in detail how the garbage collector works and tips to write more efficient code in regards to memory management.

One of the most common practices that leads to excessive garbage collection is not caching references to components and classes in Unity development. Any references should be captured during Start() or Awake() and re-used in later functions such as Update() or LateUpdate().

Other quick tips:

  • Use the StringBuilder C# class to dynamically build complex strings at runtime
  • Remove calls to Debug.Log() when no longer needed as they still execute in all build versions of an app
  • If your holographic app generally requires lots of memory, consider calling System.GC.Collect() during loading phases such as when presenting a loading or transition screen
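
As a sketch of the StringBuilder tip, the hypothetical StatusText helper below re-uses one builder instead of concatenating strings on every call. The label text and parameters are made up for the example.

```csharp
using System.Text;

public static class StatusText
{
    // One builder re-used across calls, so its internal buffer is not reallocated each time.
    // Not thread-safe; intended for use from a single thread such as Unity's main thread.
    private static readonly StringBuilder builder = new StringBuilder(64);

    public static string Format(int fps, int drawCalls)
    {
        builder.Clear();
        builder.Append("FPS: ").Append(fps)
               .Append(" | Draw calls: ").Append(drawCalls);

        // ToString() still allocates the final string, but the intermediate
        // strings that repeated concatenation would create are avoided.
        return builder.ToString();
    }
}
```

Even with this approach, it helps to rebuild the string only when the underlying values change rather than every frame.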

Object pooling

Object pooling is a popular technique to reduce the cost of continuous allocations & deallocations of objects. This is done by allocating a large pool of identical objects and re-using inactive, available instances from this pool instead of constantly spawning and destroying objects over time. Object pools are great for reusable components that have a variable lifetime during an app.
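
A minimal pool for GameObjects might look like the sketch below. GameObjectPool, its fields, and the default size of 16 are assumptions made for illustration; a production pool would likely also reset per-instance state when an object is released.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical pool component; the prefab and initial size are set in the inspector.
public class GameObjectPool : MonoBehaviour
{
    [SerializeField] private GameObject prefab;
    [SerializeField] private int initialSize = 16;

    private readonly Stack<GameObject> inactive = new Stack<GameObject>();

    private void Awake()
    {
        // Pay the instantiation cost once, up front.
        for (int i = 0; i < initialSize; i++)
        {
            inactive.Push(CreateInstance());
        }
    }

    private GameObject CreateInstance()
    {
        GameObject instance = Instantiate(prefab, transform);
        instance.SetActive(false);
        return instance;
    }

    // Hands out an inactive instance, growing the pool only when it is empty.
    public GameObject Get(Vector3 position, Quaternion rotation)
    {
        GameObject instance = inactive.Count > 0 ? inactive.Pop() : CreateInstance();
        instance.transform.SetPositionAndRotation(position, rotation);
        instance.SetActive(true);
        return instance;
    }

    // Call this instead of Destroy() when the object is no longer needed.
    public void Release(GameObject instance)
    {
        instance.SetActive(false);
        inactive.Push(instance);
    }
}
```

Callers use Get() where they would otherwise call Instantiate() and Release() where they would otherwise call Destroy().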

Startup performance

You should consider starting your app with a smaller scene, then using SceneManager.LoadSceneAsync to load the rest of the scene. This allows your app to get to an interactive state as fast as possible. Be aware that there may be a large CPU spike while the new scene is being activated and that any rendered content might stutter or hitch. One way to work around this is to set the AsyncOperation.allowSceneActivation property to false on the scene being loaded, wait for the scene to load, clear the screen to black, and then set the property back to true to complete the scene activation.
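
The workaround described above could be sketched as a coroutine. SceneLoader and the scene name "MainScene" are hypothetical, and the step that clears the screen to black is left as a placeholder because it depends on the application.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

public class SceneLoader : MonoBehaviour
{
    // Hypothetical scene name; it must be listed in the project's build settings.
    [SerializeField] private string sceneName = "MainScene";

    private IEnumerator Start()
    {
        AsyncOperation load = SceneManager.LoadSceneAsync(sceneName);
        load.allowSceneActivation = false;

        // While activation is held back, progress stops at 0.9 and isDone stays false.
        while (load.progress < 0.9f)
        {
            yield return null;
        }

        // Placeholder: clear the screen to black here (for example, fade out or
        // hide rendered content) so the activation hitch is not visible.

        load.allowSceneActivation = true;

        // Wait for activation to finish.
        while (!load.isDone)
        {
            yield return null;
        }
    }
}
```

Attaching this component to an object in the small startup scene begins the load as soon as that scene starts.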

Remember that while the startup scene is loading the holographic splash screen will be displayed to the user.

See also