Work with shaders and shader resources

01/05/2016

[ This article is for Windows 8.x and Windows Phone 8.x developers writing Windows Runtime apps. If you’re developing for Windows 10, see the latest documentation ]

It's time to learn how to work with shaders and shader resources in developing your Microsoft DirectX game for Windows 8. We've seen how the DirectX app template sets up the graphics device and resources, and perhaps you've even started modifying its pipeline. So now let's look at the pixel and vertex shaders that the template provides.

If you aren't familiar with shader languages, a quick discussion is in order. Shaders are small, low-level programs that are compiled and run at specific stages in the graphics pipeline. Their specialty is very fast floating-point mathematical operations. The most common shader programs are:

Vertex shader—Executed for each vertex in a scene. This shader operates on vertex buffer elements provided to it by the calling app, and minimally results in a 4-component position vector that will be rasterized into a pixel position.
Pixel shader—Executed for each pixel in a render target. This shader receives rasterized coordinates from previous shader stages (in the simplest pipelines, this would be the vertex shader) and returns a color (or other 4-component value) for that pixel position, which is then written into a render target.

The DirectX app template provides very basic examples of these two types of shaders as SimpleVertexShader.hlsl and SimplePixelShader.hlsl.

Shader programs are written in Microsoft High Level Shader Language (HLSL). HLSL syntax looks a lot like C, but without the pointers. Shader programs must be very compact and efficient. If your shader compiles to too many instructions, it cannot be run and an error is returned.

In the DirectX app template, shaders are not compiled at run time. When you build your app with Microsoft Visual Studio 2013, the HLSL files are compiled to compiled shader object (.cso) files along with the rest of the program, and your app must load those files. Make sure you include these CSO files with your app when you package it; they are assets just like meshes and textures.
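As a rough illustration of what "loading" a CSO file means, here is a minimal sketch in plain standard C++ that reads the whole file into a byte array. ReadShaderBytes is a hypothetical helper, not part of the template, which has its own asynchronous file-loading code; the point is only that the shader arrives as a block of bytes (like the fileData array passed to CreateInputLayout later in this article).

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of the template): reads an entire compiled
// shader object (.cso) file into memory. Returns an empty vector on failure.
std::vector<char> ReadShaderBytes(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
    {
        return {};
    }
    // Copy every byte of the file, unmodified, into the vector.
    return std::vector<char>(std::istreambuf_iterator<char>(file),
                             std::istreambuf_iterator<char>());
}
```

You would call it with the packaged file name, for example ReadShaderBytes("SimpleVertexShader.cso"), and treat an empty result as a missing or unreadable asset.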

Understand HLSL semantics

It's important to take a moment to discuss HLSL semantics before we continue, because they are often a point of confusion for new Direct3D developers. HLSL semantics are strings that identify a value passed between the app and a shader program. Although they can be any of a variety of possible strings, the best practice is to use a string like POSITION or COLOR that indicates the usage. You assign these semantics when you are constructing a constant buffer or input layout. You can also append a number between 0 and 7 to the semantic so that you use separate registers for similar values. For example: COLOR0, COLOR1, COLOR2...

Semantics that are prefixed with "SV_" are system value semantics that are written to by your shader program; your game itself (running on the CPU) cannot modify them. Typically, these semantics contain values that are inputs or outputs from another shader stage in the graphics pipeline, or that are generated entirely by the GPU.

Additionally, SV_ semantics have different behaviors when they are used to specify input to or output from a shader stage. For example, SV_POSITION (output) contains the vertex data transformed during the vertex shader stage, and SV_POSITION (input) contains the pixel position values interpolated during rasterization.

Here are a few common HLSL semantics:

POSITION(n) for vertex buffer data. SV_POSITION provides a pixel position to the pixel shader and cannot be written by your game.
NORMAL(n) for normal data provided by the vertex buffer.
TEXCOORD(n) for texture UV coordinate data supplied to a shader.
COLOR(n) for RGBA color data supplied to a shader. Note that it is treated identically to coordinate data; the semantic simply helps you identify that it is color data.
SV_Target[n] for writing from a pixel shader to a target texture or other pixel buffer.

We'll see some examples of HLSL semantics as we review the code from SimpleVertexShader.hlsl and SimplePixelShader.hlsl.

Read from the constant buffers

Any shader can read from a constant buffer if that buffer is attached to its stage as a resource. In the DirectX app template, only the vertex shader (SimpleVertexShader.hlsl) has a constant buffer available to it.

The constant buffer is declared in two places: in the C++ code (ShaderStructures.h), and in the corresponding HLSL file that accesses it (again, SimpleVertexShader.hlsl).

Here's how the constant buffer is declared in the C++ code.

    // Constant buffer used to send model-view-projection (MVP) matrices to the vertex shader.
    struct ModelViewProjectionConstantBuffer
    {
        DirectX::XMFLOAT4X4 model;
        DirectX::XMFLOAT4X4 view;
        DirectX::XMFLOAT4X4 projection;
    };

When declaring the structure for the constant buffer in your C++ code, ensure that all of the data is correctly aligned along 16-byte boundaries. The easiest way to do this is to use DirectXMath types, like XMFLOAT4 or XMFLOAT4X4, as seen in the example code. For more information about constant buffer alignment and packing, see Packing Rules for Constant Variables.
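One way to catch a size mistake early is a compile-time check. The sketch below uses small stand-in types (Float4x4 and Float3 are assumptions that mimic the sizes of XMFLOAT4X4 and XMFLOAT3 so the code builds without the DirectXMath headers) and a hypothetical IsMultipleOf16Bytes helper. It only verifies the total size; it does not check every HLSL packing rule.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for DirectX::XMFLOAT4X4 (64 bytes) and DirectX::XMFLOAT3
// (12 bytes). Assumptions for illustration; use the real types in your app.
struct Float4x4 { float m[4][4]; };
struct Float3 { float x, y, z; };

struct ModelViewProjectionConstantBuffer
{
    Float4x4 model;
    Float4x4 view;
    Float4x4 projection;
};

// Hypothetical buffer that breaks the rule: only 12 bytes.
struct UnpaddedLightBuffer { Float3 lightPos; };

// The same data with one explicit padding float: 16 bytes.
struct PaddedLightBuffer { Float3 lightPos; float padding; };

// Hypothetical helper: true when the structure size is a multiple of 16.
template <typename T>
constexpr bool IsMultipleOf16Bytes() { return sizeof(T) % 16 == 0; }

static_assert(IsMultipleOf16Bytes<ModelViewProjectionConstantBuffer>(),
              "constant buffer size must be a multiple of 16 bytes");
static_assert(IsMultipleOf16Bytes<PaddedLightBuffer>(),
              "constant buffer size must be a multiple of 16 bytes");
static_assert(!IsMultipleOf16Bytes<UnpaddedLightBuffer>(),
              "the unpadded layout is expected to fail the check");
```

Placing a static_assert like this next to each constant buffer structure turns a hard-to-diagnose run-time failure into a build error.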

Now, here's how the constant buffer is declared in the vertex shader HLSL.

// A constant buffer that stores the three basic column-major matrices for composing geometry.
cbuffer ModelViewProjectionConstantBuffer : register(b0)
{
    matrix model;
    matrix view;
    matrix projection;
};

All buffers—constant, texture, sampler, or other—must have a register defined so the GPU can access them. Each shader stage allows up to 15 constant buffers, and each buffer can hold up to 4,096 constant variables. The register-usage declaration syntax is as follows:

b#: A register for a constant buffer (cbuffer).
t#: A register for a texture buffer (tbuffer).
s#: A register for a sampler. (A sampler defines the lookup behavior for texels in the texture resource.)

For example, the HLSL for a pixel shader might take a texture and a sampler as input with a declaration like this.

Texture2D simpleTexture : register(t0);
SamplerState simpleSampler : register(s0);
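To put the 4,096-constant limit mentioned above into bytes, here is a small arithmetic sketch. It assumes each constant is one 4-component vector of 32-bit values (16 bytes); the names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Limit quoted in the text: 4,096 constants per constant buffer.
constexpr std::size_t kMaxConstantsPerBuffer = 4096;

// Assumption: one constant is a 4-component vector of 32-bit values.
constexpr std::size_t kBytesPerConstant = 4 * sizeof(float);

// Largest constant buffer, in bytes, under that assumption.
constexpr std::size_t kMaxConstantBufferBytes =
    kMaxConstantsPerBuffer * kBytesPerConstant;

// A 4x4 float matrix occupies four constants.
constexpr std::size_t kMaxMatricesPerBuffer = kMaxConstantsPerBuffer / 4;

// Hypothetical helper: does a structure of this size fit in one buffer?
constexpr bool FitsInOneConstantBuffer(std::size_t byteWidth)
{
    return byteWidth <= kMaxConstantBufferBytes;
}
```

Under these assumptions a single constant buffer tops out at 64 KB, or 1,024 matrices, which is far more than the three matrices the template uses.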

Read from the vertex buffers

The vertex buffer supplies the triangle data for the scene objects to the vertex shader(s). As with the constant buffer, the vertex buffer is first declared in the C++ code of the DirectX app template, using similar packing rules.

    // Used to send per-vertex data to the vertex shader.
    struct VertexPositionColor
    {
        DirectX::XMFLOAT3 pos;
        DirectX::XMFLOAT3 color;
    };

However, there is no standard for the format of vertex data. The template provides a simple input layout that describes the format of a vertex to the vertex shader. If you add data to the vertex format when modifying the template code, be sure to update the input layout as well, or the shader will not be able to interpret it. For example, the original layout from the template is implemented like this.

static const D3D11_INPUT_ELEMENT_DESC vertexDesc [] =
        {
            { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
            { "COLOR", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12, D3D11_INPUT_PER_VERTEX_DATA, 0 }
        };

DX::ThrowIfFailed(
            m_deviceResources->GetD3DDevice()->CreateInputLayout(
                vertexDesc,
                ARRAYSIZE(vertexDesc),
                &fileData[0],
                fileData.size(),
                &m_inputLayout
                )
);

You might modify it to a vertex that's defined like this.

    // Used to send per-vertex data to the vertex shader.
    struct VertexPositionColor
    {
        DirectX::XMFLOAT3 pos;
        DirectX::XMFLOAT3 normal;
        DirectX::XMFLOAT3 tangent;
    };

In that case, you'd modify the input-layout definition like this.

static const D3D11_INPUT_ELEMENT_DESC vertexDesc [] =
        {
            { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
            { "NORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12, D3D11_INPUT_PER_VERTEX_DATA, 0 },
            { "TANGENT", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 24, D3D11_INPUT_PER_VERTEX_DATA, 0 }
        };

DX::ThrowIfFailed(
            m_deviceResources->GetD3DDevice()->CreateInputLayout(
                vertexDesc,
                ARRAYSIZE(vertexDesc),
                &fileData[0],
                fileData.size(),
                &m_inputLayout
                )
);
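The fifth value in each input element is the byte offset of that element from the start of the vertex, so it has to grow as you add fields. Rather than counting bytes by hand, you can let the compiler compute the offsets. The sketch below uses stand-in types (Float3 mimics XMFLOAT3, ElementOffset is a simplified, hypothetical stand-in for the offset part of D3D11_INPUT_ELEMENT_DESC, and the vertex structure name is made up for this example).

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for DirectX::XMFLOAT3 (three floats, 12 bytes). Assumption.
struct Float3 { float x, y, z; };

// Hypothetical name for the three-element vertex described above.
struct VertexPositionNormalTangent
{
    Float3 pos;
    Float3 normal;
    Float3 tangent;
};

// Simplified, hypothetical stand-in: just the semantic name and byte offset.
struct ElementOffset
{
    const char* semanticName;
    std::size_t alignedByteOffset;
};

// offsetof yields each member's distance, in bytes, from the vertex start.
const ElementOffset kElementOffsets[] =
{
    { "POSITION", offsetof(VertexPositionNormalTangent, pos) },
    { "NORMAL",   offsetof(VertexPositionNormalTangent, normal) },
    { "TANGENT",  offsetof(VertexPositionNormalTangent, tangent) },
};
```

For three tightly packed 12-byte elements this gives offsets of 0, 12, and 24, with a total vertex stride of 36 bytes.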

Each of the input-layout element definitions is prefixed with a string, like "POSITION" or "NORMAL"—that is, a semantic. It's like a handle that helps the GPU identify that element when processing the vertex. Choose common, meaningful names for your vertex elements.

Just as with the constant buffer, SimpleVertexShader.hlsl has a corresponding structure defined for the incoming vertex element. Note how the semantics match between the input layout definition and this HLSL structure declaration. However, COLOR has a "0" appended to it. It isn't necessary to add the 0 if you have only one COLOR element declared in the layout, but it's good practice to append it anyway, so that any color elements you add in the future can follow as COLOR1, COLOR2, and so on.

struct VertexShaderInput
{
    float3 pos : POSITION;
    float3 color : COLOR0;
};
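Notice that the input layout stores the semantic as two separate values (the name "COLOR" and the index 0), while HLSL writes them together as COLOR0. The sketch below shows that relationship with a hypothetical SplitSemantic helper; it is for illustration only and is not something Direct3D asks you to call.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: splits an HLSL semantic such as "COLOR1" into the
// name ("COLOR") and index (1) used by an input-layout element.
// A semantic with no trailing digits is treated as index 0.
std::pair<std::string, unsigned> SplitSemantic(const std::string& semantic)
{
    // Walk backward over any trailing digits.
    std::size_t end = semantic.size();
    while (end > 0 &&
           std::isdigit(static_cast<unsigned char>(semantic[end - 1])))
    {
        --end;
    }

    // Convert the trailing digits (if any) to a number.
    unsigned index = 0;
    for (std::size_t i = end; i < semantic.size(); ++i)
    {
        index = index * 10 + static_cast<unsigned>(semantic[i] - '0');
    }
    return { semantic.substr(0, end), index };
}
```

So COLOR and COLOR0 both map to the layout entry { "COLOR", 0, ... }, and a second color element would be declared as { "COLOR", 1, ... } in C++ and COLOR1 in HLSL.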

Pass data between shaders

Shaders take input types and return output types from their main functions upon execution. For the vertex shader in the template, the input type is the VertexShaderInput structure (defined in the previous section). Vertex shader input should always match the definition of the vertex buffer input layout.

The vertex shader returns a PixelShaderInput structure, which must minimally contain the 4-component (float4) final vertex position. This position value must have the system value semantic, SV_POSITION, declared for it, so the GPU knows to rasterize that position to a pixel value. In the template, a color value is also returned and is interpolated between adjacent vertex color values during rasterization.

// Per-pixel color data passed through to the pixel shader as output from the vertex shader.
struct PixelShaderInput
{
    float4 pos : SV_POSITION;
    float3 color : COLOR0;
};
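To get a feel for what "interpolated between adjacent vertex color values" means, here is a CPU-side sketch that blends three vertex colors with barycentric weights. It is a simplification: the names are hypothetical, and the GPU's own interpolation also accounts for perspective, which this sketch ignores.

```cpp
#include <cassert>

// Stand-in for the float3 color in PixelShaderInput. Assumption.
struct Color3 { float r, g, b; };

// Blends the colors of a triangle's three vertices. The weights w0, w1, w2
// describe how close the pixel is to each vertex and should add up to 1.
Color3 InterpolateColor(const Color3& c0, const Color3& c1, const Color3& c2,
                        float w0, float w1, float w2)
{
    return Color3{
        w0 * c0.r + w1 * c1.r + w2 * c2.r,
        w0 * c0.g + w1 * c1.g + w2 * c2.g,
        w0 * c0.b + w1 * c1.b + w2 * c2.b
    };
}
```

A pixel exactly on a vertex (weights 1, 0, 0) gets that vertex's color, and a pixel halfway along the edge between a red and a green vertex gets an even mix of the two.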

Review the vertex shader

The template's vertex shader is very simple: take in a vertex (position and color), transform the position from model coordinates into perspective projected coordinates, and return it (along with the color) to the rasterizer. The color value is used to determine the interpolated pixel color value provided to the pixel shader. Interpolation is the default behavior when rasterizing output values, and is essential in particular for the correct processing of output vector data (light vectors, per-vertex normals and tangents, and others).

// Simple shader to do vertex processing on the GPU.
PixelShaderInput main(VertexShaderInput input)
{
    PixelShaderInput output;
    float4 pos = float4(input.pos, 1.0f);

    // Transform the vertex position into projected space.
    pos = mul(pos, model);
    pos = mul(pos, view);
    pos = mul(pos, projection);
    output.pos = pos;

    // Pass the color through without modification.
    output.color = input.color;

    return output;
}
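If the chain of mul calls looks opaque, it may help to see the same arithmetic on the CPU. The sketch below is a stand-in, not template code: Vec4 and Mat4 are hypothetical types, and Mul mirrors how HLSL's mul treats a vector in the first argument as a row vector. It leaves out how matrices are packed when they are copied into the constant buffer.

```cpp
#include <cassert>

// Stand-ins for HLSL's float4 and matrix types. Assumptions for this sketch.
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };

// Mirrors HLSL mul(vector, matrix): the vector is a row vector, so each
// output component is the dot product of the vector with one matrix column.
Vec4 Mul(const Vec4& v, const Mat4& a)
{
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int col = 0; col < 4; ++col)
    {
        for (int row = 0; row < 4; ++row)
        {
            out[col] += in[row] * a.m[row][col];
        }
    }
    return Vec4{ out[0], out[1], out[2], out[3] };
}

Mat4 Identity()
{
    return Mat4{ { { 1.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, 1.0f, 0.0f, 0.0f },
                   { 0.0f, 0.0f, 1.0f, 0.0f }, { 0.0f, 0.0f, 0.0f, 1.0f } } };
}

// With row vectors, the translation sits in the last row of the matrix.
Mat4 Translation(float tx, float ty, float tz)
{
    return Mat4{ { { 1.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, 1.0f, 0.0f, 0.0f },
                   { 0.0f, 0.0f, 1.0f, 0.0f }, { tx, ty, tz, 1.0f } } };
}

// Same three steps as the shader: model, then view, then projection.
Vec4 TransformToProjectedSpace(const Vec4& pos, const Mat4& model,
                               const Mat4& view, const Mat4& projection)
{
    Vec4 p = Mul(pos, model);
    p = Mul(p, view);
    p = Mul(p, projection);
    return p;
}
```

For example, a vertex at the origin with a model matrix that moves it one unit along x and a view matrix that moves it five units along negative z ends up at (1, 0, -5, 1) when the projection is the identity.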

A more complex vertex shader, such as one that sets up an object's vertices for Phong shading, might look more like this.

// A constant buffer that stores the three basic column-major matrices for composing geometry.
cbuffer ModelViewProjectionConstantBuffer : register(b0)
{
    matrix model;
    matrix view;
    matrix projection;
};

cbuffer LightConstantBuffer : register(b1)
{
    float4 lightPos;
};

struct VertexShaderInput
{
    float3 pos : POSITION;
    float3 normal : NORMAL;
};

// Per-vertex vectors passed on to the pixel shader.
struct PixelShaderInput
{
    float4 position : SV_POSITION; 
    float3 outVec : POSITION0;
    float3 outNormal : NORMAL0;
    float3 outLightVec : POSITION1;
};

PixelShaderInput main(VertexShaderInput input)
{
    // Inefficient -- doing this only for instruction. Normally, you would
    // premultiply them on the CPU and place them in the cbuffer.
    matrix mvMatrix = mul(model, view);
    matrix mvpMatrix = mul(mvMatrix, projection);

    PixelShaderInput output;

    float4 pos = float4(input.pos, 1.0f);
    // A normal is a direction, so w is 0 and the translation is not applied.
    float4 normal = float4(input.normal, 0.0f);
    float4 light = float4(lightPos.xyz, 1.0f);

    // Hard-coded eye position, used here only to build the view vector.
    float4 eye = float4(0.0f, 0.0f, -2.0f, 1.0f);

    // Transform the vertex position into projected space.
    output.position = mul(pos, mvpMatrix);
    output.outNormal = mul(normal, mvMatrix).xyz;
    output.outVec = -(eye - mul(pos, mvMatrix)).xyz;
    output.outLightVec = mul(light, mvMatrix).xyz;

    return output;
}
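The comment in the shader suggests premultiplying the matrices on the CPU. The sketch below shows why that is safe: transforming a vertex by the combined matrix gives the same result as applying the matrices one after another. Vec4, Mat4, and the helper functions are hypothetical stand-ins; in a real app you would more likely use DirectXMath for this.

```cpp
#include <cassert>

// Stand-ins for HLSL's float4 and matrix types. Assumptions for this sketch.
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };

// Row vector times matrix, as in HLSL mul(vector, matrix).
Vec4 Mul(const Vec4& v, const Mat4& a)
{
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            out[col] += in[row] * a.m[row][col];
    return Vec4{ out[0], out[1], out[2], out[3] };
}

// Matrix times matrix: done once per object on the CPU, instead of once
// per vertex on the GPU.
Mat4 Multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r = {};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

Mat4 UniformScale(float s)
{
    return Mat4{ { { s, 0.0f, 0.0f, 0.0f }, { 0.0f, s, 0.0f, 0.0f },
                   { 0.0f, 0.0f, s, 0.0f }, { 0.0f, 0.0f, 0.0f, 1.0f } } };
}

Mat4 Translation(float tx, float ty, float tz)
{
    return Mat4{ { { 1.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, 1.0f, 0.0f, 0.0f },
                   { 0.0f, 0.0f, 1.0f, 0.0f }, { tx, ty, tz, 1.0f } } };
}
```

With a premultiplied model-view matrix in the constant buffer, the shader would need one mul per vector instead of two matrix products per vertex.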

Review the pixel shader

The pixel shader (defined in SimplePixelShader.hlsl) is quite possibly the absolute minimum amount of code you can have in a viable pixel shader. It takes the interpolated pixel color data generated by rasterization and returns it as output, where it will be written to a render target. How boring!

// A pass-through function for the (interpolated) color data.
float4 main(PixelShaderInput input) : SV_TARGET
{
    return float4(input.color, 1.0f);
}

The important part is the SV_TARGET system-value semantic on the main function. It indicates that the output is to be written to the primary render target, which is the texture buffer supplied to the swap chain for display.

An example of a more complex pixel shader to perform Phong shading might look like this.

cbuffer MaterialConstantBuffer : register(b2)
{
    float4 lightColor;
    float4 Ka;
    float4 Kd;
    float4 Ks;
    float4 shininess;
};

struct PixelShaderInput
{
    float4 position : SV_POSITION;
    float3 outVec : POSITION0;
    float3 normal : NORMAL0;
    float3 light : POSITION1;
};

float4 main(PixelShaderInput input) : SV_TARGET
{
    float3 L = normalize(input.light);
    float3 V = normalize(input.outVec);
    float3 R = normalize(reflect(L, input.normal));

    float4 diffuse = Ka + (lightColor * Kd * max(dot(input.normal, L), 0.0f));
    diffuse = saturate(diffuse);

    float4 specular = Ks * pow(max(dot(R, V), 0.0f), shininess.x - 50.0f);
    specular = saturate(specular);

    float4 finalColor = diffuse + specular;

    return finalColor;
}
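The lighting math can be easier to follow outside of a shader. Here is a CPU-side sketch of the diffuse and specular terms of Phong shading. It uses the textbook convention in which the normal N, the light vector L, and the view vector V all point away from the surface, which may not match the vector conventions of the listing above; the types and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Stand-in for HLSL's float3. Assumption for this sketch.
struct Float3 { float x, y, z; };

float Dot(const Float3& a, const Float3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Float3 Scale(const Float3& v, float s) { return { v.x * s, v.y * s, v.z * s }; }

Float3 Subtract(const Float3& a, const Float3& b)
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

Float3 Normalize(const Float3& v)
{
    return Scale(v, 1.0f / std::sqrt(Dot(v, v)));
}

// Same formula as the HLSL reflect intrinsic: i - 2 * dot(i, n) * n.
Float3 Reflect(const Float3& i, const Float3& n)
{
    return Subtract(i, Scale(n, 2.0f * Dot(i, n)));
}

struct PhongTerms { float diffuse; float specular; };

// N, L, and V all point away from the surface (textbook convention).
PhongTerms ComputePhongTerms(Float3 N, Float3 L, Float3 V, float shininess)
{
    N = Normalize(N);
    L = Normalize(L);
    V = Normalize(V);

    // Reflect the incoming light direction (-L) about the normal.
    const Float3 R = Reflect(Scale(L, -1.0f), N);

    PhongTerms terms;
    terms.diffuse = std::max(Dot(N, L), 0.0f);
    terms.specular = std::pow(std::max(Dot(R, V), 0.0f), shininess);
    return terms;
}
```

When the light and the viewer are both directly above the surface, both terms are at their maximum of 1; when the light is behind the surface, both fall to 0. A shader then scales these terms by material and light colors, as the listing does with Ka, Kd, and Ks.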

In this extended example, the pixel shader takes its own constant buffers that contain light and material information. The input layout in the vertex shader would be expanded to include normal data, and the output from that vertex shader would include transformed vectors for the vertex, the light, and the vertex normal in the view coordinate system.

If you have texture buffers and samplers with assigned registers (t and s, respectively), you can access them in the pixel shader also.

Texture2D simpleTexture : register(t0);
SamplerState simpleSampler : register(s0);

struct PixelShaderInput
{
    float4 pos : SV_POSITION;
    float3 norm : NORMAL;
    float2 tex : TEXCOORD0;
};

float4 SimplePixelShader(PixelShaderInput input) : SV_TARGET
{
    float3 lightDirection = normalize(float3(1, -1, 0));
    float4 texelColor = simpleTexture.Sample(simpleSampler, input.tex);
    float lightMagnitude = 0.8f * saturate(dot(input.norm, -lightDirection)) + 0.2f;
    return texelColor * lightMagnitude;
}
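The lightMagnitude line packs a lot into one expression, so here is the same calculation as a CPU-side sketch. Float3, Saturate, and LightMagnitude are hypothetical stand-ins for the HLSL types and intrinsics; both input vectors are assumed to be normalized already.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Stand-in for HLSL's float3. Assumption for this sketch.
struct Float3 { float x, y, z; };

float Dot(const Float3& a, const Float3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Float3 Normalize(const Float3& v)
{
    const float inverseLength = 1.0f / std::sqrt(Dot(v, v));
    return { v.x * inverseLength, v.y * inverseLength, v.z * inverseLength };
}

// Mirrors the HLSL saturate intrinsic: clamps a value to the range [0, 1].
float Saturate(float value)
{
    return std::min(std::max(value, 0.0f), 1.0f);
}

// 80% of the brightness comes from how directly the surface faces the
// light; the remaining 20% is a constant floor so nothing is fully black.
float LightMagnitude(const Float3& normal, const Float3& lightDirection)
{
    const Float3 toLight = { -lightDirection.x, -lightDirection.y,
                             -lightDirection.z };
    return 0.8f * Saturate(Dot(normal, toLight)) + 0.2f;
}
```

A surface that faces the light head-on gets a magnitude of about 1.0, and a surface that faces away gets the 0.2 floor, so the texel color is dimmed but never removed entirely.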

Shaders are very powerful tools that can be used to generate procedural resources like shadow maps or noise textures. In fact, advanced techniques require that you think of textures more abstractly, not as visual elements but as buffers. They hold data like height information, or other data that can be sampled in the final pixel shader pass or in an intermediate pass of a multi-stage effect within the same frame. Sampling textures across multiple passes in this way is a powerful tool and the backbone of many modern visual effects.

Next steps

Hopefully, you're comfortable with the DirectX app template at this point and are ready to start working on your project. Here are some links to help answer other questions you may have about Windows Store game development with DirectX and C++:

Use the Microsoft Visual Studio 2013 DirectX templates

Work with the DirectX app template's device resources

Understand the DirectX app template's rendering pipeline