You can think of the projection transformation as controlling the camera's internals; it is analogous to choosing a lens for the camera. This is the most complicated of the three transformation types.
A viewing frustum is a 3-D volume in a scene, positioned relative to the viewport's camera. The shape of the volume affects how models are projected from camera space onto the screen. The most common type of projection, a perspective projection, is responsible for making objects near the camera appear bigger than objects in the distance. For perspective viewing, the viewing frustum can be visualized as a pyramid, with the camera positioned at the tip. This pyramid is intersected by front and back clipping planes. The volume within the pyramid between the front and back clipping planes is the viewing frustum. Objects are visible only when they are in this volume.
If you imagine that you are standing in a dark room and looking through a square window, you are visualizing a viewing frustum. In this analogy, the front clipping plane is the window, and the back clipping plane is whatever finally interrupts your view: the skyscraper across the street, the mountains in the distance, or nothing at all. You can see everything inside the truncated pyramid that starts at the window and ends with whatever interrupts your view, and you can see nothing else.
The viewing frustum is defined by fov (field of view) and by the distances of the front and back clipping planes, specified in z-coordinates.
In this illustration, the variable D is the distance from the camera to the origin of the space that was defined in the last part of the geometry pipeline — the viewing transformation. This is the space around which you arrange the limits of your viewing frustum.
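The frustum definition above (field of view plus front and back clipping distances) can be sketched as a containment test. This is a minimal illustration, not part of the Direct3D Mobile API: the `Vec3` type and `insideFrustum` function are hypothetical names, and the sketch assumes a camera at the origin looking down the positive z-axis with separate horizontal and vertical fields of view.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: a point in camera space
// (camera at the origin, looking down +z).
struct Vec3 { float x, y, z; };

// Returns true when p lies inside the perspective viewing frustum.
// fovW, fovH: full horizontal and vertical fields of view, in radians.
// zn, zf:     z-coordinates of the front and back clipping planes.
bool insideFrustum(const Vec3& p, float fovW, float fovH, float zn, float zf) {
    assert(zn > 0.0f && zf > zn);
    // Reject anything in front of the front plane or behind the back plane.
    if (p.z < zn || p.z > zf) return false;
    // The pyramid's half-extents grow linearly with distance from the camera.
    const float halfW = p.z * std::tan(fovW * 0.5f);
    const float halfH = p.z * std::tan(fovH * 0.5f);
    return std::fabs(p.x) <= halfW && std::fabs(p.y) <= halfH;
}
```

With a 90-degree field of view in both directions, the visible half-width at depth z is approximately z itself, so a point at x = 40, z = 50 passes the test while a point at x = 60, z = 50 does not.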
The Projection Transformation
The projection matrix is typically a scale and perspective projection. The projection transformation converts the viewing frustum into a cuboid shape. Because the near end of the viewing frustum is smaller than the far end, this has the effect of expanding objects that are near to the camera; this is how perspective is applied to the scene.
In the viewing frustum, D is the distance between the camera position required by the projection transformation and the origin of the space defined by the viewing transformation. A first version of a matrix defining the perspective projection might use this D variable like this:
The viewing matrix puts the camera at the origin of the scene. Because the projection matrix needs to have the camera at (0, 0, -D), it translates the vector by -D in the z-direction, by using the following matrix.
Multiplying these two matrices gives the following composite matrix.
The following illustration shows how the perspective transformation converts a viewing frustum into a new coordinate space. Notice that the frustum becomes cuboid and also that the origin moves from the upper-right corner of the scene to the center.
In the perspective transformation, the limits of the x- and y-directions are -1 and 1. The limits of the z-direction are 0 for the front plane and 1 for the back plane.
This matrix translates and scales objects based on a specified distance from the camera to the near clipping plane, but it does not consider the field of view (fov), and the z-values that it produces for objects in the distance can be nearly identical, making depth comparisons difficult. The following matrix addresses these issues, and it adjusts vertices to account for the aspect ratio of the viewport, making it a good choice for the perspective projection.
In this matrix, Zn is the z-value of the near clipping plane, and Zf is the z-value of the far clipping plane. The variables w, h, and Q have the following meanings. Note that fovw and fovh represent the viewport's horizontal and vertical fields of view, in radians.
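As a rough sketch of how such a matrix could be built, the following code assumes the conventional Direct3D definitions w = cot(fovw/2), h = cot(fovh/2), and Q = Zf / (Zf - Zn), with row vectors multiplied on the left (v * M). The `Mat4` type and both function names are hypothetical and are not Direct3D Mobile types; an application would copy the coefficients into whatever matrix structure it passes to the device.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 4x4 matrix, row-major, used with row vectors (v * M).
struct Mat4 { float m[4][4]; };

// Builds a perspective projection with the assumed layout:
//   | w  0   0     0 |
//   | 0  h   0     0 |
//   | 0  0   Q     1 |
//   | 0  0  -Q*Zn  0 |
Mat4 perspectiveFov(float fovW, float fovH, float zn, float zf) {
    assert(zn > 0.0f && zf > zn);
    const float w = 1.0f / std::tan(fovW * 0.5f);  // cot(fovw / 2)
    const float h = 1.0f / std::tan(fovH * 0.5f);  // cot(fovh / 2)
    const float q = zf / (zf - zn);
    Mat4 p = {};            // all coefficients start at zero
    p.m[0][0] = w;
    p.m[1][1] = h;
    p.m[2][2] = q;
    p.m[2][3] = 1.0f;       // copies camera-space z into the output w
    p.m[3][2] = -q * zn;
    return p;
}

// Depth of a camera-space point (0, 0, z, 1) after the perspective divide.
float projectedDepth(const Mat4& p, float z) {
    const float zc = z * p.m[2][2] + p.m[3][2];
    const float wc = z * p.m[2][3] + p.m[3][3];
    return zc / wc;
}
```

Under these assumptions, a point on the near clipping plane projects to a depth of 0 and a point on the far clipping plane projects to a depth of 1, which matches the z-limits stated for the perspective transformation.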
For your application, using field-of-view angles to define the x and y scaling coefficients might not be as convenient as using the viewport's horizontal and vertical dimensions (in camera space). As the math works out, the following two formulas for w and h use the viewport's dimensions, and are equivalent to the preceding formulas.
In these formulas, Zn represents the position of the near clipping plane, and the Vw and Vh variables represent the width and height of the viewport, in camera space.
For a C++ application, these two dimensions correspond directly to the Width and Height members of the D3DMVIEWPORT structure.
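The equivalence of the two ways of computing the scaling coefficients can be checked numerically. The sketch below assumes the usual forms w = cot(fovw/2) and w = 2 * Zn / Vw, where Vw is the width of the viewing window on the near clipping plane in camera space. All three function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Scaling coefficient from a full field-of-view angle, in radians.
float scaleFromFov(float fov) {
    return 1.0f / std::tan(fov * 0.5f);  // cot(fov / 2)
}

// Scaling coefficient from the near-plane distance and the viewport
// extent (width or height) measured in camera space.
float scaleFromViewport(float zn, float extent) {
    assert(zn > 0.0f && extent > 0.0f);
    return 2.0f * zn / extent;
}

// Camera-space extent of the viewing window on the near plane
// for a given field of view.
float viewportExtent(float zn, float fov) {
    return 2.0f * zn * std::tan(fov * 0.5f);
}
```

For any field of view and near-plane distance, `scaleFromViewport(zn, viewportExtent(zn, fov))` should agree with `scaleFromFov(fov)` up to floating-point rounding, because the 2 * Zn factors cancel.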
Whichever formula you use, set Zn to as large a value as possible. With a very small Zn, most of the depth range is consumed immediately in front of the camera, so objects farther away receive nearly identical z-values. This makes depth comparisons unreliable, particularly with 16-bit z-buffers.
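The effect of Zn on depth resolution can be illustrated with a small sketch. It assumes the projected depth is Q * (1 - Zn / z) with Q = Zf / (Zf - Zn), and it models a 16-bit z-buffer as a simple truncation of depth * 65535. Real hardware may quantize differently, so this is an approximation and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Depth in [0, 1] after the perspective divide for a camera-space z,
// assuming depth = Q * (1 - Zn / z) with Q = Zf / (Zf - Zn).
double projectedDepth(double z, double zn, double zf) {
    assert(zn > 0.0 && zf > zn && z >= zn && z <= zf);
    const double q = zf / (zf - zn);
    return q * (1.0 - zn / z);
}

// Assumed quantization: truncate the [0, 1] depth to 16 bits.
std::uint16_t depth16(double z, double zn, double zf) {
    return static_cast<std::uint16_t>(projectedDepth(z, zn, zf) * 65535.0);
}
```

With Zn = 0.01 and Zf = 1000, objects at z = 500 and z = 900 quantize to the same 16-bit value under this model, so the buffer cannot tell which one is in front. Raising Zn to 10 gives the same two objects clearly distinct depth values.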
As with the world and view transformations, you call the IDirect3DMobileDevice::SetTransform method to set the projection transformation.
W-Friendly Projection Matrix
Microsoft® Direct3D® Mobile can use the W component of a vertex that has been transformed by the world, view, and projection matrices to perform depth-based calculations in depth-buffer or fog effects. Computations such as these require that your projection matrix normalize W to be equivalent to world-space Z. In short, if your projection matrix includes a (3,4) coefficient that is not 1, you must scale all the coefficients by the inverse of the (3,4) coefficient to make a proper matrix. If you do not provide a compliant matrix, fog effects and depth buffering are not applied correctly. The matrix shown above in The Projection Transformation is compliant with w-based calculations.
The following illustration shows a noncompliant projection matrix, where e ≠ 1, and the same matrix scaled so that eye-relative fog will be enabled.
In the preceding matrices, all variables are assumed to be nonzero.
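The scaling step described in this section can be sketched as follows. The `Mat4` type and `makeWFriendly` function are hypothetical names, not Direct3D Mobile API, and the sketch assumes a row-major matrix in which the (3,4) coefficient of the text is `m[2][3]` with zero-based indices.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 4x4 matrix, row-major, used with row vectors (v * M).
struct Mat4 { float m[4][4]; };

// Scales every coefficient by the inverse of the (3,4) coefficient so
// that the (3,4) coefficient becomes 1. Multiplying a projection matrix
// by a uniform scalar does not change positions after the perspective
// divide, but it makes the output w equal to camera-space z.
Mat4 makeWFriendly(const Mat4& p) {
    const float e = p.m[2][3];   // the (3,4) coefficient, one-based
    assert(e != 0.0f);
    const float inv = 1.0f / e;
    Mat4 r = p;
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            r.m[row][col] *= inv;
        }
    }
    return r;
}
```

An application would apply a step like this once, before passing the matrix to the device, rather than on every frame.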
Direct3D Mobile uses the currently set projection matrix in its w-based depth calculations. As a result, applications must set a compliant projection matrix to receive the desired w-based features, even if they do not use the Direct3D Mobile transformation pipeline.