Path Format Overview
As promised, here is the start of the look into paths on Windows. I'll keep things simple at first and layer on complexity in additional posts. In this post we'll look at a top-level overview of the path formats in use in Windows.
Let's start with the DOS path format that has been with us since DOS 2.0 (subdirectories did not exist in 1.0):
The path is made up of three components that are broken up by the backslash character (technically a component separator, but usually called a directory separator). The first component is the volume, or drive. The second component is a directory name. The final component is a file name.
A fully qualified or absolute DOS path must contain at least a full volume name. For DOS paths this means a drive letter (a-z), a volume separator (:), and a directory separator. If the path doesn't start with all three characters, it is considered partially qualified: relative to the current directory in some way (or to the saved current directory of another drive).
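Here is a minimal sketch of that three-character rule in Python. The function name `is_fully_qualified_dos` is made up for illustration, and the check follows only the rule as stated above (drive letter, colon, backslash); it does not try to reproduce everything Windows itself does.

```python
import string


def is_fully_qualified_dos(path: str) -> bool:
    """Return True if path starts with a drive letter, ':' and a backslash.

    Hypothetical helper: it encodes only the rule described in the text,
    not the full Windows path parser.
    """
    if len(path) < 3:
        return False
    return (
        path[0] in string.ascii_letters  # drive letter a-z (case-insensitive)
        and path[1] == ":"               # volume separator
        and path[2] == "\\"              # directory separator
    )


if __name__ == "__main__":
    for candidate in (r"C:\Files\Text.txt", r"C:Text.txt", r"\Files\Text.txt", "Text.txt"):
        print(candidate, is_fully_qualified_dos(candidate))
```

Anything the function rejects is partially qualified in the sense above, including drive-relative forms such as C:Text.txt.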
The next path format, the Universal Naming Convention (UNC), came (via LanManager) from a need to access network resources.
UNCs are identified by the fact that they start with two separators. The first component is the host name (server), which is followed by the share name. Server names can be NetBIOS machine names, IP addresses (IPv4 as well as IPv6 are supported), or FQDNs. The server and share together make up the volume. The rest of the path follows the same rules as a DOS path.
If a UNC doesn't contain a full server and share, it is not relative; it is simply invalid. You can't set the current directory to a UNC (you can, however, map a given UNC to a drive letter to use relative paths with shares).
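The server/share rule can be sketched in a few lines of Python. `UncPath` and `parse_unc` are hypothetical names for this example, and the parser handles only backslash separators. It also rejects a "server" of . or ?, since those prefixes belong to a different path format.

```python
from typing import NamedTuple


class UncPath(NamedTuple):
    server: str  # host name: NetBIOS name, IP address, or FQDN
    share: str   # share name; server + share form the volume
    rest: str    # remaining directory/file components, may be empty


def parse_unc(path: str) -> UncPath:
    """Split a UNC path into server, share, and the rest (illustrative only)."""
    if not path.startswith("\\\\"):
        raise ValueError("not a UNC path: must start with two separators")
    parts = path[2:].split("\\", 2)
    if parts[0] in (".", "?"):
        raise ValueError("DOS device path, not a UNC")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        # No "relative UNC" exists: a missing server or share is an error.
        raise ValueError("invalid UNC: needs both a server and a share")
    rest = parts[2] if len(parts) == 3 else ""
    return UncPath(parts[0], parts[1], rest)


if __name__ == "__main__":
    print(parse_unc(r"\\Server\Share\Test\Foo.txt"))
```

Note that the error case raises rather than falling back to a current directory, which mirrors the point that an incomplete UNC is invalid, not relative.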
DOS Device Paths
Windows NT (every current Windows OS is NT based) has a unified object model that points to all resources, including files. These NT object paths are not directly accessible from the Windows APIs (and consequently the CMD shell, File Explorer, etc.). They are, however, exposed to the Win32 layer through a special folder of symbolic links that legacy DOS and UNC paths are mapped to. This special folder is accessed via the DOS device path syntax, which is one of:
The \\.\ or \\?\ identifies the path as a DOS device path. The next component (C: in this case) is a symbolic link to the "real" NT device object. There is a specific link for UNCs called, not surprisingly, "UNC".
Like UNCs, DOS device paths are fully qualified by definition. Current directories never enter into their usage.
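To make the three formats concrete, here is a rough classifier in Python. The names `classify` and `device_link` and the returned labels are invented for this sketch; it looks only at the leading characters described so far and does no validation beyond that.

```python
import string

DEVICE_PREFIXES = ("\\\\.\\", "\\\\?\\")  # the \\.\ and \\?\ prefixes


def classify(path: str) -> str:
    """Label a path by format, using only its leading characters."""
    if path.startswith(DEVICE_PREFIXES):
        return "dos-device"  # always fully qualified
    if path.startswith("\\\\"):
        return "unc"         # always fully qualified (or invalid)
    if (len(path) >= 3 and path[0] in string.ascii_letters
            and path[1] == ":" and path[2] == "\\"):
        return "dos-fully-qualified"
    return "dos-partially-qualified"


def device_link(path: str) -> str:
    """Return the first component after the device prefix, e.g. 'C:' or 'UNC'."""
    if not path.startswith(DEVICE_PREFIXES):
        raise ValueError("not a DOS device path")
    return path[4:].split("\\", 1)[0]


if __name__ == "__main__":
    for p in (r"C:\Test\Foo.txt", r"\\Server\Share\Foo.txt",
              r"\\.\C:\Test\Foo.txt", r"\\?\UNC\Server\Share\Foo.txt", "Foo.txt"):
        print(classify(p), p)
```

The device prefixes have to be tested before the UNC check, because both formats begin with two separators.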
Terminology around DOS device paths and explanations of how they work are seriously lacking. I'll go into how these and all of the other path formats translate into the final NT path in later posts.
Normalization
Most paths get normalized, which includes processing partially qualified paths and relative components (. and ..). Tune in next time for the deep dive.
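As a small preview, Python's standard `ntpath` module does a purely lexical version of this and runs on any OS. This is only an approximation for illustration: it is not the code Windows runs, and the `qualify` helper and its explicit `current_dir` argument are assumptions made for the example.

```python
import ntpath


def qualify(path: str, current_dir: str) -> str:
    """Resolve a partially qualified path against current_dir, then
    collapse '.' and '..' components (lexically, without touching disk)."""
    return ntpath.normpath(ntpath.join(current_dir, path))


if __name__ == "__main__":
    # Relative components are removed from an already qualified path.
    print(ntpath.normpath(r"C:\test\.\sub\..\Foo.txt"))
    # A partially qualified path is combined with a current directory first.
    print(qualify(r"..\Foo.txt", r"C:\test\sub"))
```

Both calls print C:\test\Foo.txt.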
Naming Files, Paths, and Namespaces (MSDN)
[MS-DTYP] 2.2.57 UNC
[MS-FCCC] 2.1.5 Pathname
[MS-FCCC] 2.1.5 Share name
[MS-FSA] 5 Appendix A: Product Behavior
MS-DOS 2.0: An Enhanced 16-Bit Operating System
Stupid DOS Tricks
A small fraction of the ways you can refer to the same file:
 Volume in drive C is OS
 Volume Serial Number is 0000-0000

 Directory of c:\test

04/20/2016  07:00 PM                13 Foo.txt
               1 File(s)             13 bytes
               0 Dir(s)  56,278,192,128 bytes free

 Volume in drive \\127.0.0.1\c$ is OS
 Volume Serial Number is 0000-0000

 Directory of \\127.0.0.1\c$\test

04/20/2016  07:00 PM                13 Foo.txt
               1 File(s)             13 bytes
               0 Dir(s)  56,278,192,128 bytes free
I'll spare you the output on the rest of these.
C:\>dir \\LOCALHOST\c$\test\foo.txt
C:\>dir \\.\c:\test\foo.txt
C:\>dir \\?\c:\test\foo.txt
C:\>dir \\.\UNC\LOCALHOST\c$\test\foo.txt
C:\>dir \\127.0.0.1\c$\test\foo.txt
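The pattern behind those commands can be generated mechanically. The sketch below builds the same spellings from one fully qualified DOS path. The function name `equivalent_forms` is hypothetical, and it assumes, as the commands above do, that the drive is reachable through its `c$`-style administrative share on the local machine.

```python
from typing import List


def equivalent_forms(dos_path: str, host: str = "LOCALHOST") -> List[str]:
    """List other spellings of a fully qualified DOS path such as c:\\test\\foo.txt.

    Assumes the drive is exposed as an administrative share (<letter>$) on host.
    """
    if len(dos_path) < 3 or dos_path[1] != ":" or dos_path[2] != "\\":
        raise ValueError("expected a fully qualified DOS path")
    drive, rest = dos_path[0], dos_path[2:]  # e.g. "c" and "\test\foo.txt"
    unc = f"\\\\{host}\\{drive}$" + rest     # \\LOCALHOST\c$\test\foo.txt
    return [
        dos_path,                     # plain DOS path
        "\\\\.\\" + dos_path,         # DOS device path, \\.\ prefix
        "\\\\?\\" + dos_path,         # DOS device path, \\?\ prefix
        unc,                          # UNC via the administrative share
        "\\\\.\\UNC\\" + unc[2:],     # device path through the "UNC" link
    ]


if __name__ == "__main__":
    for form in equivalent_forms(r"c:\test\foo.txt"):
        print("dir", form)
```

Passing host="127.0.0.1" produces the IP-based variants instead.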