Scenario Guide and Walk-Through

12/09/2009

This scenario describes day-to-day administrative tasks to perform in a single domain Microsoft® Windows® 2000-based network. Specifically, it focuses on disk management and creating and implementing a backup and recovery plan. This scenario shows an administrator how to design a backup and restore strategy to ensure that the failure of a server or disk drive does not result in the loss of data.

Introduction

Systems for managing disks, backup, and disaster recovery are critical needs for any company. Without a proper maintenance schedule and proactive storage of data, a server may be down for days or weeks, or critical data may not be recoverable at all. The Microsoft® Windows® 2000 operating system contains a variety of tools and management consoles to make these management tasks much easier and more centralized.

Using the Disk Management snap-in of the Microsoft Management Console (MMC), administrators can quickly manage standard, fault tolerant, and volume sets and confirm the health of each volume. Disk management includes common Windows 2000 administrative tasks such as creating volumes, creating partitions, adding disks, managing drive letters and paths, managing mirror sets, compressing files and folders, defragmenting the drive, and checking for errors. All are available through one centralized utility.

Other MMC administrative tools allow you to view the device configuration, take a device configuration snapshot, enable remote storage, save and monitor services, as well as review event logs in Event Viewer to make certain your systems are healthy. The Removable Storage snap-in allows administrators to fully manage tape drives, CD-ROM drives, and other removable storage devices.

The Windows 2000 Backup utility contains new, more flexible options, allowing users to back up selected volumes or folders to tape or to file. Backup can also be secured in order to protect valuable company data. Scheduling is built directly into Backup, so there is no need to use a separate scheduler.

Adding an uninterruptible power supply (UPS) for power protection is as simple as following a wizard for identifying and configuring most UPS devices.

When disaster does strike, there are new safe mode boot options for restarting a server with and without various drivers, the Recovery Console, which administrators can use to rename or replace individual files, and the Emergency Repair Disk. The Recovery Console also allows you to check and repair both the boot sector and the Master Boot Record.

Each of these tools enables administrators to protect critical data. Together, they enable the administrator to implement and maintain a true enterprise disaster recovery plan.

Scenario Requirements

The administrative tools are installed by default on all Windows 2000 domain controllers. On Windows 2000-based stand-alone servers or workstations, the Active Directory™ administrative tools are optional and can be installed from the Optional Windows 2000 components package.

This guide builds on the configuration achieved in earlier walkthroughs. Be sure you have successfully completed one of the following walkthroughs prior to proceeding.

Upgrading a Windows NT Domain to Windows 2000 Active Directory
Adding a New Windows 2000 for File/Print and Web Server to Your Network
Upgrading a Windows NT File/Print and Web to Windows 2000

· Full installation of Windows 2000 Server
· Installing Windows 2000 Server
· Upgrading Windows NT-based network to Windows 2000

Scenario Tasks

In this walkthrough, you will perform the following tasks.

Setup and Management Tasks

· Performing disk management, including common disk management tasks such as creating dynamic volumes, creating partitions, adding disks, managing drive letters and paths, managing mirror sets, compressing files and folders, defragmenting data.
· Backing up and restoring data, developing a backup and recovery strategy for data, and scheduling regular backups.

Recovery Planning

Prior to reviewing tasks to perform, the planning stages need to be outlined and discussed in detail. Different servers need to implement different fault tolerance and recovery options. The critical questions that need to be asked during the planning stage are as follows:

How critical is the data or information on a server?
Can automatic replication be set up quickly and easily?
If the server went down, what would be the impact on your business?
Is the server handling multiple functions?
If the server is a core-networking server, that is, a DNS or WINS server, is all of the data being backed up on a daily basis?

The role of the server will also be important to the type of recovery data required.

Domain Controllers

For each domain controller, there needs to be an additional domain controller to serve as a replication partner and provide fault tolerance. An additional domain controller lets you reinstall the Windows 2000 operating system and reinstate the server as a domain controller by selecting Restore Domain Controller from the menu of boot options, which you access by pressing F8 at the boot menu.

On a domain controller, the primary partition containing the operating system and the registry information should be mirrored.

Each domain controller should perform a weekly tape backup and remove these tapes monthly to offsite storage. If the domain controller is also the primary Global Catalog Server, it should be backed up nightly and the tapes brought to offsite storage weekly.

WINS/DHCP/DNS Server

Each server should be replicated to another WINS/DNS server on a separate network node. Every week the WINS/DHCP/DNS servers should be backed up and once a month one backup should be stored at an offsite location.

File and Print Servers

Each server should have a RAID-5 disk array for data fault tolerance. These servers should be backed up nightly to preserve data, and copies should be stored off site weekly.

Application Servers

Often these servers are the most critical in any Information System environment. Each server should have a mirrored system drive separate from the drives housing the data and the application. The application and data drives should be on a RAID-5 array.

Backup of the system drive needs to be completed after any operating system upgrades, application upgrades, security changes, or changes to system configuration. Backup needs to be kept offsite. Backup of the data drives needs to be performed nightly and copies of the backup need to be stored off site weekly.
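The schedule recommended for each server role above can be collected into a small lookup table. The following Python sketch is illustrative only: the role labels, the `BackupPolicy` structure, and the `backup_due` helper are assumptions introduced here, and "monthly" is approximated as 30 days. The backup of an application server's system drive is driven by changes rather than by an interval, so it is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPolicy:
    backup_every_days: int   # how often a backup should run
    offsite_every_days: int  # how often media should move to offsite storage


# Intervals follow the planning guidance above; the role keys are hypothetical labels.
# "Monthly" is approximated as 30 days.
POLICIES = {
    "domain_controller": BackupPolicy(7, 30),
    "global_catalog": BackupPolicy(1, 7),
    "wins_dhcp_dns": BackupPolicy(7, 30),
    "file_print": BackupPolicy(1, 7),
    "application_data": BackupPolicy(1, 7),
}


def backup_due(role: str, days_since_last_backup: int) -> bool:
    """Return True when the backup interval for the role has elapsed."""
    return days_since_last_backup >= POLICIES[role].backup_every_days


print(backup_due("file_print", 1))
print(backup_due("domain_controller", 3))
```

Keeping the intervals in one table makes it easier to review the plan when a server changes role.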

Other Considerations

Other forms of fault tolerance and recovery that are not covered in this chapter, but which should be considered on these servers include clustering, using RAID-5 volumes, hardware mirroring, and disk duplexing. Hardware fault tolerance, including RAID-5 and mirroring, is recommended if the company can afford it.

Test Backup on a regular basis by restoring data to a separate directory that is not in production use. An untested backup is of little value to the organization.

In mission critical development situations, Windows 2000 Professional-based systems should be treated in the same way as file and print servers.

Change Management

The other area that needs to be discussed when planning for basic server administration is change management. Very simply, this entails verifying that a change is needed, determining the change risk factors, testing the change to be made, and recording the change in the appropriate database and/or spreadsheet.

There are three reasons to upgrade or in some way alter the configuration of your server hardware or software. First, the addition of functionality that adds value to your organization. Second, due to a fault in software, hardware, or the combined architecture, an application is failing or the server is unable to fulfill its role. Finally, when a change in functionality or in the role of a server or workstation occurs within the organization.

One of the most critical aspects for change control is standardizing hardware and software. By standardizing, you can manage servers from a central location as well as easily determine whether fixes and upgrades need to be applied to all servers or are applicable to only one situation or functionality.

Even for tasks as simple as adding users, changing security, removing or adding TCP/IP addresses, or reconfiguring an application on a server, there must be change management. For these tasks, a Microsoft Access database or Microsoft Excel spreadsheet recording the date, the nature of the change, and the details of the change needs to be maintained.

Risk management is one of the most confusing subjects in Information Technology. Risk management seeks to measure the risk of implementation against the benefit. For example, in the hardware arena, if you need to add a disk to enable mirroring and are able to do so on the same SCSI chain without the addition of a new controller, the benefit becomes fault tolerance and data protection versus the loss of a SCSI ID. On the other hand, if a software company comes out with an upgrade to your most critical application that enables a user to use hotkey shortcuts, the benefit could be some timesavings for users versus the possible destabilization of the company's most critical application.

Before introducing new technology to a network, such as an operating system or hardware device, the administrator can create a small test lab that resembles the network infrastructure to test whether the new component will function correctly when brought live. The only exception to the requirement to test new components before deploying them throughout the network may be the addition of hotfixes that address specific rather than universal problems.

Testing can be challenging, but after a system is implemented, it can become routine and cost effective. Start with simple installation testing on standard hardware. If there are failures, record them and repeat the test. If the failure occurs again, do not deploy until the failure is addressed. Once installation is confirmed, use a spreadsheet listing basic networking functions and universal applications to ensure that no issues appear with day-to-day use of the system. Finally, if the upgrade is designed to fix an issue, make certain it really does. Too often, corners are cut when hotfixes or upgrades are applied, and they are tested only for the designated purpose. It is important to test the day-to-day functionality as well.

Finally, every time a change is made to a computer, it must be recorded. It could be recorded in a physical notebook attached to the computer or in a spreadsheet or database available on a centralized share that is backed up nightly. If you keep a record of all changes made to a computer, you can trace all changes to the server in order to troubleshoot problems, as well as offer support professionals correct configuration information.
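A change record of the kind described above needs very little structure. The following Python sketch is one possible shape, not a prescribed format: it keeps the date, nature, and details of each change as CSV rows, which a spreadsheet can open. The function names, the added computer column, and the sample server names and entries are all hypothetical.

```python
import csv
import io
from datetime import date


def append_change(stream, when: date, computer: str, nature: str, details: str) -> None:
    """Append one change record as a CSV row to an open text stream."""
    csv.writer(stream).writerow([when.isoformat(), computer, nature, details])


def changes_for(stream, computer: str):
    """Return every recorded row for one computer, in the order written."""
    stream.seek(0)
    return [row for row in csv.reader(stream) if row and row[1] == computer]


# In-memory stream for illustration; a real log would be a file on the
# centralized share that is backed up nightly.
log = io.StringIO()
append_change(log, date(2000, 3, 1), "SERVER01", "security", "Changed share permissions")
append_change(log, date(2000, 3, 2), "SERVER02", "network", "Added TCP/IP address")
print(changes_for(log, "SERVER01"))
```

Filtering by computer gives the per-server history that the text recommends handing to support professionals.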

Disk Management Tasks

In this section, you will perform the following tasks:

Create a dynamic partition.
Manage a mirror set.
Create a RAID-5 array.
Run the Error Checker.
Defragment a drive.

Creating Dynamic Volumes

Windows 2000 Logical Disk Manager (LDM) introduces new disk functionality. In the Windows 2000 operating system, the type of disk supported by Windows NT 4 and earlier versions of the operating system is now referred to as a basic disk. These disks are managed by the Windows NT 4 version of LDM called FT disk. Basic disks cannot be extended online, and in order to repair them you have to take the volume offline. Microsoft continues to provide support for basic disks so that existing RAID volumes that are migrated from computers running Windows NT 4 to computers running Windows 2000 continue to function without any change. Existing RAID volumes can be used and repaired. However, if you want to perform any other type of disk management on these RAID volumes under Windows 2000, such as extending a volume, you have to upgrade the disks from basic to dynamic.

Dynamic disks are disks that are managed by the new Logical Disk Manager. Creating a dynamic volume using the Disk Management snap-in allows you to adjust partition sizes and span volumes on the fly. Dynamic disks are also self-describing which eases reconfiguration and recovery. What this means is that all of the disks in a dynamic disk set have a unique identifying signature and also a small database on every disk that keeps track of all of the disks that belong to the set. Providing that the spanned disk does not contain the system partition, no reboot is required so that you can maximize server access. A dynamic volume can only exist on a dynamic disk.

Note: Dynamic disks are not supported on portable computers. If you are using a portable computer and right-click a disk in the graphical or list view in Disk Management, you will not see the option to upgrade the disk to dynamic.

Use dynamic disks if your computer runs only Windows 2000 and if you want to use more than four volumes per disk or create fault-tolerant volumes, such as RAID-5. Before you begin, add a new disk or format an existing disk on your server, and log on as an Administrator. If you log on using an account that does not have administrative privileges, you may not be able to manage the disk subsystem.

To create a dynamic partition

Click Start, click Run, and type MMC.
When the console is loaded, on the Console menu, click Add/Remove Snap-in, and then click Add in the Add/Remove Snap-in dialog box.
Click Computer Management, and click OK. Click the plus sign next to Storage, and click Disk Management. The Disk Management snap-in appears in the console, as illustrated in Figure 1.

Figure 1: The Disk Management snap-in
Make certain that the new drive is displayed in Disk Management. If not, on the Action menu, click Rescan Disks. If the new disk still does not appear, make certain the drive is properly installed, check connections and try again.
After the disk appears in the console, right-click the drive number and then click Upgrade to Dynamic Disk.
When complete, select the Drive marked Undefined and choose Mirror. When asked which partition to mirror, choose the System partition. When the operation is complete, the system partition is mirrored, so that all system information, including the boot sector, is available on the mirrored partition.

To break a mirror set, right-click one of the mirrored partitions and select Break Mirror. This stops mirroring but retains the mirrored information on the second drive. If you do not need the mirrored information, right-click the secondary partition and select Delete Mirror. If a mirror set shows Failed Redundancy, right-click the mirror volume and click Repair Volume.

To create a fault tolerant RAID-5 array, also described as a stripe set with parity, add three drives to the system and follow the same instructions used to create the mirror, clicking RAID-5 instead of Mirror. A RAID-5 array protects data in the event of catastrophic disk failure by writing parity information, which is distributed across the drives in the array. When a disk fails, it can be replaced and its contents regenerated from the data and parity on the remaining disks.
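Regeneration from parity can be illustrated with exclusive OR (XOR), the operation on which RAID-5 parity is based. The following Python sketch models a single stripe on a three-disk set. It is a simplified illustration of the principle, not a description of how Windows 2000 lays out data on disk, and the block contents and the `xor_blocks` helper are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across a three-disk set: two data blocks plus their parity block.
data_a = b"PAYROLL1"
data_b = b"INVOICE7"
parity = xor_blocks([data_a, data_b])

# Simulate losing the disk that held data_b and regenerate it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)
```

Because XOR is its own inverse, any single missing block in a stripe can be rebuilt from the others, which is why the array tolerates the loss of one disk but not two.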

You can use the Disk Management snap-in to perform other tasks, including managing drive letters and file and folder compression.

To manage drive letters

Right-click a selected drive or partition, and then click Change Drive Letter and Path.
When the Change Drive Letter and Path dialog box appears, click Add to mount the drive at an empty folder on an NTFS volume so that it can be accessed through a path on that volume.
Click Edit to change the drive letter for the volume or partition.

Note: If you change the drive letter or path for the system or boot partitions, the system may become unusable.

To compress a volume to save space

Right-click the selected drive or partition, click Properties, and then click the General tab.
Click Compress drive to save disk space.

Note: This is not recommended for system or boot partitions.

To perform error checking or to defragment a volume or drive

Right-click the selected drive or partition, click Properties, and then click the Tools tab.
Click Error Checking, select the appropriate disk options, and click OK.

Note: On system drives, you must restart the computer in order to perform error checking because the volume is in use and cannot be dismounted. On large disks, error checking can take a substantial period of time to perform.

To defragment a disk

Click Disk Defragmenter in the Computer Management snap-in, and then select the volume that you want to defragment.
In the Disk Defragmenter dialog box, click Analyze. The Disk Defragmenter provides information on whether or not the drive needs to be defragmented.

Note: Although NTFS is designed to minimize file fragmentation, fragmentation may still occur where there is high file I/O, where the disk is very full, or where applications are often installed and removed.
Click Defragment, and when complete, request the Report and save into a log directory as a benchmark for future reports. This will assist you in identifying disk problems and can act as an early warning sign for disk failure.

Note: Make certain that Disk Management is used on an on-going basis to audit drives so that disk failures do not cause the server to fail.

Backup Tasks

In this section, you will perform the following tasks:

Back up a drive to file and tape.
Perform a test restore.
Create an Emergency Repair Disk.
Run the Error Checker.
Back up the System State.

One of the most critical aspects of any disaster recovery plan is providing for backup. In the Windows 2000 operating system, the new Backup utility allows you to back up individual folders, volumes, or systems to tape or to disk. The scheduler is now built in.

Note: Make certain when purchasing tape backup units that they have the capability of backing up and storing system attributes and the registry as well as data.

To back up a folder to tape

On the Start menu, point to Programs, point to Accessories, then point to System Tools, and click Backup.
Insert a tape into the tape drive.
On the Backup page, click Backup Wizard. When the Backup wizard welcome page appears, click Next.
Select Back up selected drives, files or network data, and click Next. Select a directory to back up so that a check mark appears, and then click Next.
On the Where to Store the Backup page, make certain the tape drive is selected and the correct media is specified, and then click Next.
On the Completing Backup Wizard page, click Advanced, verify that Normal is selected, and then click OK.
When the backup is complete, click Report to see if there were errors or skipped files, then close the report and click Close.

To back up a folder to file

On the Start menu, point to Programs, point to Accessories, then point to System Tools, and click Backup.
On the Backup page, click Backup Wizard.
Click Back up selected drives, files or network data and then click Next. Select a directory to back up so that a check mark appears, and then click Next.
On the Where to Store the Backup page, verify that File is specified and the Backup media or file name is pointing to the destination directory including the backup file name.
When the backup is completed, click Report in the Backup Progress dialog box to see if there were errors or skipped files, close the report, and then click Close.

One of the most important and often ignored data management tasks is testing a backup to verify that media and hardware are operating correctly.

To test and restore a backup

Create a new file folder in Windows Explorer with a unique name. To open Windows Explorer, click Start, point to Programs, and then click Windows Explorer.
On the Start menu, point to Programs, point to Accessories, point to System Tools, click Backup, and then click Restore Wizard.
Click the plus sign in front of File, and check the file to be restored.
In the Restore files to list, click Alternate location. In the Alternate location box, browse for the path to the newly created folder.
Click Start Restore and click OK. The file should now restore to the alternate location.
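After a test restore, it is worth confirming that the restored files actually match the originals rather than only checking that the restore finished. The following Python sketch compares two folder trees by SHA-256 digest. It is a generic check that does not interact with the Backup utility, and the function names and the throwaway demonstration folders are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original: Path, restored: Path):
    """Return (relative path, problem) pairs for files missing or different after restore."""
    problems = []
    for source in sorted(p for p in original.rglob("*") if p.is_file()):
        relative = source.relative_to(original)
        target = restored / relative
        if not target.is_file():
            problems.append((str(relative), "missing"))
        elif file_digest(source) != file_digest(target):
            problems.append((str(relative), "differs"))
    return problems


# Demonstration with throwaway folders standing in for the original data
# and the alternate restore location.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp, "original")
    restored = Path(tmp, "restored")
    original.mkdir()
    restored.mkdir()
    (original / "a.txt").write_text("payroll")
    (original / "b.txt").write_text("invoices")
    (restored / "a.txt").write_text("payroll")
    demo_problems = compare_trees(original, restored)

print(demo_problems)
```

An empty result means every file in the original tree was found in the restored tree with identical contents. Files that changed after the backup was taken will be reported as different, so run the comparison against data that has not been modified since the backup.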

In Backup, you also need to create an Emergency Repair Disk. This needs to be updated each time your system configuration changes, including updates to hardware, Service Packs, and applications that install services.

To create the Emergency Repair Disk

Open Backup, and click Emergency Repair Disk.

Note: It is always a good idea to select the option to back up the registry to the Repair directory as well.
Insert a blank floppy disk into drive A, select the option to back up the registry, and click OK.
Store the Emergency Repair Disk in a safe location for emergency restore procedures.

Another new feature in the Windows 2000 operating system is the ability to back up the boot files, COM+ Class Registration, and the registry from within Backup.

To back up system state data

Open Backup, and click the Backup tab.
In Click to select the check box for any drive, folder, or file that you want to back up, click the box next to System State. This will back up the System State data along with any other data you have selected for the current backup operation.

Store this file on a separate drive with the other system configuration information so that you can restore the data from another system in the event of disk corruption.

Other Administrative Tasks

In this section, you will perform the following tasks:

Record device and system configurations.
Save and configure Event Viewer.
Install an uninterruptible power supply.
Manage systems services.

Other administrative tasks that can aid you in recording information in order to recover a system configuration, improve system utilization, recover, or debug a system include:

Getting a snapshot of the devices and services.
Managing the logs in Event Viewer.
Using Remote Storage.
Installing a UPS.
Setting the correct options in System Start Up and Recovery.

To record device and system configurations

Click Start, click Run, and type MMC.
When the console is loaded, click Add/Remove Snap-in on the Console menu.
In the Add/Remove Snap-in dialog box, click Add, and then click Computer Management in the Add Standalone Snap-in dialog box.
Click the plus sign next to Computer Management, click the plus sign next to System Tools, and then click Device Manager to view all installed devices, as illustrated in Figure 2.

Figure 2: Using Device Manager to view installed devices
On the View menu, click Print.

This gives you a snapshot of all devices present in the system and all memory and interrupt addresses.

To view Application, Security, and System log files

In the Computer Management snap-in, click Event Viewer to view the Application, Security, and System log files.
To customize these settings, right-click the log file you wish to change, and then click Properties. (For Windows 2000 Advanced Server security, make certain that Do not overwrite events is selected, as illustrated in Figure 3.)
To exit, click Apply and OK.

Figure 3: Correct setting for Advanced Server security

Each week the event logs need to be saved to a remote location, preferably in the same location as the change configuration information and the device snapshot.

Another feature in the Windows 2000 Server operating system is Remote Storage. In order to use this feature, you must have an approved tape library directly connected to the server. Remote Storage allows you to store archival files on a media library in order to better use disk space on servers.

To enable Remote Storage

Click Start, point to Administrative Tools, and then click Remote Storage.
Follow the wizard's instructions for configuration and management of the service.

In order to enable this feature, the administrator has to index the data on the original volume, using the Windows 2000 Indexing Service. To open Indexing Service, open the Computer Management snap-in, and in the console tree, click Indexing Service. After indexing is complete, by right-clicking the volume, the administrator can migrate files in Properties (that is, the administrator can indicate that files need to be moved if they are not accessed in n days; also an administrator can designate a target percentage of free space on the volume, etc). For more information about Indexing Service, read the "Setting up Windows 2000 File/Print/Web Services" scenario.
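The two migration criteria mentioned above, an idle period of n days and a target percentage of free space, can be combined as in the following Python sketch. It illustrates only the policy described in the text and makes no claim about how Remote Storage itself selects files. The `FileInfo` structure, the function name, and the choice to migrate the longest-idle files first are assumptions.

```python
from typing import List, NamedTuple


class FileInfo(NamedTuple):
    name: str
    size_mb: int
    days_since_access: int


def migration_candidates(files: List[FileInfo], min_idle_days: int,
                         capacity_mb: int, free_mb: int,
                         target_free_percent: int) -> List[str]:
    """Pick idle files, longest idle first, until the free-space target is met."""
    needed_mb = capacity_mb * target_free_percent / 100 - free_mb
    idle = sorted((f for f in files if f.days_since_access >= min_idle_days),
                  key=lambda f: f.days_since_access, reverse=True)
    chosen = []
    for info in idle:
        if needed_mb <= 0:
            break  # target already reached; leave remaining files in place
        chosen.append(info.name)
        needed_mb -= info.size_mb
    return chosen


sample = [
    FileInfo("old.doc", 300, 200),
    FileInfo("new.doc", 500, 2),
    FileInfo("mid.doc", 300, 90),
]
# 1000 MB volume with 100 MB free and a 30 percent free-space target.
print(migration_candidates(sample, 60, 1000, 100, 30))
```

Files accessed more recently than the idle threshold are never selected, even if the free-space target cannot be met without them.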

Installation of a UPS device has become much easier in Windows 2000.

To install a UPS device

Click Start, point to Settings, and click Control Panel.
Double-click Power Options, and then click the UPS tab.
If your UPS device is not recognized, click Select, and select the Manufacturer and Port.
After installation is complete, click Configure to set alerts, specify a shutdown script to run, and set the time limit before shutting down.

For complete information, refer to the Best Practices section in online Help.

Once installed and configured, it is important to test your UPS configuration.

To test a UPS configuration

Simulate a power failure by disconnecting the power to the UPS device.
Wait until the UPS battery reaches a low level, at which point system shutdown should occur.
Restore power to the UPS device.
Check the System log in Event Viewer to ensure that all actions were logged and that there were no errors.

After initial installation and after any system modification, you also need to record the list of services running on the server and save this to a common share.

Open the Computer Management snap-in, click Services and Applications, and then in the right pane, double-click Services.
On the Action menu, click Export List and in the Save as box, enter the server and share where you are storing all server reference files and create a file name containing the computer name and date.
If a service fails or does not respond, in the right pane of the console, right-click the appropriate service, and click Restart.
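Saved service lists are most useful when a later list can be compared against an earlier one. The following Python sketch compares two snapshots and reports services whose status changed. It assumes the exported file is tab-delimited text with a header row containing Name and Status columns; verify this against your own export before relying on it. The function names and sample data are hypothetical.

```python
import csv
import io


def load_services(text: str) -> dict:
    """Map service name to status from tab-delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Name"]: row["Status"] for row in reader}


def changed_services(before: str, after: str) -> dict:
    """Return {name: (old status, new status)} for services that differ."""
    old, new = load_services(before), load_services(after)
    return {name: (old.get(name), new.get(name))
            for name in sorted(set(old) | set(new))
            if old.get(name) != new.get(name)}


# Hypothetical snapshots; the column layout is an assumption.
baseline = "Name\tStatus\nDNS Client\tStarted\nPrint Spooler\tStarted\n"
current = "Name\tStatus\nDNS Client\tStarted\nPrint Spooler\t\n"
print(changed_services(baseline, current))
```

A service that appears in only one snapshot is reported with None on the side where it is absent.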

One final task to be covered is looking at the System Startup and Recovery options, which allow you to set the correct size and location for a memory dump file, as well as other options. The Startup and Recovery feature also helps you find information on system problems and can alert you when a server has gone down.

To specify Startup and Recovery settings

On the Start menu, point to Settings, and click Control Panel.
In Control Panel, double-click the System icon, and in the System Properties dialog box, click the Advanced tab, and click Startup and Recovery.
In the System Failure portion of the dialog box, click Write an event to the system log, Send an administrative alert, and Automatically reboot.
In the Write Debugging Information list, select Small Memory Dump, which gives the stop code and essential error information, and specify the location on the boot drive where you want to store the dump.

If you get multiple stop screens and need more information to trace the error, choose the full memory dump option. You will need enough space on a selected drive equivalent to your system RAM as well as 20 megabytes of space on the system drive.

Note: For more information on stop screens and decoding stop screens, refer to the Windows 2000 Professional Guide on the installation CD.

Boot Recovery Tasks

In this section, you will perform the following tasks:

Enable boot logging and interpret logs.
Use the Windows 2000 boot menu.
Load the Recovery Console.
Repair the Master Boot Record and boot sector.

If a catastrophic event occurs that prevents the server from starting normally, Windows 2000 has several options to repair and/or recover the system. These include the Emergency Repair Disk, various boot options, and the Recovery Console. From the boot menu, press F8 to access the full menu of boot options.

Here we will review boot logging and the Recovery Console.

On the boot menu, select Enable Boot Logging, which creates a boot log file, ntbtlog.txt, at the system root (C:\winnt). In the event that the system starts to load but cannot finish, you can now access this file in several ways. Complete the boot process and log on so that the file is created.

Return to the boot menu and start the system in Safe Mode. Because the file we are looking for is located on the local system, there is no need to access the network.

After the system has restarted, open Notepad and view ntbtlog.txt. You will see a full list of drivers and services started during the boot process. Save this file as a boot snapshot. This helps in the future since it allows you to see not only which file was loaded last, but also which service or driver was the next to be loaded.
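The comparison between a saved boot snapshot and the log from a failed boot can be sketched as follows. This Python example assumes that each successful load appears in ntbtlog.txt as a line beginning with "Loaded driver "; check that prefix, and the text encoding of the file, against your own log. The function names and the sample log lines are hypothetical.

```python
LOADED_PREFIX = "Loaded driver "  # assumed line prefix; confirm against your own ntbtlog.txt


def loaded_drivers(lines):
    """Return the drivers reported as loaded, in the order they appear."""
    return [line.strip()[len(LOADED_PREFIX):] for line in lines
            if line.strip().startswith(LOADED_PREFIX)]


def next_expected(baseline_lines, failed_lines):
    """Return (last driver loaded in the failed boot, driver the snapshot loaded next)."""
    baseline = loaded_drivers(baseline_lines)
    failed = loaded_drivers(failed_lines)
    if not failed:
        return None, (baseline[0] if baseline else None)
    last = failed[-1]
    if last in baseline:
        position = baseline.index(last)
        if position + 1 < len(baseline):
            return last, baseline[position + 1]
    return last, None


# Hypothetical log contents for illustration.
snapshot = [
    "Loaded driver ntoskrnl.exe",
    "Loaded driver hal.dll",
    "Loaded driver pci.sys",
    "Loaded driver atapi.sys",
]
failed_boot = snapshot[:3]
print(next_expected(snapshot, failed_boot))
```

The second value is the first place to look when a boot stops partway, since it is the driver that a healthy boot would have loaded next.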

In order to replace a corrupt or missing file, you will use the Recovery Console.

To open the Recovery Console

Start the computer with the Windows 2000 CD-ROM.
On the Welcome to Setup page, press R to Repair, and then C to start the Recovery Console. The Recovery Console can also be used to repair the Master Boot Record, the boot sector, or create and format partitions.
Once you select the installation to log on to, you need to log on as administrator. No other user has access to the console.
In the Recovery Console, make certain that the prompt is pointing to the system root (C:\winnt), and rename ntbtlog.txt to ntbtlog.org. The same task can be accomplished on a system level file in order to replace it.

Note: When using the Recovery Console, it is always a good idea to move or rename a file when replacing it. This way the file may still be used at a later time if it is not the root of a problem.

If the system cannot boot from the boot drive, the Recovery Console provides commands, including Fixboot and FixMBR, to repair or replace the boot sector on a drive. These commands can be used on the primary disk, or a secondary disk can be installed in a system to have its boot sector repaired.

Note: Care must be used since certain viruses overwrite the sector between the MBR and the boot sector, redirecting boot to a secondary location. If this has occurred, replacing the MBR can cause permanent loss of the partition information. When unable to boot from a drive always use a virus scanner prior to using the FixMBR command.

Summary

This walkthrough covered several of the disaster recovery and change management aspects relating to Windows 2000. Effective disaster recovery and change management require a combination of system control, data protection, and good backup.

Disk Management gives you the ability to mirror drives or utilize RAID-5 data redundancy, but when the data is mission critical consider using a combination of both hardware and software disk systems. The other part of Disk Management is the maintenance of data integrity by defragmenting files, and error checking the system drives. It is important to defragment and to run error checking any time a Service Pack or system upgrade is applied.

A fully tested backup is the single most valuable recovery tool. Rotating information off site, testing the backup, and backing up system files to a shared drive are the important points to remember. Within the Backup utility is also the Emergency Repair Disk, which is your first line of defense if your system is no longer able to load Windows 2000. This must be updated each time an upgrade, Service Pack, or fix is applied to the server.

Other administrative tasks include the device map, system service maps, event logs, and installation of an uninterruptible power supply. The logs and maps should be kept on the recovery share in order to have a record of a healthy server. Again, whenever a change is made on the server, it is important to update these on the share. The installation of a UPS protects the system and ensures a graceful system shutdown in the event of power failure. Just as with a backup, it is only useful after it has been configured correctly and tested.

Finally, if the operating system fails, Windows 2000 allows the administrator to start the system using various boot menu options and repair options. The administrator gains the ability to boot with or without a network, with minimum drivers and services, and can launch the Recovery Console independently of the operating system. It is important to have a strong administrator password in the Windows 2000 operating system because users of the Recovery Console can modify core files affecting the operating system. When using the Recovery Console, the most important lesson is to always preserve the original files when changes are made, using the rename or move commands rather than delete.
