No Desktop Left Behind: SMS Troubleshooting Basics
At a Glance:
- SMS architecture
- Troubleshooting with status messages
- Getting package deployment back on track
You’ve been given the task of distributing a software package to some computers in your organization. You’ve already created the package and advertisement in Systems Management Server (SMS) 2003 and set everything in motion for the deployment. But as can happen from time to time, it appears that some of the machines are installing the software and others aren’t. So, how do you go about troubleshooting this issue? A better understanding of the way SMS 2003 delivers software will certainly help you determine the solution. As I begin to explore SMS here, I’ll assume that you have some basic understanding of SMS 2003 architecture, you know your site configuration, and you know how to create collections, packages, and advertisements.
To troubleshoot your problem, you first need to gather some information to narrow the focus. Did some machines get the package while others didn’t? What characteristics do these machines share? Are they on the same site, and are they accessing the same distribution point, the same management point, the same client access point? You’ll need to determine whether the problem lies on the server or the client, so I’ll outline the process for the server-based distribution of the package and the client-based execution of the advertisement. I’ll talk about what files to look for and where they should be, and discuss what you can glean from the SMS logs. I’ll use a simple demo example, the Windows Server™ 2003 Management Tools software, to trace the progress of the package and its advertisement down to the client. Keep in mind that SMS includes some very rich Web reporting tools to help with troubleshooting. I just think it’s important to understand the fundamentals.
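As a rough illustration of that narrowing step, the Python sketch below counts failures per shared characteristic. It is only a sketch: the machine names, site code, and distribution point names are invented, and `failures_by` is a hypothetical helper, not part of SMS.

```python
from collections import Counter

def failures_by(records, attribute):
    """Return {attribute value: (failed count, total count)}."""
    totals, failed = Counter(), Counter()
    for rec in records:
        value = rec[attribute]
        totals[value] += 1
        if not rec["installed"]:
            failed[value] += 1
    return {value: (failed[value], totals[value]) for value in totals}

# Hypothetical inventory: every name and value here is invented for illustration.
machines = [
    {"name": "PC-A", "site": "S01", "distribution_point": "DP-1", "installed": True},
    {"name": "PC-B", "site": "S01", "distribution_point": "DP-1", "installed": True},
    {"name": "PC-C", "site": "S01", "distribution_point": "DP-2", "installed": False},
    {"name": "PC-D", "site": "S01", "distribution_point": "DP-2", "installed": False},
]

for attribute in ("site", "distribution_point"):
    print(attribute, failures_by(machines, attribute))
```

In this made-up data every failure sits behind one distribution point, which would point toward a server-side problem rather than a client-side one.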
Before proceeding, make sure you have downloaded the SMS 2003 Troubleshooting Flowcharts. They are invaluable for understanding the process paths in SMS 2003.
The configuration I’ll be using includes two Microsoft® Windows® XP clients (WRK-LON-001 and WRK-LON-002) and one server running Microsoft Windows Server 2003 (LON-DCSMS-01), which will be hosting my server roles, domain controller, SQL Server™ installation, and obviously my SMS site server. Both the clients and the server are on the same subnet, which means they’re in the same SMS site.
I have also created a collection in my demo environment called Admin Systems. I used direct rules in the collection membership to add both of the workstations to the collection. This is the collection I will advertise to.
I’ll first create a package on the SMS site server for the Windows Server 2003 Administration Tools Pack, Adminpak.msi. Although my demo machine environment has all of the server roles in SMS hosted on a single machine, the principles of how the files get to the locations will be the same as they would be if the SMS roles were split onto other servers. Status messages and log files are also the same—in that case I would just have to check the logs and status messages on multiple machines. As you can see in Figure 1, the package ID is LON00001.
Figure 1: Windows Server 2003 Administration Tools Package
Figure 2: Package Status
Let’s take a look at the status messages that show the package being created and sent to the distribution point. To see this you need to drill down into the System Status folder, then into the Package Status folder in the SMS Admin Console. You can see in Figure 2 that the package has been targeted and installed on one distribution point. If you drilled down into the LON-London site icon that appears below the package description you would see all of the distribution points that had been selected for this package. In this example I only have one distribution point, which is also my site server, LON-DCSMS-01.
Although you can see that the package was installed to the distribution point successfully, let’s take a close look at the status messages by right-clicking on the package and clicking Show Messages | All; the status message viewer will open. There you can see several status messages (see Figure 3). Obviously in a more complex environment there would be many more, but in this case you have all the messages you need to see.
Figure 3: Package Status Messages
The first message highlighted is Message ID 30000. If you double-click on the message it will show the status message details like the ones in Figure 4. As you can see, I was logged on as CONTOSO\administrator when I created the package.
Figure 4: Status Message Details
If you click the Previous button and scroll through the status messages you will see six 30003 messages. These just indicate that the package has different programs created with it, in this case "Per-user attended," "Per-user unattended," "Per-system attended," and so forth. Now look at the Milestone Message, ID 2300. This message indicates that Distribution Manager is about to begin processing the package, which it does, and then follows up with 2311 and 2301 Message IDs.
Next you’ll find the Message ID 30009 followed closely by a message ID of 2300 indicating that Distribution Manager is beginning to process the package. Assuming all goes well with each distribution point assigned, this package will have associated 2342, 2329, and 2330 messages. The 2330 Message ID will state:
Message ID 2330 SMS Distribution Manager successfully distributed package "LON00001" to distribution point "["Display=\\LON-DCSMS-01\"]MSWNET:["SMS_SITE=LON"]\\LON-DCSMS-01\"
The final status for this package is the 2301 Message ID, which would be generated once all the distribution points in the environment have successfully received the package (remember I only have one distribution point configured):
Message ID 2301 SMS Distribution Manager successfully processed package "Windows Server 2003 Administration Tools Pack" (package ID = LON00001).
So there you have it—a successful distribution of the package.
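The reasoning above can be sketched in Python. This is a minimal sketch that assumes you have already collected the message IDs for one package into a list. Only the three IDs whose meaning is stated above are interpreted, and `summarize_distribution` is a hypothetical helper, not an SMS API.

```python
# Meanings as described in the walkthrough; other IDs are ignored here.
DISTMGR_MESSAGES = {
    2300: "Distribution Manager began processing the package",
    2330: "Package distributed to a distribution point",
    2301: "Distribution Manager finished processing the package",
}

def summarize_distribution(message_ids, expected_dp_count):
    """Classify a package's distribution from its status message IDs."""
    started = 2300 in message_ids
    dp_done = message_ids.count(2330)  # one 2330 per distribution point
    finished = 2301 in message_ids
    if finished and dp_done >= expected_dp_count:
        return "complete"
    if started:
        return f"in progress ({dp_done}/{expected_dp_count} distribution points)"
    return "not started"

# The sequence seen in the demo environment, with one distribution point.
demo_ids = [30000, 2300, 2311, 2301, 30009, 2300, 2342, 2329, 2330, 2301]
print(summarize_distribution(demo_ids, expected_dp_count=1))
```

Requiring both a 2301 and one 2330 per distribution point matters because, as in the demo, a 2301 can appear after the first processing pass, before any distribution point has received the files.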
All of these status messages related to the SMS Distribution Manager are also listed under SITE STATUS | COMPONENT STATUS | SMS_DISTRIBUTION_MANAGER, but for my purposes here I can focus on the particular package status messages. Most of the issues you’ll see with the distribution point will be related to lack of disk space, or to SMS Distribution Manager’s inability to contact the distribution point, which could be related to a variety of things such as network connectivity issues, DNS configuration, and so on.
If the status messages don’t give you enough information, you can get much more granular and detailed information from the logs. On the site server, navigate to the logs directory (C:\sms\logs by default) and you can see all the log files SMS has created. Just open up the distmgr.log file by double-clicking on it.
Figure 5: Log File in SMS Trace
Ugh! It opens up in Notepad, which is not very pretty and not very readable. No problem: I installed the SMS Tools on this site server, and one of the included tools is SMS Trace. Go to START | All Programs | SMS Toolkit 2 | SMS Trace. The first time it’s run you will be prompted to make SMS Trace the default viewer for log files. Click Yes and continue. Now you can open the log files from within SMS Trace, or simply double-click the log file in question and it will open in SMS Trace automatically. The resulting view of the log file is much more readable and useful, as you can see in Figure 5. A quick look at the Distribution Point folder on my site server shows that the package files exist. In this case they’re at \\LON-DCSMS-01\SMSPKGC$\LON00001.
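If you would rather script a first pass over distmgr.log before opening it in a viewer, something like the following Python sketch could help. It assumes each log line ends with a `$$<component><timestamp><thread=...>` trailer. Treat that layout, the keyword list, and the sample lines (which are invented, not real SMS output) as assumptions to check against your own logs.

```python
import re

# Assumed line layout: "<text>  $$<component><timestamp><thread=...>"
LINE_RE = re.compile(
    r"^(?P<text>.*?)\s*\$\$<(?P<component>[^>]*)>"
    r"<(?P<timestamp>[^>]*)><thread=(?P<thread>[^>]*)>\s*$"
)
KEYWORDS = ("error", "fail", "cannot", "not enough")  # illustrative only

def find_problem_lines(lines, package_id=None):
    """Return (timestamp, text) for lines that look like problems."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        text = match.group("text") if match else line.strip()
        timestamp = match.group("timestamp") if match else None
        lowered = text.lower()
        if package_id and package_id.lower() not in lowered:
            continue
        if any(word in lowered for word in KEYWORDS):
            hits.append((timestamp, text))
    return hits

# Invented sample lines, shaped like the assumed layout.
sample = [
    "Start adding package LON00001 to distribution point  "
    "$$<SMS_DISTRIBUTION_MANAGER><Mon Jan 01 10:00:00.000 2007><thread=100>",
    "Failed to copy package LON00001: not enough disk space  "
    "$$<SMS_DISTRIBUTION_MANAGER><Mon Jan 01 10:00:05.000 2007><thread=100>",
]
problems = find_problem_lines(sample, package_id="LON00001")
print(problems)

# Against a real file you might write:
# with open(r"C:\sms\logs\distmgr.log", errors="replace") as log:
#     print(find_problem_lines(log, "LON00001"))
```

A keyword scan like this only narrows where to look. The full context around each hit is still best read in SMS Trace.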
Distribution Manager also creates two files and a folder under the folder C:\SMS\inboxes\pkginfo.box:
- LON00001.ico: folder containing the icon file information related to the package
- LON00001.pkg: package program detail information
- LON00001.nal: location of distribution points
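A quick way to confirm those entries exist is sketched below in Python. It assumes the entries are named after the package ID (LON00001, from Figure 1) with the three extensions listed above. `missing_pkginfo_entries` is a hypothetical helper, and the demo runs against a temporary directory rather than a real inbox.

```python
import tempfile
from pathlib import Path

EXPECTED_SUFFIXES = (".ico", ".pkg", ".nal")

def missing_pkginfo_entries(inbox_dir, package_id):
    """Return the expected pkginfo.box entry names that are not present."""
    inbox = Path(inbox_dir)
    return [
        package_id + suffix
        for suffix in EXPECTED_SUFFIXES
        if not (inbox / (package_id + suffix)).exists()
    ]

# Demo against a throwaway directory standing in for pkginfo.box.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "LON00001.pkg").touch()
    (Path(tmp) / "LON00001.nal").touch()
    demo_missing = missing_pkginfo_entries(tmp, "LON00001")

print(demo_missing)  # the .ico folder was never created in this demo

# On the site server you might call:
# missing_pkginfo_entries(r"C:\SMS\inboxes\pkginfo.box", "LON00001")
```

An empty list means all three entries are in place. Anything else tells you which piece Distribution Manager has not written yet.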
If the package needs to be sent to a child site, the Distribution Manager will write a package replication file (RPT) to the Replication Manager’s Inbox. The Replication Manager will send the package to the child site. Since I don’t have a child site I can’t go into the details of this process, but if your package isn’t making it to the child site then you need to start troubleshooting here. In this case, I can see that the package made it to the distribution point, so now it’s time to advertise it to the collection.
Advertising the Package
When you create an advertisement in SMS, a SQL trigger sets SMS in motion with a wake-up file to the Offer Manager’s Inbox. If you have downloaded the SMS 2003 Troubleshooting Flowcharts you can follow the process in the Software Distribution: Advertisements flowchart named SWDistAds. Offer Manager starts processing, and assuming the package is ready (this one is ready quickly because I chose a small package), Offer Manager generates an offer file, in this case LON20000.OFR. Offer Manager also evaluates the collection membership of the selected collection. If the collection contains Users, User Groups, or systems that have the Legacy Client, then it would generate .ins instruction files, and up to three lookup .lkp files (depending on whether the Legacy Client, User, or User Group will receive the advertisement).
Using the SMS 2003 Advanced Client, this is handled differently. My collection contains systems with the Advanced Client so you only have the .ofr file. The policy file for the client is updated on the Management Point.
Let’s see if my advertisement was successful or not. First let’s look at the status of the advertisement itself. In Figure 6 you can see that two clients received the advertisement and two clients started the program, but if you look over to the right you see there is a program error on one of the clients.
Figure 6: Checking Advertisement Status
You need to look at the status messages in more detail. Right-click on the LON site and choose Show Messages. All you can see is that there are several messages for each of the machines in the Admin Systems collection. (I know, I know...there’s a message with a red X at the top of the list and that’s my error, but bear with me while I walk you through the status messages in Figure 7 for the distribution to WRK-LON-001 which you’ll find worked fine).
Figure 7: Advertisement Status Messages
First, in the advertisement messages you see message ID 10002 for WRK-LON-001 showing that the advertisement was received from the site. This is followed by a message ID 10005 showing the program has started on the client—in other words, it’s installing (see Figure 8). Once the program has finished installing it sends back a status via a Management Information Format (MIF) file to the site server. You’ll then see the next status message ID 10009 indicating that it completed successfully (see Figure 9).
Figure 8: Program is Installing
Message ID 10005
Program started for advertisement "LON20000" ("LON00001" - "Per-system unattended").
Command line: "C:\WINDOWS\system32\msiexec.exe" /q ALLUSERS=2 /m MSIUNQ1A /i "adminpak.msi"
Working directory: \\LON-DCSMS-01\SMSPKGC$\LON00001\
User context: NT AUTHORITY\SYSTEM
Figure 9: Installation Successful
Message ID 10009
The program for advertisement "LON20000" completed successfully ("LON00001" - "Per-system unattended"). The success description was "".
User context: NT AUTHORITY\SYSTEM
The program generated an installation status Management Information Format (MIF) file with a status value of Success. For more information, see the documentation for the program you are distributing.
Life is good. It worked to WRK-LON-001. The admin user on that machine isn’t going to be complaining that they don’t have the Administration tools, so you can relax. Wait a minute though. WRK-LON-002 was part of the Admin Systems collection and it doesn’t look like it went through the same experience as WRK-LON-001, so let’s look at the messages in the status viewer for WRK-LON-002.
The first status message you see is a message ID 10002—the advertisement was received. The next message ID 10005 tells me the program was started on the client, and that’s the same as for WRK-LON-001. So it appears all is well. But there’s no message ID 10009 to indicate a successful completion. In this case there’s a message ID 10007. Figure 10 shows the detail in the status message.
Figure 10: Status Detail
Message ID 10007
The program for advertisement "LON20000" failed ("LON00001" - "Per-system unattended"). The failure description was "There is not enough space on the disk.".
User context: NT AUTHORITY\SYSTEM
Possible cause: The program generated an installation status Management Information Format (MIF) file with a status value of Failed.
Solution: For more information about the failure, refer to the documentation for the program you are distributing.
It looks like the installation program sent a MIF file back with the status of Failed. In this case you can actually see in the status message why it failed; the drive is out of disk space. (I purposely built a file to use up all the free space leaving only a few megabytes free so that I could force an error.) In a production environment you either would have that user clean up his drive or put up the money to buy a huge new one.
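The way I read the advertisement messages for each client can be written down as a small Python sketch. It assumes you have the (client, message ID) pairs in hand. The four IDs are the ones discussed above, and `client_states` is a hypothetical helper rather than anything SMS provides.

```python
from collections import defaultdict

def client_states(messages):
    """Map each client to the furthest advertisement state it reported.

    messages: iterable of (client name, status message ID) pairs.
    """
    seen = defaultdict(set)
    for client, message_id in messages:
        seen[client].add(message_id)

    states = {}
    for client, ids in seen.items():
        if 10007 in ids:
            states[client] = "failed"
        elif 10009 in ids:
            states[client] = "succeeded"
        elif 10005 in ids:
            states[client] = "started, no result yet"
        elif 10002 in ids:
            states[client] = "received"
        else:
            states[client] = "unknown"
    return states

# The messages seen for the two demo workstations.
demo_messages = [
    ("WRK-LON-001", 10002), ("WRK-LON-001", 10005), ("WRK-LON-001", 10009),
    ("WRK-LON-002", 10002), ("WRK-LON-002", 10005), ("WRK-LON-002", 10007),
]
print(client_states(demo_messages))
```

A client stuck at "started, no result yet" is worth a closer look too, since it never sent back a completion status.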
That was a fairly quick and simple journey through the software distribution process, and even though I really only covered the basics, these concepts will work for you on a real-world problem, although it will probably be slightly more complex. Make sure you use the troubleshooting charts to guide you through the processes and utilize the status messages and the log files on the site server to gather detailed information about what’s happening on the system. With these tools at hand, you should be able to solve all your software distribution problems and keep your users happy.
John Baker works for Microsoft as an IT Pro evangelist in Atlanta, Ga. John is also a presenter at TechNet conferences and seminars. You can reach him at email@example.com.
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.