Zip your IIS log files before transferring with Windows Azure Diagnostics
Summary: If your Windows Azure-hosted website is really popular, the IIS log files can start to get pretty big. To prevent them from filling up your local VM’s quota and to minimise storage size and transfer bandwidth, you can zip the log files before Windows Azure Diagnostics transfers them to blob storage.
One of the many nice things about Windows Azure Diagnostics is that it can automatically transfer your IIS log files to blob storage, ready for you to pull down and analyse using your favourite log analysis tool. This capability is turned on by default in the diagnostics.wadcfg file you get when you create a new Windows Azure Cloud Service using the .NET SDK, and for most sites the default behaviour is just fine. But if your website is really popular, those log files can start to get very large. This could have a few implications:
- IIS log files on Windows Azure Cloud Service VMs are written to your “local resources” area on drive C:. This area has a quota. If you go beyond the quota, your site should keep running, but you may lose log files if they are cleaned up by the diagnostics agent to keep you under the quota.
- Once transferred to a blob container in Windows Azure Storage, you pay per megabyte stored. Even with today’s crazy cheap storage prices, the more you store, the more you pay.
- Similarly, you pay to transfer files from Windows Azure Storage to your local environment – both in dollars (Windows Azure bandwidth costs) and time.
Because IIS log files are plain text with lots of repetition, they compress very well (generally over 90% reduction). Unfortunately, there’s no built-in way to compress IIS log files. But with the awesome power of PowerShell, Windows Azure startup tasks and the Windows Task Scheduler, rolling your own solution isn’t too hard. In fact it’s super easy, as I’ve already written some sample code which you can download here. Disclaimer: This code is provided as-is and should be modified and tested to ensure it meets your requirements. Also, I’m actually pretty rubbish at PowerShell, so I know the code could be improved a lot.
I won’t show all of the code in this blog post, but I will summarise how it works so you can customise it as needed.
First, we’ll need to declare a few things in our ServiceDefinition.csdef file:
- Allocate some space in our local resources area to hold our zipped log files
- Run a startup task which will schedule a PowerShell script to run every hour to zip the log files
- Pass an environment variable to the above script that contains the path to our local resources area
Here are the relevant parts of the file:
    <ServiceDefinition name="ZipLogFileTest" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition" schemaVersion="2013-03.2.0">
      <WebRole name="WebRole1" vmsize="Small">
        ...
        <LocalResources>
          <LocalStorage name="ZippedLogFiles" sizeInMB="1000" />
        </LocalResources>
        <Startup>
          <Task commandLine="Startup\ScheduleLogFileZipAndDeleteTask.cmd" executionContext="elevated" taskType="simple">
            <Environment>
              <Variable name="ZippedLogFilesPath">
                <RoleInstanceValue xpath="/RoleEnvironment/CurrentInstance/LocalResources/LocalResource[@name='ZippedLogFiles']/@path" />
              </Variable>
            </Environment>
          </Task>
        </Startup>
      </WebRole>
    </ServiceDefinition>
Next, we need a batch file that creates the scheduled task. Depending on your needs you may want to tweak the parameters, but in my example the task runs hourly. Note the odd-looking escaped quotes in the /tr argument, which schtasks.exe seems to insist on to avoid errors. Also note that I pass the value of ZippedLogFilesPath as a parameter to the scheduled PowerShell script, because the environment variable isn’t accessible at the time the script runs.
    rem ScheduleLogFileZipAndDeleteTask.cmd
    powershell.exe Set-ExecutionPolicy RemoteSigned -force
    schtasks.exe /create /sc HOURLY /tn ZipAndDeleteLogFiles /tr "\"powershell.exe\" -file \"%~dp0IISLogZipAndDelete.ps1\" %ZippedLogFilesPath%." /RL highest /F /RU SYSTEM >> ScheduleLogFileZipAndDeleteTask.out.log 2>> ScheduleLogFileZipAndDeleteTask.err.log
Now we need our PowerShell script. It’s a little long so if you want to go through it, download the full sample. But here’s the gist of what it does:
- Use the WebAdministration PowerShell module to look up the IIS log path
- Iterate through each folder under this path (as IIS creates a second level of folders under the configured log path)
  - Create a corresponding folder under the ZippedLogFiles resource folder (passed as a parameter to the script)
  - Iterate through each file in the IIS log folder
    - Check if the last-modified date of the file is more than 60 minutes old
    - If so:
      - Zip the individual log file
      - Copy the zipped file to the corresponding subfolder under ZippedLogFiles
      - Delete the original unzipped log file
- Iterate through each folder and file under ZippedLogFiles
  - Check if the last-modified date of the file is more than 1 day (1440 minutes) old
  - If so, delete the file
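To make the steps above more concrete, here is a minimal sketch of the zip-and-clean-up logic. It is not the downloadable sample: the function names Compress-OldLogFiles and Remove-OldZipFiles and their parameters are made up for this illustration, the sketch assumes the log files end in .log, and it takes the IIS log root as a parameter rather than looking it up through the WebAdministration module. It also writes each zip directly into the target folder instead of zipping and then copying.

```powershell
# Illustrative sketch only; function and parameter names are hypothetical.
# ZipFile and ZipFileExtensions require .NET Framework 4.5 or later.
Add-Type -AssemblyName System.IO.Compression
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Zips every *.log file under $SourceRoot\<subfolder> that has not been modified
# for $MinAgeMinutes, writes the zip to $ZipRoot\<subfolder>, and then deletes
# the original log file.
function Compress-OldLogFiles {
    param(
        [string]$SourceRoot,
        [string]$ZipRoot,
        [int]$MinAgeMinutes = 60
    )
    $cutoff = (Get-Date).AddMinutes(-$MinAgeMinutes)
    $folders = @(Get-ChildItem -Path $SourceRoot -Directory)
    foreach ($folder in $folders) {
        # Mirror the IIS subfolder (one per site) under the zip root.
        $targetFolder = Join-Path $ZipRoot $folder.Name
        New-Item -ItemType Directory -Path $targetFolder -Force | Out-Null

        $oldFiles = @(Get-ChildItem -Path $folder.FullName -Filter *.log -File |
            Where-Object { $_.LastWriteTime -lt $cutoff })
        foreach ($file in $oldFiles) {
            $zipPath = Join-Path $targetFolder ($file.Name + '.zip')
            if (Test-Path $zipPath) { Remove-Item $zipPath }

            # One zip archive per log file, containing a single entry.
            $zip = [System.IO.Compression.ZipFile]::Open($zipPath, 'Create')
            try {
                [System.IO.Compression.ZipFileExtensions]::CreateEntryFromFile(
                    $zip, $file.FullName, $file.Name) | Out-Null
            }
            finally {
                $zip.Dispose()
            }
            # Only reached if the zip was written without an exception.
            Remove-Item $file.FullName
        }
    }
}

# Deletes zipped files under $ZipRoot that are older than $MaxAgeMinutes.
function Remove-OldZipFiles {
    param(
        [string]$ZipRoot,
        [int]$MaxAgeMinutes = 1440
    )
    $cutoff = (Get-Date).AddMinutes(-$MaxAgeMinutes)
    Get-ChildItem -Path $ZipRoot -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        Remove-Item
}
```

A scheduled script could then call Compress-OldLogFiles with the IIS log root and the ZippedLogFiles path it received as a parameter, followed by Remove-OldZipFiles on the same ZippedLogFiles path. The one-day retention needs to be comfortably longer than the diagnostics transfer period, so that files are transferred before they are deleted.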
The PowerShell script depends on the System.IO.Compression.ZipFile class which comes with .NET Framework 4.5. Therefore you’ll need Windows Server 2012 (Windows Azure OS Family version 3) VMs to run this script. If you’re using an older OS, you may need to use a different approach to compress the file.
Finally, you need to configure Windows Azure Diagnostics to transfer files from our ZippedLogFiles folder and ignore the standard IIS log files folder. You can do this by modifying diagnostics.wadcfg: remove the IISLogs element and add a DirectoryConfiguration element pointing to the local resource, for example:
    <DiagnosticMonitorConfiguration configurationChangePollInterval="PT1M" overallQuotaInMB="4096" xmlns="http://schemas.microsoft.com/ServiceHosting/2010/10/DiagnosticsConfiguration">
      <DiagnosticInfrastructureLogs />
      <Directories bufferQuotaInMB="1024" scheduledTransferPeriod="PT30M">
        <DataSources>
          <DirectoryConfiguration container="wad-iis-zippedlogs" directoryQuotaInMB="1024">
            <LocalResource name="ZippedLogFiles" relativePath="."/>
          </DirectoryConfiguration>
        </DataSources>
        <CrashDumps container="wad-crash-dumps" />
      </Directories>
      ...
    </DiagnosticMonitorConfiguration>
Now it’s simply a matter of using your app and waiting for the various timers to go off, and you should see your IIS log files beautifully gift-wrapped as zip files in the wad-iis-zippedlogs container in your configured Windows Azure Diagnostics storage account. Please let me know if you find this useful.