Windows Azure Virtual Machines – Gotchas!
This blog post is an attempt to summarize the most common issues we’ve seen Windows Azure Virtual Machines users run into, along with common questions that come up during our discussions with Virtual Machines customers.
1) Virtual machine disappeared or was deleted
When you hit the usage limit on your account, for example with a 90-day trial or the monthly usage allowance from an MSDN subscription, your storage accounts are set to read-only. The VHD files for your VMs remain in the storage account as-is, but any deployed VMs are removed, because deployed VMs reserve capacity even when stopped. Trial account users can enable pay-as-you-go. For accounts with monthly limits, such as an MSDN subscription, you will be able to deploy VMs from the existing VHDs when the next billing cycle begins and the next month of usage is available. I’ve detailed this issue and its mitigations in this blog post.
Important Note: The 90-day trial comes with various quotas, documented here. Once a quota is reached, you will not be able to use the related service unless you enable charges on your subscription. Depending on how you use the trial subscription, you may therefore run out of quota well before the 90 days are up. For example, the 90-day trial comes with 750 small compute hours, which is enough to run one small VM for 750 hours. If you were to create 4 extra large VMs instead, they would run for just over 23 hours before being deleted.
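The quota arithmetic above can be sketched in a few lines. This is only an illustration: it assumes compute hours are metered in small-instance equivalents, one per core per hour, with an extra large VM counting as 8 (an assumption consistent with the 23+ hour figure, not something taken from the quota documentation). The function and table names are made up for this example.

```python
# Assumption: usage is metered in "small compute hours", one per core per hour.
# The core counts below are illustrative, not read from any Azure API.
CORES_PER_SIZE = {"small": 1, "medium": 2, "large": 4, "extra_large": 8}


def hours_until_quota_exhausted(quota_small_hours, vm_size, vm_count):
    """Return the wall-clock hours a group of VMs can run before the quota is used up."""
    burn_rate = CORES_PER_SIZE[vm_size] * vm_count  # small compute hours consumed per hour
    return quota_small_hours / burn_rate


print(hours_until_quota_exhausted(750, "small", 1))        # 750.0
print(hours_until_quota_exhausted(750, "extra_large", 4))  # 23.4375
```

Four extra large VMs burn 32 small compute hours every hour, so the 750-hour quota lasts less than a day.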
2) Underlying VHDs are not deleted by default when you delete virtual machines
VHD files for Azure VMs are blobs in your Azure storage account that are registered as disk objects and attached to a VM when the VM is created (in the case of the OS disk) or when you manually add one (in the case of data disks). Removing a VM only detaches the disk objects from the VM. The disk objects remain, as do the physical VHD files in your storage account (which will continue to incur storage fees).
To delete a virtual machine along with its related resources, delete the virtual machine first, then its disks, and then the VHDs.
To remove disk objects and, if desired, the physical VHD as well, select Virtual Machines, then Disks, highlight the relevant disk, and click Delete Disk. You will then see two options: Delete the associated VHD and Retain the associated VHD.
- To remove both the disk object and the physical VHD in your storage account, select Delete the associated VHD.
- If you want to keep the physical VHD file but just remove the disk object, select Retain the associated VHD.
Screenshot showing the Delete the associated VHD and Retain the associated VHD options that are displayed when clicking Delete Disk in the Disks section within Virtual Machines.
Note: Deleting VHDs may result in errors and may require you to break the lease to recover from the issue. Refer to point 10 for more details.
3) Beware of “Temporary Disk” usage
We have seen a few customers use the D: drive to store important data, not realizing that it is temporary storage.
The Temporary Storage (D:) drive is not persisted to Windows Azure storage the way OS disks and data disks are. Data on the D: drive will be lost when the VM is moved to a different host server, which can happen when the host is updated, when it experiences a hardware failure, or when the VM is resized. The D: drive is only intended for storing temporary data (as indicated by the Temporary Storage volume label on the drive), such as temporary logs or databases. You must not use this temporary disk to store any data that you are not willing to lose. One key benefit that may motivate developers to use the temporary disk is performance: I/O performance for the temporary disk is higher than I/O performance for OS disks and data disks, because its data is not persisted to Windows Azure Storage.
When using the D: drive for SQL TEMPDB, you will need to recreate the target folder when the VM is migrated to a different host, otherwise SQL may fail to start. For more information, see the following article on the TechNet wiki: Change TEMPDB to Temporary Drive on Azure SQL IaaS.
4) Uploading existing VHDs
Customers can use various methods to upload VHDs to Windows Azure Storage. We recommend uploading VHDs with the Add-AzureVHD cmdlet from Azure PowerShell 0.6.9 or later, or with Csupload.exe (1.7 or later) from the Azure SDK.
Please make sure of the following when uploading a VHD for use with Azure VMs:
- A VM must be generalized if its VHD will be used as an image from which you create other VMs. For Windows, you generalize with the sysprep tool. For Linux, you generalize with the Windows Azure Linux Agent (waagent). Provisioning will fail if you upload a VHD as an image that has not been generalized.
- A VM must not be generalized if its VHD will be used as a disk for a single VM (and not as the base for other VMs). Provisioning will fail if you upload a VHD as a disk that has been generalized.
- When using third-party storage tools for the upload, make sure to upload the VHD as a page blob (provisioning will fail if the VHD was uploaded as a block blob). Add-AzureVHD and Csupload handle this for you. It is only with third-party tools that you could inadvertently upload as a block blob instead of a page blob.
- Upload only fixed VHDs (not dynamic, and not VHDX). Windows Azure Virtual Machines do not support dynamic disks or the VHDX format.
Note: Using CSUPLOAD or Add-AzureVHD to upload VHDs automatically converts the dynamic VHDs to fixed VHDs.
- The maximum size of an OS disk VHD is 127 GB. While data disks can be up to 1 TB, OS disks must be 127 GB or less.
- The VM must be configured for DHCP and not assigned a static IP address. Windows Azure Virtual Machines do not support static IP addresses.
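Two items on the checklist above (fixed versus dynamic, and the 127 GB OS disk limit) can be checked locally before uploading with a third-party tool. The sketch below reads the 512-byte footer at the end of a VHD file, following the published VHD format layout (cookie "conectix" at offset 0, current size at offset 48, disk type at offset 60, all big-endian). The function names are made up for this example, and treating "127 GB" as 127 × 1024³ bytes is an assumption.

```python
import struct

VHD_COOKIE = b"conectix"
DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}
MAX_OS_DISK_BYTES = 127 * 1024**3  # assumption: "127 GB" means 127 GiB


def read_footer(path):
    """Return the last 512 bytes of a VHD file, where the footer lives."""
    with open(path, "rb") as f:
        f.seek(-512, 2)
        return f.read(512)


def inspect_vhd_footer(footer):
    """Return (disk_type_name, current_size_in_bytes) parsed from a VHD footer."""
    if len(footer) != 512 or footer[:8] != VHD_COOKIE:
        raise ValueError("not a VHD footer")
    (current_size,) = struct.unpack(">Q", footer[48:56])
    (disk_type,) = struct.unpack(">I", footer[60:64])
    return DISK_TYPES.get(disk_type, "unknown"), current_size


def upload_problems(footer, is_os_disk=True):
    """List checklist violations that are detectable from the footer alone."""
    kind, size = inspect_vhd_footer(footer)
    problems = []
    if kind != "fixed":
        problems.append("VHD is %s, not fixed" % kind)
    if is_os_disk and size > MAX_OS_DISK_BYTES:
        problems.append("OS disk is larger than 127 GB")
    return problems
```

For a local file you would call upload_problems(read_footer("myvm.vhd")), where the file name is a placeholder. Generalization state, blob type and DHCP configuration cannot be seen in the footer and still need to be verified separately.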
5) Unable to re-use the previous DNS name of VM even though VM is deleted
Deleting a VM does not automatically remove the associated cloud service, which is why you cannot reuse the DNS name of the previous VM. To reuse the DNS name, explicitly delete the cloud service in the Cloud Services section of the management portal, with Remove-AzureService from Azure PowerShell, with azure service delete from the Azure CLI tool, or with the Delete Hosted Service API (the terms Cloud Service and Hosted Service are synonymous in this scenario).
Confused about cloud services? Let me explain.
When you create an Azure VM, a cloud service is created automatically with the DNS name you’ve provided for the VM. Currently, the new HTML5 management portal does not list this cloud service under the “Cloud Services” section if it contains a single VM. You will see the cloud service there if it contains no VMs (is empty) or multiple VMs. You can check for the cloud service by using the Azure PowerShell Get-AzureService cmdlet. You can group or connect up to 50 virtual machines under a single cloud service.
6) Securing Windows Azure Virtual Machines
In on-premises environments, security is a critical aspect of building machines and VMs. Building VMs in the cloud is no different, and you must take appropriate measures to protect them. This blog post summarizes the best practices for protecting Windows Azure Virtual Machines.
7) Failed to start virtual machine xxxxxx.
While provisioning virtual machines, you may run into an error similar to the one below.
Failed to start virtual machine xxxxxx.
The operation cannot be performed because the virtual machine is faulted. The long running operation tracking ID was: 8d03b3e18cac33e5a52996b68b6aa16g.
We have one known issue that may cause this behavior, and our internal teams are working hard to get a fix deployed. Until then, you can resize the virtual machine to work around the issue quickly. If the issue persists after resizing, you can open a new forum thread using the forum links provided in the “Production workloads and support during Windows Azure Virtual Machines Preview” section.
Update: Windows Azure Virtual Machines reached General Availability on 16th April 2013. This issue has been resolved. If you are still seeing it, please contact Windows Azure Support.
8) Failed to create virtual machine linuxvm1.
While provisioning Linux-based virtual machines, you may run into an error similar to the one below.
Successfully created virtual machine linuxvm1.
Could not provision the virtual machine linuxvm1.
Linux defines a set of user names that you can't use. If you use one of them, you will see the above error. Refer to https://www.windowsazure.com/en-us/manage/linux/other-resources/user-names-in-linux/ for the list of user names that cannot be used for Linux images.
9) Capture Versus Snapshot
Capturing a VM creates an image (not meant for backup) that can be used to create multiple VMs based on that same image. You can capture a VM using the Capture option in the portal, the Save-AzureVMImage Azure PowerShell cmdlet, azure vm capture in the Azure CLI tool, or the Capture Role API.
For VM backups, Azure VMs have no equivalent to the Hyper-V snapshot feature. However, Azure storage has a blob snapshot feature that allows you to create a backup of the VHD blob in Azure storage. Microsoft does not currently provide a tool for creating blob snapshots, though third-party storage tools such as CloudXplorer include this feature. You can also write custom code that calls the Snapshot Blob API to create a blob snapshot.
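As a rough illustration of what custom code for the Snapshot Blob API has to send, the sketch below builds the request line for the REST operation: a PUT against the blob URL with the comp=snapshot query parameter. The function name and the account, container and blob names are hypothetical, and the authentication, date and version headers that a real request needs are deliberately left out.

```python
from urllib.parse import quote


def build_snapshot_request(account, container, blob):
    """Return the HTTP method and URL for a Snapshot Blob call (auth headers not included)."""
    url = "https://{0}.blob.core.windows.net/{1}/{2}?comp=snapshot".format(
        account, quote(container), quote(blob)
    )
    return "PUT", url


# Hypothetical names, for illustration only.
print(build_snapshot_request("mystorageaccount", "vhds", "myvm-os.vhd"))
```

In practice a storage client library or one of the third-party tools mentioned above takes care of signing and sending the request.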
You can also create a copy of a VHD using Azure storage tools. If the tool uses the 2012-02-12 version or later of the Copy Blob API, it will allow for fast cross-account blob copies, for example to move a VHD between different storage accounts. Most of the commonly used Azure storage tools also allow you to download files to on-premises. This forum post has steps to download the VHD using CloudXplorer. You can use any similar storage tools to perform the same task.
10) How to break the lease on VHDs/blobs?
The Windows Azure platform holds an infinite lease on all the page blobs in your storage account that it considers disks, so that you don’t accidentally delete the underlying page blob, container, or storage account while the VHD is in use by a VM. If you want to delete the underlying page blob, the container it is in, or the storage account, you will need to detach the disk from the VM first, or delete the VM and the associated disk object.
In a few scenarios, you may get errors while deleting VHDs even though no disks or VMs refer to the VHD. In such cases you can manually break the lease using a PowerShell script. Craig Landis has a detailed forum post describing these errors and their workarounds, along with the script.
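The script from the forum post is not reproduced here. As a hedged sketch of the underlying idea, breaking a lease corresponds to the Lease Blob REST operation: a PUT against the blob URL with comp=lease and an x-ms-lease-action header set to break. The function name and the example names below are hypothetical, and the authentication, date and version headers required by the storage service are omitted.

```python
from urllib.parse import quote


def build_break_lease_request(account, container, blob):
    """Return method, URL and the lease header for a Lease Blob 'break' call (auth not included)."""
    url = "https://{0}.blob.core.windows.net/{1}/{2}?comp=lease".format(
        account, quote(container), quote(blob)
    )
    headers = {"x-ms-lease-action": "break"}
    return "PUT", url, headers


# Hypothetical names, for illustration only.
print(build_break_lease_request("mystorageaccount", "vhds", "orphaned-disk.vhd"))
```

Only break a lease once you are sure no disk object or VM still references the VHD, since the lease exists to protect disks that are in use.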
11) RDP connectivity issues
RDP connectivity problems are among the most common issues customers encounter. They can be caused by several different factors, ranging from a simple client firewall issue to a platform issue. Start by eliminating all client-side issues first.
- Investigate client-side firewall issues by pinging the TCP port for the RDP endpoint using tools such as PsPing, PortQry, Telnet, or Nmap. See if your machine allows outbound communication to the RDP endpoint (TCP port 3389 for the first VM deployed to a cloud service, a random ephemeral port between 49152-65535 for additional VMs in the same cloud service). Verify the port on the Endpoints section for the VM in the portal. Since corporate firewalls often block 3389 and/or the ephemeral range (49152-65535), try connecting to the VM from a different network – from home, a wifi hotspot, or a mobile broadband connection.
- Drew McDaniel has a sticky forum post covering common issues such as cached credentials and endpoint-related problems.
- If none of the above resolves the connectivity issue, you can quickly try restarting or resizing the VM to see if the problem goes away. If the problem persists, you can reach out to the support forum.
Important Note: If you change the firewall settings on the VM to block RDP traffic, or if you stop RDP-related services on the virtual machine, you will lose connectivity to the VM. There is no easy way to revert these settings. You would need to follow the steps below (which involve a lot of work) to revert them.
- Download the disk to an on-premises environment
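The client-side port check from the first bullet above can also be scripted when tools such as PsPing or PortQry are not at hand. This is a minimal sketch using only the Python standard library. The function name is made up for this example, and the host name and port in the usage comment are placeholders for your own cloud service DNS name and the public port shown in the Endpoints section.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage (placeholder host name, not a real service):
#   tcp_port_open("myservice.cloudapp.net", 3389)
```

A False result from your corporate network combined with a True result from a different network points to a client-side or corporate firewall rather than a problem with the VM.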
- Create a VM, attach the disk
- Change the firewall setting to allow the RDP traffic or enable the Remote Desktop Services
- Re-upload the disk to Azure and create a new Windows Azure Virtual Machine.
12) Platform updates to VM, restarts, shutdowns.
Windows Azure updates the host OS approximately once every 1-3 months to keep the environment secure for all applications and virtual machines running on the platform. This update process may cause your VM to restart. You can use availability sets to ensure high availability for the applications running on your virtual machines. Managing high availability is detailed here. Mark Russinovich has written a great blog post which explains Windows Azure host updates in detail.
In addition to platform updates, Windows Azure service healing occurs automatically when Windows Azure detects problematic nodes and moves their VMs to new nodes. When this occurs, you lose connectivity to the VM during the service healing process. When you connect to the VM after service healing has completed, you will likely find an event log entry indicating a VM restart or shutdown (either graceful or unexpected).
We recommend that you use availability sets to ensure high availability for the applications running in your VMs. Guidance for managing availability using availability sets can be found here.
13) VM activations
While activating the OS on Windows Azure virtual machines, you may run into the error message “A problem occurred when Windows tried to activate”, with error code 0xC004F074 and the details below.
"The software licensing service reported that the computer could not be activated. No key management service could be contacted"
We are aware of this issue and working towards resolving this issue. Please note that activation status does not impact the services running on the server. Not activating will generate persistent notifications reminding you to activate the server. Services and remote administration are not affected.
Refer to forum post for more details.
Update: A platform bug has been identified and fixed to resolve this issue. If you are still seeing it, the guest firewall may be preventing connections to the activation server. Try disabling the guest firewall and re-enabling it after activation is complete:
1. Export the registry key “HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\SoftwareProtectionPlatform” for backup purposes.
2. Disable the Windows Firewall by running the command “netsh advfirewall set allprofiles state off”, then proceed with activation by running “slmgr /ato”.
3. Re-enable the firewall after activation is completed by running the command “netsh advfirewall set allprofiles state on”.
14) Production workloads and support during Windows Azure Virtual Machines “Preview”.
While some customers have chosen to run production workloads on Windows Azure Virtual Machines, we currently do not recommend that because the feature is still in preview and meant for testing workloads so that they can be migrated easily to Virtual Machines after the feature is moved to GA(General Availability) status.
Also note that during the preview, support is provided via forums only. Support issues should be posted to one of the following forums:
Update: Windows Azure Virtual Machines went into General availability status on 16th April 2013. For any issues related to Virtual Machines, please contact Windows Azure Support
15) Status of VMs after Windows Azure Virtual Machines reaches General Availability (GA) status
Because the Windows Azure Virtual Machines feature remains in preview, requirements may still change before it reaches general availability. We expect that VMs created during preview will continue to run after GA. Note that it is possible that guest OS requirements could change that may require updating the VM to remain supported.
Update: Windows Azure Virtual Machines reached General Availability on 16th April 2013. Virtual machines created during the preview continued to work after GA.
16) VM role Vs Virtual Machines
We have seen a few customers confuse “VM Role” with “Virtual Machines”. In fact, a few of them applied for “VM Role” access when they really wanted access to Windows Azure Virtual Machines. The confusion arises from the naming and the state of the two features.
Virtual Machines – is part of IaaS offering (Stateful, Persistent) and currently in preview . (Update: Windows Azure Virtual Machines went into General availability status on 16th April 2013.)
VM role – is part of PaaS offering (Stateless, non-persistent) and works just like web role, worker role, but with a custom supplied OS image. This feature is currently in beta.
Below articles give you the overview of both features.
Overview of the Windows Azure VM Role
In a nutshell, if you’ve applied for access using old Silverlight portal, you’ve applied for “VM Role”. If you have used account management under new HTML5 portal, you’ve applied for “Virtual Machines”
17) Disks vs Images vs VHDs
Windows Azure Virtual Machines users very often get confused by disks, VHDs, and images, so I wanted to provide a little information here. The descriptions below contain extracts from this MSDN article.
VHD or Virtual Hard Disk
A VHD file is stored as a page blob in Windows Azure storage and can be used for creating images, operating system disks, or data disks in Windows Azure. You can upload a VHD file to Windows Azure and manage it just as you would any other page blob. VHD files can be copied or moved and they can be deleted as long as a lease does not exist on the VHD. For more information about page blobs, see Understanding Block Blobs and Page Blobs.
Disks are logical objects. You use disks in different ways with a virtual machine in Windows Azure. An operating system disk is a VHD that you use to provide an operating system for a virtual machine. A data disk is a VHD that you attach to a virtual machine to store application data. You can create and delete disks whenever you need to.
Note: When you create a disk and map it to a VHD, a lease is created on the mapped VHD. As long as there is a lease on the VHD, it cannot be deleted. We often see customers who delete their VMs and then directly try to delete the VHDs, but run into errors indicating there is a lease on the VHD. I’ve provided more details regarding this issue above (point 2).
An image is a VHD file that you can use as a template to create a new virtual machine. An image is a template because it doesn’t have specific settings like a configured virtual machine, such as the computer name and user account settings. You can use images from the Image Gallery to create virtual machines or you can create your own images.
Note: We often see customers trying to copy a VHD and create image based on the copied VHD. Creating an image requires you to perform sysprep on the OS. More details can be found here
18) Availability set & affinity groups, Connecting VMs – Three different, distinct purposes
Availability Set is a way to achieve high availability for your virtual machines. An availability set is a group of virtual machines that are deployed across fault domains and update domains. An availability set makes sure that your application is not affected by single points of failure, like the network switch or the power unit of a rack of servers. Guidance for managing availability using availability sets can be found here
Affinity groups are the way to group the services in your Windows Azure subscription that need to work together in order to achieve optimal performance.
When you create an affinity group, it lets Windows Azure know to keep all of the services that belong to your affinity group running at the same data center cluster. For example, if you want to keep the services running your data and your code together, you would specify the same affinity group for those services. That way, when you deploy those services, Windows Azure will locate them in a data center as close to each other as possible. This reduces latency and increases performance, while potentially lowering costs. Importance of affinity groups is described here
Connecting VMs – Load balancing
You group multiple VMs together under a single cloud service to distribute the load across them. To do this, choose “Connect to Existing Virtual Machine” when creating the second VM, and then select the cloud service under which you want to group the VMs. This article details how to load balance virtual machines.
19) GA date?
At the time of publishing this blog post, there has been no public announcement about the GA date. Our teams are working to launch GA as early as possible. Stay tuned to the Windows Azure Portal for updated information.
Update: Windows Azure Virtual Machines went into General availability status on 16th April 2013.
Thanks to Abhuday Aggarwal, Avkash Chauhan, Corey Sanders, Craig Landis, Flavio Muratore for reviewing this blog post, providing valuable inputs.