Understanding SharePoint - Part 4 - Lock those sites before performing backups to prevent orphaned entries!

This is the fourth post in a series on "Not being misled by what you're seeing :)"

DISCLAIMER: This post may include steps that guide you through issuing statements against your SharePoint content databases. By no means does this mean that you should change anything in the database. This is simply for "READING" values, and even this should be done during off-peak hours.

Isn't this really Orphaned Sites - Part 4?

I almost made this Part 4 of my "Orphaned Sites" series, but decided not to, because I want to stress how important it is to read/write lock your sites before performing ANY site collection level backups.

The Scenario

For whatever reason, you need to restore a specific site collection, or web, etc. It could be that important information was deleted, or you're in full disaster recovery mode.

You kick off a STSADM -o restore of the site collection, and you even include the -overwrite switch, but it fails with the following error message:

    Exception from HRESULT: 0x80040E2F

This HRESULT simply means "Violation of PRIMARY KEY constraint", which generally means one thing :) You already have entries in the database for this site, possibly orphaned, that are preventing the restore from continuing, or... what this post is about :)

But I used the overwrite switch. Shouldn't that have taken care of cleaning up the site first?

You would think so, since all the overwrite switch does is force a delete of the site collection in the database before performing the restore. It's the same process as if you were to delete the site collection manually and then run the restore.

(Of course, if you have orphans on the site :) it may be difficult to actually delete the site, so refer to my "Orphaned Sites" series :))

But the problem here isn't that there was orphaned data to begin with. Rather, "new" data related to the site was added outside of the restore, and that data then causes the restore to fail.

The Reason

The reason in this case may be that while the restore is progressing, the site isn't locked, and therefore users accessing the site are causing the duplicate entries.

For the most part, everything stored along with a site has a unique GUID, so this isn't "duplicate named" documents or duplicate list items, etc., causing this, as those would have a unique GUID associated with their entries. Instead, what it COULD be is entries in the UserInfo table being created for the site collection by means of user visits. When the restore gets to the point where it starts attempting to insert its copy of that same UserInfo data, the exception occurs.

How this can occur is actually quite simple.

  1. When we detect a user hitting a site collection for the first time, we cache information in the UserInfo table for that user, specifically associated with that site collection (see http://blogs.msdn.com/krichie/archive/2006/02/18/534752.aspx for a more technical description of this).

  2. During a restore, we may not be inserting the Site Collection level UserData information yet.

  3. Enough information, however, IS already restored (such as a web's ACL, containing the ACLs of users and groups) to allow authentication to the webs.

  4. The Web was not locked during backup, therefore the lock state of the site is not set during the restore, thus users are able to access the webs in the site.

  5. Because we see no information currently in the UserInfo table for the user, a record is created for the user.

  6. The restore progresses along, and the backed up UserInfo data is inserted into the table, yet because there is already an entry for a user it's trying to restore in the database, the exception is thrown and the restore fails.
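To make the sequence above concrete, here is a minimal Python sketch that replays it against an in-memory SQLite database. This is only an illustration: the table layout, the column names (site_id, login), the sample logins, and the helper functions are all simplified stand-ins I made up for the example, not the real content database schema and not what the restore code actually does internally.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a simplified, hypothetical UserInfo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE UserInfo ("
        " site_id TEXT NOT NULL,"
        " login TEXT NOT NULL,"
        " PRIMARY KEY (site_id, login))"
    )
    return conn

def user_visit(conn, site_id, login, locked):
    """Simulate a first visit: cache a UserInfo row unless the site is locked."""
    if locked:
        return False  # a "No Access" lock keeps the user out, so no row is created
    conn.execute("INSERT OR IGNORE INTO UserInfo VALUES (?, ?)", (site_id, login))
    conn.commit()
    return True

def restore_userinfo(conn, site_id, backed_up_logins):
    """Simulate the restore step that inserts the backed-up UserInfo rows."""
    rows = [(site_id, login) for login in backed_up_logins]
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany("INSERT INTO UserInfo VALUES (?, ?)", rows)
        return "restored"
    except sqlite3.IntegrityError as exc:
        # The stand-in for "Violation of PRIMARY KEY constraint"
        return f"failed: {exc}"

def run(locked):
    """Replay the scenario: a user visits mid-restore, then UserInfo is restored."""
    conn = make_db()
    site = "site-1"
    user_visit(conn, site, r"DOMAIN\alice", locked)
    return restore_userinfo(conn, site, [r"DOMAIN\alice", r"DOMAIN\bob"])

if __name__ == "__main__":
    print("site not locked:", run(locked=False))
    print("site locked:    ", run(locked=True))
```

With the site unlocked, the visit creates the row first and the restore's insert collides with it. With the site locked, the visit never creates a row and the restore goes through.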

If the site collection was read/write locked before the backup was performed, you would not run into this problem, because the lock state of the site would prevent any access to it until the restore was complete, at which point you would go and unlock the site.

Read/write locking sites is actually a recommended best practice from the product group if you are going to run site collection level backups of your data on production sites.

If this is true, then why doesn't STSADM automatically lock the site for you?

Please refer to the official Backing Up and Restoring Web Sites documentation (http://office.microsoft.com/en-us/assistance/HA011608261033.aspx) for an explanation :) But I'll call out the specific reasons here:

Site backup and restore are not designed to be used when the server is under active load.

If a site is in use when the backup operation is run, the data in that site may continue to change throughout the operation. The resulting backup file may be inconsistent with the actual state of the site and, if you restore this file, the restored site or database will be inconsistent as well. To avoid possible inconsistencies, lock the site collection prior to backing up a site in an active farm, using the "No Access" lock. After the backup is complete, set the lock back to "Not locked". For information about locking and unlocking a site collection, see the Managing Locks section of the Configuring Site Collection Quotas and Locks topic.
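If you script your backups, the lock / backup / unlock order described above can be sketched roughly like this in Python. Treat it as a pattern only: set_lock and run_backup are hypothetical callables that you would wire up to whatever you actually use to change the lock state and to run the backup (for example, a wrapper that shells out to STSADM -o backup). The URL and file name are made-up sample values.

```python
from contextlib import contextmanager

@contextmanager
def site_locked(url, set_lock):
    """Hold a "No Access" lock on the site collection for the duration of the block."""
    set_lock(url, "No Access")
    try:
        yield
    finally:
        # Runs even if the backup raises, so the site is not left locked by accident.
        set_lock(url, "Not locked")

def backup_site(url, filename, set_lock, run_backup):
    """Lock the site, back it up, then set the lock back to "Not locked"."""
    with site_locked(url, set_lock):
        run_backup(url, filename)

# Demo with stand-in callables that only record what would happen.
log = []
backup_site(
    "http://server/sites/demo",   # sample URL, an assumption
    "demo.bak",                   # sample file name, an assumption
    set_lock=lambda url, state: log.append(("lock", url, state)),
    run_backup=lambda url, filename: log.append(("backup", url, filename)),
)
for entry in log:
    print(entry)
```

The try/finally is the point of the sketch: a failed backup should not leave your users locked out of a live site. For a restore you would likely want the opposite behavior and leave the site locked until you have verified the result.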

In Conclusion

It's extremely important to lock your sites before backing them up using STSADM. You could encounter problems later due to the inconsistencies that can be introduced, as noted in the aforementioned official documentation, and those problems are not limited to what I'm noting in this post.

If you use SPSiteManager from the SharePoint Utility Suite, it automatically sets the site to the No Access state before performing a backup operation. You can then use the additional -o unlock site operation to unlock your restored sites once the restore is complete. The SPSiteManager repartition operation performs the lock and unlock automatically when moving a site.

Hope this helps!!

- Keith

Previous posts in this series:

Understanding SharePoint - Part 1 - Understanding the SharePoint Portal Server Indexer

Understanding SharePoint - Part 2 - The Infamous Query Plan Bug and The Origins of SPSiteManager

Understanding SharePoint - Part 3 - Just because your database is larger doesn't necessarily mean that your index size will be

Additional reference material:

Backing Up and Restoring Web Sites

SPSiteManager is contained in the SharePoint Utility Suite at: