Salvaging Data from Corrupt Database Tables

Every Relational Database Management System (RDBMS) can experience corruption. Data corruptions occur very rarely, but they do happen, and they will. More on database corruptions and consistency can be found HERE. Restoring from a known good, clean backup is always the recommended way to move away from a corrupted database state and back to normal operations.

 

After eliminating the source of the corruption and restoring from a good, clean backup, a question sometimes arises: “Is there any way to salvage (or rescue, or recover) additional data from that corrupt database that was backed up as a contingency plan?” In other words, is it possible to restore that database to another system and try to recover the data from that restored copy of the database, corruption and all?

Many times there isn’t. But sometimes there is. Sometimes the metadata that describes the internal structures of the database is itself corrupt and prevents either the identification of objects in the database or even the mere opening of the database. This post, however, describes some common techniques that can be valuable when salvaging data from tables affected by physical corruption, in situations where the internal metadata of the database is sufficiently intact to allow access to some of the data.

 

These actions, however, should only be pursued as a ‘last ditch’ data salvage or rescue option. They are just additional steps for getting data out of corrupt tables. Typically, such activity cannot maintain the logical consistency of a database, and thus it should never be used in place of normal RESTORE operations.

 

As a side note on this topic:  What is the relevance of using DBCC CHECKDB in this situation?

This post is generally for situations after the use of CHECKDB. When the CHECKDB output recommends the use of REPAIR_REBUILD, that typically resolves the situation. If REPAIR_ALLOW_DATA_LOSS (RADL) is recommended and used (because it is the only option), then the corrupt data will typically be removed. That is one of the purposes of this post: how to salvage other data from the backed-up copies of the database that still contain the corruption, in order to see whether any additional data can be obtained to help the effort to restore Application consistency to the database.
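For reference, a typical consistency check that surfaces the recommended minimum repair level in its output looks like the following (the database name CorruptSource is a placeholder):

DBCC CHECKDB (N'CorruptSource') WITH NO_INFOMSGS, ALL_ERRORMSGS;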

 

 An Overview of the general approach for Data Salvage is the following:

1. PREPARE: Create a new clean empty database TARGET where the data can be safely moved.

2. INITIAL BASELINE: Get as much of the data from the SOURCE {corrupt database} into the TARGET {new, clean database} as can be moved easily.

3. SCOPE WORK NEEDED: Identify problem tables.

4. SALVAGE WHAT YOU CAN: Work to move over those problematic tables.

5. QUANTIFY RESULTS: Try to identify helpful information on remaining problematic data for the Data Consumers to try to re-establish Logical Consistency.

 

And here is the elaboration of each of them:

1. PREPARE: Create a new clean empty database TARGET where the data can be safely moved.

If at all possible, work to salvage data on a HEALTHY separate physical server instance.  
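As a minimal sketch, the TARGET can simply be a brand-new database on that healthy instance (the name SalvageTarget is a placeholder; specify files and sizes as appropriate for the data being salvaged):

CREATE DATABASE SalvageTarget;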

 

2. INITIAL BASELINE: Get as much of the data from the SOURCE {corrupt database} into the TARGET {new, clean database} as can be moved easily.

Get the corrupt database onto a separate SQL Server instance. Restore it if that works (it doesn’t always). Or use sp_detach_db and sp_attach_db, if that works. Or sometimes make an additional, identical DUMMY database with identical file sizes and locations…and replace the files. Create a new working database to hold all of the recovered data (call it what you want, e.g. DBSalvageData or whatever; I’ll refer to it as just TARGET).
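A hedged sketch of those first two options (file paths, logical file names, and database names are placeholders):

-- Option 1: restore the backup that still contains the corruption
RESTORE DATABASE CorruptSource
FROM DISK = N'D:\Backups\CorruptSource.bak'
WITH MOVE N'CorruptSource'     TO N'D:\Data\CorruptSource.mdf',
     MOVE N'CorruptSource_log' TO N'D:\Logs\CorruptSource_log.ldf';

-- Option 2: attach the existing files
-- (CREATE DATABASE ... FOR ATTACH is the current equivalent of sp_attach_db)
CREATE DATABASE CorruptSource
ON (FILENAME = N'D:\Data\CorruptSource.mdf'),
   (FILENAME = N'D:\Logs\CorruptSource_log.ldf')
FOR ATTACH;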

Try to move the data over from the SOURCE to the TARGET database using either a preferred 3rd-party tool, SSIS, or SQL Server Management Studio. Sometimes you might need to set the corrupt database to EMERGENCY mode in order for the data in the database to be accessible. Sometimes it helps to make an additional copy of the database and run REPAIR_ALLOW_DATA_LOSS (RADL) on it, making sure to identify all objects affected by that repair {e.g. rowcount comparisons and such}.
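If the database will not come online normally, a hedged sketch of forcing it into a readable state (the database name CorruptSource is a placeholder):

ALTER DATABASE CorruptSource SET EMERGENCY;
ALTER DATABASE CorruptSource SET SINGLE_USER WITH ROLLBACK IMMEDIATE;   -- optional: keep others out while salvaging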

Sometimes the data cannot be accessed at all.

 

3. SCOPE WORK NEEDED: Identify problem tables

It is important to identify a list of the tables that had problems, and also the amount of data affected by those problems: both row counts and data volume, if possible. This is important for providing estimates of how close the effort is to completion.
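One hedged way to get those numbers, when the catalog views are still readable, is to pull approximate per-table row counts from the SOURCE (the database name SourceDB is a placeholder); running the same query against the TARGET and comparing the results shows what is still missing:

SELECT s.name AS SchemaName, t.name AS TableName, SUM(p.rows) AS ApproxRows
FROM   SourceDB.sys.partitions AS p
JOIN   SourceDB.sys.tables     AS t ON t.object_id = p.object_id
JOIN   SourceDB.sys.schemas    AS s ON s.schema_id = t.schema_id
WHERE  p.index_id IN (0, 1)          -- heap or clustered index only
GROUP  BY s.name, t.name
ORDER  BY s.name, t.name;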

 

4. SALVAGE WHAT YOU CAN: Work to move over those problematic tables

For each of the identified problem tables, try various things to move the data out from the corrupt SOURCE and over into the clean TARGET. 

These are some of the things that can be tried for each of the corrupt tables:

A)    Determine what you can do, i.e. which of these things work (a sketch of these probes follows this list):

  1. SELECT * FROM CorruptTable
  2. SELECT COUNT(*) FROM CorruptTable
  3. SELECT * FROM sys.indexes for that CorruptTable
  4. SELECT * FROM CorruptTable using an Index Hint
  5. SELECT * FROM CorruptTable using an Index and READPAST and NOLOCK
  6. SELECT * FROM CorruptTable using an Index and READPAST and NOLOCK specifying an ORDER BY
  7. SELECT * FROM CorruptTable using an Index and READPAST and NOLOCK specifying an ORDER BY   adding DESC to the ordering columns to switch direction
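A hedged sketch of what those probes might look like (database, table, index, and column names are placeholders; INDEX(1) denotes the clustered index, and an index name can be used instead):

SELECT * FROM SourceDB.dbo.CorruptTable;                             -- 1
SELECT COUNT(*) FROM SourceDB.dbo.CorruptTable;                      -- 2
SELECT * FROM SourceDB.sys.indexes
WHERE  object_id = OBJECT_ID(N'SourceDB.dbo.CorruptTable');          -- 3
SELECT * FROM SourceDB.dbo.CorruptTable WITH (INDEX(1));             -- 4
SELECT * FROM SourceDB.dbo.CorruptTable
       WITH (INDEX(1), READPAST);                                    -- 5 (NOLOCK can be tried as well)
SELECT * FROM SourceDB.dbo.CorruptTable
       WITH (INDEX(1), READPAST)
ORDER  BY KeyColumn;                                                 -- 6
SELECT * FROM SourceDB.dbo.CorruptTable
       WITH (INDEX(1), READPAST)
ORDER  BY KeyColumn DESC;                                            -- 7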

B)     IF you can get to all the data by any of these means, then you can use INSERT…SELECT to get your data over into the TARGET database. Typically this is just a 3-step process:

1)      Create the table in the TARGET database using  

SELECT * INTO TARGET.schema.tablename

FROM SOURCE.schema.tablename WHERE 0 = 1

to create the table definition and  columns

2)     Create the unique clustered index (if there is one) in the TARGET database to match what was in the SOURCE database.

    {This is helpful for avoiding duplicates during the salvage effort, but it isn’t necessary. The duplicates can be cleaned up later, too. It’s a matter of choice.}

Most application tables typically have unique clustered indexes on them. In SQL Server, Primary Key constraints usually reside on top of unique clustered indexes. In SQL Server, too {for those reading from other RDBMS perspectives}, “clustered” indexes actually represent the physical ordering of the data. Because of this, there can be only one clustered index on a table. And, when a clustered index exists, all other nonclustered (or secondary, or lookup) indexes will contain the clustered index keys at the end of each series of index keys. The system procedure “sp_help” can be useful for identifying existing indexes, primary key specifications, and the clustered property; a short sketch follows.
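A hedged sketch of that inspection and of recreating a matching unique clustered index in the TARGET (all database, table, index, and column names are placeholders):

USE SourceDB;
EXEC sp_help N'dbo.CorruptTable';       -- lists indexes, key columns, unique/clustered properties

USE SalvageTarget;
CREATE UNIQUE CLUSTERED INDEX CIX_CorruptTable_Key
    ON dbo.CorruptTable (KeyColumn);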

3)     Then use        

INSERT TARGET.schema.tablename

SELECT * FROM SOURCE.schema.tablename  -- adding whatever qualifiers are necessary to access the data {see step “A)” above}.

C)     BCP very seldom works when SELECT * doesn’t. But by setting the batch size to 1 and specifying the query parameter, you can get data out this way too.

D)    Create and open cursors on a new, clean image of the CorruptTable when you can get to all of its Primary Key values, and step through each record, moving the data over 1 row at a time.
You can use this with option “A)” and then, by altering the ORDER BY direction and the row-start specification, get most ‘chunks’ of data in the table out.

This option presumes you can get the table structure and Primary Keys built and created in the TARGET.

1)     Work to get all the Primary Key columns over and into the TARGET database for each affected table that is still problematic.

        Generally this can be done through any nonclustered index, via a hint {remembering that when a clustered index exists, all clustered keys will be at the end of the index}.
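A hedged sketch of pulling just the key values out through a surviving nonclustered index into a staging table in the TARGET (database, table, index, and column names are placeholders):

SELECT KeyColumn
INTO   SalvageTarget.dbo.CorruptTable_Keys
FROM   SourceDB.dbo.CorruptTable WITH (INDEX(IX_CorruptTable_SomeColumn), READPAST);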

2)     Create a CURSOR to step through 1 row at a time and pull all the data over for those Primary Key columns, experimenting, as in “A)”, with predicates and hints.

        Sometimes you will need to specify a start row AFTER the last row that failed, i.e. getting one chunk up until a corruption, and then getting another chunk starting at the next row.

3)     SQL Server {T-SQL} syntax for declaring and working with CURSORs can be found in SQL Server Books Online. See the Examples at the bottom of the DECLARE CURSOR topic.

e.g. a minimal runnable sketch, assuming the key values were staged into SalvageTarget.dbo.CorruptTable_Keys as in step 1) above (all database, table, and column names are placeholders):

DECLARE @Key int;                                     -- placeholder type for the key column

DECLARE salvage_cursor CURSOR FAST_FORWARD FOR        -- DECLARE the cursor
    SELECT KeyColumn FROM SalvageTarget.dbo.CorruptTable_Keys;

OPEN salvage_cursor;                                  -- OPEN it
FETCH NEXT FROM salvage_cursor INTO @Key;             -- FETCH NEXT from it

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY                                         -- INSERT one row; a damaged row fails alone
        INSERT SalvageTarget.dbo.CorruptTable
        SELECT * FROM SourceDB.dbo.CorruptTable WHERE KeyColumn = @Key;
    END TRY
    BEGIN CATCH
        PRINT 'Could not salvage key ' + CONVERT(varchar(20), @Key);
    END CATCH

    FETCH NEXT FROM salvage_cursor INTO @Key;         -- FETCH NEXT from it
END

CLOSE salvage_cursor;                                 -- CLOSE the cursor
DEALLOCATE salvage_cursor;                            -- DEALLOCATE it

5. QUANTIFY RESULTS: Try to identify helpful information on any remaining problematic data for the Data Consumers to try to re-establish Logical Consistency

It’s always valuable for all parties, at the end of a data salvage effort, to provide as complete a statement as possible on what data, or tables of data, could not be salvaged. This, of course, is not always possible.

 

 

If none of these work, then at least try to get to the Primary Key columns (if a Primary Key exists). With the Primary Keys of a table, most owners of Application Consistency can usually identify which records have NOT been recovered. And this can be critical information for the end consumers of any lost and no-longer-available data.
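As a hedged sketch of such a statement, assuming the key values were staged into SalvageTarget.dbo.CorruptTable_Keys as in step 4 D) above (all names are placeholders), the keys that were NOT salvaged can be listed with something like:

SELECT k.KeyColumn AS UnsalvagedKey
FROM   SalvageTarget.dbo.CorruptTable_Keys AS k
LEFT   JOIN SalvageTarget.dbo.CorruptTable AS t
       ON  t.KeyColumn = k.KeyColumn
WHERE  t.KeyColumn IS NULL;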