Plan for document conversion in Office 2010
Applies to: Office 2010
Topic Last Modified: 2011-08-05
Because your business runs on the content that it creates, edits, and consumes every day, you must understand the potential effect of a Microsoft Office 2010 migration on this content. An important question to consider is whether to convert some or all binary Office files to the Open XML format. In many cases, an organization can limit a conversion project to business-critical files and use compatibility mode for the remaining files. Conversely, organizations that have document management processes and retention, auditing, and classification policies are likely to have business needs that require a wide-scale conversion. Organizations that use the 2007 Office system and that have already standardized on the Open XML file format do not have to convert or remediate existing Open XML files for use with Office 2010.
In this article:
Overview of file conversion in Office 2010
Is conversion right for your organization?
Is OMPM the appropriate tool for your organization?
Planning and conversion guidance for OMPM
Overview of file conversion in Office 2010
When you plan a migration to Office 2010, you must consider whether to convert some or all of your organization’s binary Office files, such as .doc, .ppt, and .xls files, to the Open XML format. The Open XML format, which was introduced in the 2007 Office system, is an XML-based file format that provides benefits such as smaller file size and improved information security, compared to binary files. Standardizing on Office 2010 and the Open XML format helps eliminate compatibility issues that can occur when users work in different file formats and in different versions of Office. However, most files in binary format open as expected in Office 2010 when users use compatibility mode. Therefore, not every file must be converted. If compatibility mode meets your business needs for most files, you can reduce the time and effort that is required for conversion by focusing on business-critical files and those that are identified as containing potential compatibility issues.
After evaluating the choices of performing a full file conversion or using compatibility mode, many organizations determine that using the compatibility features in Office 2010 is sufficient for their needs and they skip the bulk conversion process completely. As the migration progresses, Office 2010 users in these organizations have the option of saving new files in Open XML format and converting binary files while they edit them. Users who have not yet migrated to Office 2010 can still edit the Open XML files by using the Compatibility Pack. By using this “convert as you go” method, the most-used files are eventually converted to Open XML by users, and older, unused files remain in binary format.
However, maintaining both binary and Open XML files is not the best solution for all organizations. Some organizations should strongly consider conversion, especially if they are migrating to Office 2010 specifically for new features. This is particularly true for organizations that exhibit the following characteristics:
They use document management products and understand the location and kinds of files that are managed by those products.
They manage documents by using retention, compliance, Information Rights Management (IRM), or auditing policies.
They rely on business-critical documents that would be better managed by having the IT department proactively convert the files.
They used the smaller file size of Open XML files (compared to binary files) as a business justification for migrating to Office 2010.
The conversion process includes identifying and assessing the files to be converted, converting them, and remediating formatting or other issues in the converted documents. Microsoft provides the Office Migration Planning Manager (OMPM) for Office 2010 to help you with the assessment and conversion process. You can use the tools in OMPM to scan Office 97 through Office 2003 files for conversion issues, create reports to help you analyze the scan data, store the scan data, and (optionally) convert binary Office files to the Open XML file formats. This article contains guidance on evaluating OMPM and using it in large environments. You can find more information about OMPM in Office Migration Planning Manager overview for Office 2010, and you can download OMPM from the Microsoft Download Center (http://go.microsoft.com/fwlink/p/?LinkId=204300).
Is conversion right for your organization?
Some organizations have hundreds of thousands, or even millions, of Office documents in binary format. Strong business requirements for conversion must exist before you commit to a conversion project of this size. In addition, you must have a good understanding of where files are stored and the formats in which they are stored. If the business requirements and scope of the project are unclear, you should first determine whether working with compatibility mode meets your business needs for some or all of your organization’s files. The following resources can help you decide:
Read the article Overview of the XML file formats in Office 2010 to learn more about the benefits of the Open XML formats.
Read the article Plan for using compatibility mode in Office 2010 to review the considerations for working in compatibility mode and decide whether the trade-offs are worth the time that you save by skipping the conversion.
The timing and length of your Office 2010 migration project can also affect the business value that you gain by converting files. You gain little value from converting binary files to Open XML during a long migration if only a few users run Office 2010. A long migration might also require multiple conversions because some users of earlier versions of Office continue to create files in binary format. If your business needs dictate a conversion, we recommend that you identify the users who share documents and migrate them as closely together as possible. Users of Office 2003 and previous versions must use the Compatibility Pack to view and edit Open XML files. To review system requirements and download the Compatibility Pack, visit the Microsoft Download Center (http://go.microsoft.com/fwlink/p/?LinkId=204348).
Is OMPM the appropriate tool for your organization?
If you decide to convert binary Office files to Open XML format, you must decide which tool to use for assessment and conversion. Microsoft provides the Office Migration Planning Manager (OMPM) for Office 2010 to help you assess and perform bulk conversions of binary files to the Open XML format. Third-party solutions are also available. The following information about OMPM can help you decide whether OMPM is the appropriate tool for your organization.
OMPM includes the following components:
File scanner: The OMPM File Scanner (Offscan.exe) is a command-line tool that scans files for conversion issues on clients, file servers, and document repositories that can be accessed through WebDAV. OMPM provides two kinds of scans:
A deep scan that you can perform on Office documents to crawl document properties that provide indicators of potential conversion issues. This is the default scan.
A light scan that quickly identifies the Office documents on a user’s computer or network file system.
Database support: OMPM provides a set of batch files that you can use to import the XML log files that are generated by the OMPM File Scanner. The databases that OMPM supports are described in Administrative computer requirements.
Reporting: OMPM provides a Microsoft Access 2010–based reporting solution that produces various reports for your analysis and enables you to define file sets for automated processing.
Bulk conversion: OMPM includes the Office File Converter (OFC), a command-line tool that does a bulk conversion of specific files to the Office 2010 file formats.
Version extraction: OMPM includes the Version Extraction Tool (VET), a command-line tool that extracts multiple saved versions of a document in Word 97–2003 format to individual version files in Office 2010.
Enhancements for OMPM 2010
OMPM is updated for Office 2010 with the following improvements:
Bulk macro compatibility scanning: This feature incorporates logic from the Office Code Compatibility Inspector (OCCI) tool to produce a count of the potential number of VBA issues caused by changes in the object model. A new option in the offscan.ini file enables this scan.
Bulk 64-bit compatibility scanning: This feature incorporates logic from OCCI to produce a count of the potential number of 64-bit (declare) VBA issues that result from using 64-bit Office. A new option in the offscan.ini file enables this scan.
Date filtering for scans: This feature lets you exclude from scanning the files that have not been accessed or modified within a specified period of time.
SQL Server 2008 and SQL Server 2008 R2 support: For additional requirements for using SQL Server 2008 R2, see Administrative computer requirements.
Considerations for choosing OMPM
If you are considering using OMPM, review the following details about OMPM to help you decide whether to use this tool or invest in third-party conversion tools.
Files that OMPM scans
OMPM can scan for files that are created by the following Microsoft Office applications:
For a complete list of file types that are scanned for these applications, see the section “Files scanned by the OMPM File Scanner” in Set up the Office Migration Planning Manager File Scanner for Office 2010.
Files that OMPM does not scan
OMPM does not scan .pdf files or files that are created by the following Microsoft Office applications:
OMPM does not scan documents that are password-protected or Information Rights Management (IRM)-protected. In addition, the OMPM File Scanner does not scan embedded objects within documents. However, it does report that the document contains embedded objects.
Reporting and remediation considerations
For each file scanned, OMPM reports a status of No Issue, Green, Yellow, or Red. Documents flagged as No Issue are the best candidates for conversion. Documents that have a Green status are also candidates for conversion because they contain only minor, cosmetic conversion issues. Documents that have a Yellow or Red status contain known compatibility issues that will cause legacy objects or information either not to convert at all or not to convert as expected.
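The status levels described above lend themselves to a simple first-pass triage. The following Python sketch is only an illustration: the `triage` function, the bucket names, and the sample rows are hypothetical, and the input is assumed to be (path, status) pairs that you have already exported from your own OMPM reporting data. It does not read any OMPM file format.

```python
from collections import defaultdict

# Status labels as named in the article. The grouping rule is an assumption:
# No Issue and Green go to conversion, Yellow and Red go to manual review.
CONVERT_FIRST = {"No Issue", "Green"}
NEEDS_REVIEW = {"Yellow", "Red"}


def triage(scan_rows):
    """Group (path, status) pairs into 'convert', 'review', and 'unknown'."""
    buckets = defaultdict(list)
    for path, status in scan_rows:
        if status in CONVERT_FIRST:
            buckets["convert"].append(path)
        elif status in NEEDS_REVIEW:
            buckets["review"].append(path)
        else:
            # Keep unexpected labels visible instead of silently dropping them.
            buckets["unknown"].append(path)
    return dict(buckets)


# Hypothetical sample rows, not real scan output.
SAMPLE_ROWS = [
    ("budget.xls", "Red"),
    ("memo.doc", "No Issue"),
    ("deck.ppt", "Green"),
    ("policy.doc", "Yellow"),
]
```

Calling `triage(SAMPLE_ROWS)` returns a dictionary with one list of paths per bucket, which you might use to size the manual review effort before committing to a bulk conversion.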
OMPM scan results indicate whether binary Office files will have issues if they are converted to Open XML file format. For each file scanned, OMPM shows the issue type and a brief description of the issue that can occur if the file is converted. OMPM does not provide any specific information about potential compatibility issues if Office files are left in their binary format (97-2003 or older) and are accessed by using compatibility mode.
OMPM does not make any changes to files to fix compatibility issues. In fact, OMPM does not scan for bugs. Instead, it identifies known compatibility issues that can occur when a file is converted to the Open XML format, based on application features that were significantly changed or removed.
When scanning a file for macros, OMPM scans the file against all Office programs to determine which programs might have compatibility issues with the file. This can cause an unexpectedly large number of issues in the Functionality Issue Count column of the Macro Summary tab. Although the count is inflated, you can still use these results to identify the files that will have the most issues after conversion. For a more accurate count of macro issues for a given file, use the Office Code Compatibility Inspector, which is available on the Compatibility for Microsoft Office 2010 page on TechNet.
We recommend running a test conversion of sample files before you conduct a bulk conversion. A user will have to inspect the samples one by one and compare them to the original to determine whether the conversion resulted in any appearance or other issues. You must also plan for additional remediation to fix broken macros and links in converted files because OMPM does not fix these issues as part of the conversion. If you must have automatic remediation as part of the bulk conversion process, we recommend that you evaluate third-party conversion solutions.
Planning guidance for using OMPM in large environments
If you are an administrator who plans to use OMPM in a large environment, the following guidelines will be helpful.
Identify and prioritize files to scan using OMPM
Work with the business group owners to determine which clients, servers, and document repositories to scan by using OMPM. Business groups should also assign priority or importance to each set of files so that you can decide the order in which to scan them. Low-priority file sets can be scanned last, as time allows, or not at all. You can also communicate the scan process to users and ask them to identify which of their files are critical for conversion. By asking the users directly, you make them aware of the upcoming conversion and reduce the number of files to be scanned.
You might find it easier to scan all files and then use the OMPM results to better prioritize files. A common use of OMPM is to assess the quantity, location, potential conversion issues, and last modified date of Office files. The last modified date property of Office files can be especially useful to decision makers because they might find that most Office files have not been accessed in a very long time. Therefore, there might not be a business reason to keep the files. If that is the case, you can clean up network shares by archiving out-of-date Office files to free significant disk space.
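If you want a rough picture of file age before you run OMPM at all, a small script can approximate the last-modified analysis described above. The following Python sketch is not part of OMPM. The `summarize_by_age` function, the 365-day default, and the short extension list are assumptions chosen for illustration, and the script reads only file system timestamps, not document properties.

```python
import os
import time

# Assumption: only the three binary extensions named in this article are counted.
BINARY_EXTENSIONS = {".doc", ".xls", ".ppt"}
SECONDS_PER_DAY = 86400


def summarize_by_age(root, cutoff_days=365, now=None):
    """Count binary Office files under root that are older or newer than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - cutoff_days * SECONDS_PER_DAY
    summary = {"recent": 0, "stale": 0, "stale_bytes": 0}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() not in BINARY_EXTENSIONS:
                continue
            try:
                info = os.stat(os.path.join(folder, name))
            except OSError:
                continue  # Skip files that cannot be read (permissions, broken links).
            if info.st_mtime < cutoff:
                summary["stale"] += 1
                summary["stale_bytes"] += info.st_size
            else:
                summary["recent"] += 1
    return summary
```

For example, `summarize_by_age(r"\\server\share", cutoff_days=180)` (a hypothetical share path) would report how many binary files have not been modified in roughly six months and how much disk space they occupy.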
Prepare users for working with converted files
Microsoft provides several resources that help end-users learn the new user interface and features in Office 2010. Use the following resources to prepare end-users:
Training resources: The User Readiness and Training Resource Center (http://go.microsoft.com/fwlink/p/?LinkID=202000) on TechNet contains an up-to-date list of training resources, such as interactive guides, Office Ribbon guides, e-learning courses, and more.
How-to articles: The topic Document compatibility reference for Excel 2010, PowerPoint 2010, and Word 2010 in the Office 2010 Resource Kit provides links to articles that are written for end-users of Excel 2010, PowerPoint 2010, and Word 2010. The topics that are covered include the following:
Enabling compatibility mode
Converting files to Office 2010 format (exiting compatibility mode)
Running Compatibility Checker
Features that change when you open an Office 2010 file in an earlier version of Office
Tips for scanning large quantities of files by using OMPM
The article Office Migration Planning Manager (OMPM) for Office 2010 provides a starting point for you to begin the actual process of file scanning. If you plan to scan millions of files, use the following guidance to help simplify the scanning process.
Clean up old files before you scan them. This is your opportunity to clean up your file shares and document sprawl.
Do not scan files that have not been modified in the most recent 6 to 12 months. OMPM 2010 supports the exclusion of files that have not been accessed or modified during a specified time. Work with business groups to determine whether scanning old files is necessary. If not, exclude those files from scanning and conversion. After the migration to Office 2010 is complete, create a plan for removal or backup of the documents that are in old file formats.
Focus on business-critical files, especially those that are owned by groups that create complex documents. Work with your business groups to determine which documents are considered business-critical or that must be converted for legal or compliance reasons. If applicable, use document complexity as your basis for prioritizing file scanning. For example, specific business groups, such as Finance, typically have more complex documents that require careful review and testing.
Scan shared folders first. The files in shared folders are more centralized, and the results can be a good sample that you can use to determine whether additional scans are needed. For example, if the files in the shared folder are more than a year old, you might decide not to convert those files.
Run multiple instances of offscan.exe. We recommend that you run multiple instances of the scanner (to scale out) by using a unique RunID (as set in the offscan.ini file) per scan. We advise against setting up a single instance of offscan.exe to run against a large number of files (tens of thousands).
Scan by folder structure. Use the folder structure as the logical unit of files to scan by. For example, if a shared folder has a top level folder and subfolders, run individual scans against the subfolders instead of trying to run the scanner against the whole shared folder.
Run offscan.exe on multiple clients. To scale out the number of concurrent scans, run offscan.exe on multiple clients. This is a good approach because it is likely that no one IT administrator has permissions to all shared folder locations. By using this approach, you can scan millions of Office files in several days.
Do not scale up by using servers to run OMPM. The design of Offscan.exe does not support scaling up, such as using more powerful hardware.
Use System Center Configuration Manager (SCCM) to scan clients. The OMPM user guide provides more details about how to use SCCM to deploy offscan.exe, run the scan, and then collect the scan log XML file. For more information, see Run the Office Migration Planning Manager File Scanner for Office 2010.
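The tips above about scanning by folder structure, using a unique RunID per scan, and spreading scans across clients can be turned into a simple plan before any scan runs. The following Python sketch only builds that plan. It does not run offscan.exe or write offscan.ini, and the function names, the integer RunID values, and the round-robin assignment are assumptions made for illustration.

```python
import os


def plan_scan_units(share_root, first_run_id=1):
    """Return one (run_id, folder) pair per immediate subfolder of share_root.

    Files stored directly in share_root are not covered and would need
    their own scan unit.
    """
    subfolders = sorted(
        entry.path for entry in os.scandir(share_root) if entry.is_dir()
    )
    return [(first_run_id + i, path) for i, path in enumerate(subfolders)]


def assign_to_clients(units, clients):
    """Distribute scan units across client names in round-robin order."""
    assignments = {client: [] for client in clients}
    for i, unit in enumerate(units):
        assignments[clients[i % len(clients)]].append(unit)
    return assignments
```

Each (run_id, folder) pair in the result is a candidate for one scanner instance. You would still enter the RunID and the folder in that instance's offscan.ini file yourself, following the OMPM documentation.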
Tips for dealing with millions of OMPM scan results
Although the initial OMPM assessment might discover millions of files, our OMPM customers tell us that typically only 1-2% of those files are flagged as having potential conversion issues. Here are some tips for making the results more manageable. You can find additional information in the topic Analyze Office Migration Planning Manager Reports for Office 2010.
Ensure that your project plan allows sufficient time for analyzing OMPM results. When reviewing the results, understand that the OMPM report is a “worst-case scenario” of potential issues, but most warnings are not serious enough to prevent conversion.
The Access-based reporting tool can display only databases of one million records or fewer. If you have more files than that, we recommend the following workarounds (shown in order of preference):
Break up the scan into several smaller scans.
Import OMPM scan data into a single large database and query it directly instead of using the Access-based report.
Run one large scan but import subsets of the files into separate databases.
The reports do not determine duplicates. Therefore, attempt to eliminate duplicate files and determine the file of record or master file.
Analyze large Excel files and old Access files. Can these files use the newer Excel and Access Services capabilities? Should these files be migrated to SQL Server?
Pay additional attention to files that have links in them (and the child files, too). Links to other documents can break after the files are converted because the file name extensions change as part of the conversion process.
If your organization uses SharePoint Server, consider working with a third-party vendor to help you migrate your business-critical files to a SharePoint site. The vendor can also help you train users to start sharing documents in SharePoint sites instead of sending them in email messages.
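Because the OMPM reports do not identify duplicates, one way to find exact copies before conversion is to compare file contents by hash. The following Python sketch is not part of OMPM. The `find_duplicates` function is hypothetical, and it detects only byte-for-byte identical files, so documents that differ in any way (for example, in metadata) are treated as distinct.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[file_digest(path)].append(path)
    # Only groups with more than one member are duplicates.
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Each returned group is a set of candidates from which a business owner can pick the file of record. The script makes no decision about which copy to keep.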
Office Migration Planning Manager (OMPM) for Office 2010
Microsoft Office Code Compatibility Inspector user's guide
Overview of the XML file formats in Office 2010
File format reference for Office 2010