Document Parser Processing


When you upload, move, or copy a file to a document library, Windows SharePoint Services determines if a parser is associated with the document's file type. If one is, Windows SharePoint Services invokes the parser, passing it the document to be parsed and a property bag object. The parser extracts all the properties and matching property values from the document, and adds them to the property bag object.
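
The lookup-and-invoke step above can be sketched as a small model. This is an illustration only, written in Python rather than against the actual parser interface: the registry `PARSERS`, the function `on_file_added`, and the toy `parse_text_document` parser (which reads `Name: value` header lines) are all hypothetical names that do not exist in Windows SharePoint Services.

```python
from pathlib import PurePath


def parse_text_document(document: bytes, property_bag: dict) -> None:
    """Hypothetical parser: reads 'Name: value' lines up to the first blank line."""
    text = document.decode("utf-8")
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the property header in this toy format
        name, separator, value = line.partition(":")
        if separator:
            # Add the property and its value to the property bag.
            property_bag[name.strip()] = value.strip()


# Hypothetical registry that associates a file type with a parser.
PARSERS = {".txt": parse_text_document}


def on_file_added(filename: str, document: bytes):
    """Model of the server side: find a parser for the file type and invoke it."""
    extension = PurePath(filename).suffix.lower()
    parser = PARSERS.get(extension)
    if parser is None:
        return None  # no parser is associated with this file type
    property_bag = {}
    parser(document, property_bag)  # pass the document and an empty property bag
    return property_bag
```

In this sketch, a file whose type has no registered parser yields `None`, which stands in for the case where no parsing takes place.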

Windows SharePoint Services accesses the document property bag and determines which properties match the columns for the document. It then promotes those properties; that is, it writes each document property value to the matching document library column. Windows SharePoint Services promotes only the properties that match columns that apply to the document. The columns that apply to a document are specified by the following:

  • The document's content type, if one is assigned.

  • The columns in the document library, if the document does not have a content type.

For more information about content types, see Content Types.
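
As a rough sketch of the column-selection rule above, the following Python model chooses the applicable columns and filters the property bag accordingly. The function names `applicable_columns` and `promote` are hypothetical, and using `None` to mean "no content type assigned" is an assumption made for the example.

```python
def applicable_columns(content_type_columns, library_columns):
    """Return the set of columns that apply to a document.

    content_type_columns is None when the document has no content type
    (an assumption of this sketch).
    """
    if content_type_columns is not None:
        return set(content_type_columns)
    return set(library_columns)


def promote(property_bag, content_type_columns, library_columns):
    """Return only the properties that match a column that applies to the document."""
    columns = applicable_columns(content_type_columns, library_columns)
    return {name: value for name, value in property_bag.items() if name in columns}
```

With a content type assigned, a property is promoted only if the content type includes a matching column; without one, the document library's own columns decide.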

Windows SharePoint Services also stores the entire document property collection in a hash table, which can be accessed programmatically by using the SPFile.Properties property. You cannot access the document properties hash table through the user interface.

The following figure shows the document parsing process. In it, the parser extracts document properties from the document and writes them to the property bag. Of the four document properties, three are included in the document's content type. Windows SharePoint Services promotes these properties to the document library; that is, it writes their property values to the appropriate columns. Because the document's content type does not include the Status column, Windows SharePoint Services does not promote the fourth document property, Status, even though the document library includes a matching column. Windows SharePoint Services also writes all four document properties to a hash table that is stored with the document in the document library.
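
The scenario in the figure can be worked through with a small Python model. Only the Status property is named in the text above; the other three property names (Title, Author, Subject) and all values are assumed for illustration, and `LibraryItem` and `store_parsed_properties` are hypothetical names, not part of the object model.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryItem:
    columns: dict = field(default_factory=dict)     # promoted column values
    properties: dict = field(default_factory=dict)  # full property hash table


def store_parsed_properties(property_bag, content_type_columns):
    """Promote matching properties and keep every property in the hash table."""
    item = LibraryItem()
    item.properties = dict(property_bag)  # all properties, promoted or not
    for name, value in property_bag.items():
        if name in content_type_columns:
            item.columns[name] = value
    return item


# Four properties extracted by the parser (names other than Status are assumed).
property_bag = {
    "Title": "Budget",
    "Author": "Kim",
    "Subject": "Finance",
    "Status": "Draft",
}
# The content type includes three of them; the library's own Status column
# does not count because the document has a content type.
content_type_columns = {"Title", "Author", "Subject"}

item = store_parsed_properties(property_bag, content_type_columns)
```

After this runs, `item.columns` holds the three promoted values, while `item.properties` still holds all four, including Status.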

Windows SharePoint Services can also invoke the parser to demote properties; that is, to write a column value into the matching property in the document itself. When Windows SharePoint Services invokes the demotion function of the parser, it again passes the parser the document and a property bag object. In this case, the property bag object contains the properties that Windows SharePoint Services expects the parser to demote into the document. The parser demotes the specified properties, and Windows SharePoint Services saves the updated document back to the document library.

The following figure shows the document property demotion process. To update two document properties, Windows SharePoint Services invokes the parser, passing it the document to be updated, and a property bag object containing the two document properties. The parser reads the property values from the property bag and updates the properties in the document. When the parser finishes updating the document, it passes a parameter to Windows SharePoint Services that indicates that it has changed the document. Windows SharePoint Services then saves the updated document to the document library.
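
The demotion round trip can be sketched in the same simplified way. In this Python model a document is represented as a dictionary of its embedded properties, and the Boolean return value of the hypothetical `demote_properties` function stands in for the parameter the parser uses to report that it changed the document. Neither function name comes from Windows SharePoint Services.

```python
def demote_properties(document: dict, property_bag: dict) -> bool:
    """Hypothetical parser demotion: write property bag values into the document.

    Returns True if any embedded property was actually changed.
    """
    changed = False
    for name, value in property_bag.items():
        if document.get(name) != value:
            document[name] = value
            changed = True
    return changed


def demote_and_save(library: dict, filename: str, column_values: dict) -> bool:
    """Model of the server side: invoke demotion and save only if the document changed."""
    document = dict(library[filename])  # work on a copy of the stored document
    changed = demote_properties(document, column_values)
    if changed:
        library[filename] = document  # save the updated document to the library
    return changed
```

A second call with the same column values returns `False` in this sketch, so the stored document is left untouched when there is nothing to update.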

See Also

Concepts

Custom Document Parsers

Mapping Document Properties to Columns

Document Parsing and Content Types