Filename instead of title for zip files in search

On SPS 2010, assuming search is configured with the default XSLT and managed properties, if you upload a .zip file to a document library and give it a title that differs from the filename, you will notice that the search results always show the filename rather than the title, unlike other file types.

It seems the 'Title' managed property is always populated with the filename despite the title being mapped by default.

To show the title of the zip file in the search results rather than the filename, you have to make the changes described below.

When investigating the current crawl/index behavior together with Senior Support Escalation Engineer Holger Lutz, we could see that out-of-the-box SharePoint chooses to store the filename in the index as the title for .zip files, instead of the SharePoint Title column:

The filename as title gets returned in the search results:

Here is an excerpt of the debug-output of our troubleshooting simplepi.dll, where we see that the filename gets identified as the title and put into the index:

PID=2 Title Value(LPWSTR) Property: gather, Search= '0B63E350-9CCC-11D0-BCDB-00805FCCCE04', '' , 31, crawled_propid=107 pri=300 Flags: Index Retriev ResPri Dups MD5

In the debug-output we see that the right title gets identified but assigned to a different crawled property called EmbeddedContent:

Text:plugin, ++Gatherer= '0B63E343-9CCC-11D0-BCDB-00805FCCCE04', 'EmbeddedContent' , 31, crawled_propid=101 pri=500 Flags: Index Retriev ResPri Dups MD5 PID=2 Title Lcid 1031 ZipTitle

To get the SharePoint Title column content indexed and correctly returned in the search results for .zip files, the crawled property <-> managed property association has to be changed:


This is how the Title managed property is configured out-of-the-box:

You have to add the EmbeddedContent(Text) property at the following position for the Title managed property to get the title correctly returned in the search results for .zip files:

After a full crawl or an incremental crawl that picks up the zip file(s), the title gets correctly returned in the search results:

Our tests show that the EmbeddedContent(Text) crawled property is empty for other file types such as doc / docx / txt / xls / xlsx / … but gets the title value for .zip files.
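The reason this change should not affect other file types can be illustrated with a small Python sketch. It is only a model, not SharePoint code. It assumes the Title managed property takes its value from the first mapped crawled property that is not empty, following the mapping order, and that EmbeddedContent sits ahead of the property that delivers the filename. The name existing_title_property and the sample values archive.zip and Quarterly Report are hypothetical placeholders. ZipTitle is the title value seen in the debug-output above.

```python
def resolve_managed_property(mapping_order, crawled_values):
    """Return the first non-empty crawled value, following the mapping order."""
    for name in mapping_order:
        value = crawled_values.get(name)
        if value:
            return value
    return None


# Assumed mapping order after the change: EmbeddedContent comes before the
# (hypothetical) property that carries the filename for .zip files.
TITLE_MAPPING = ["EmbeddedContent", "existing_title_property"]

# A .zip file: EmbeddedContent carries the SharePoint title.
zip_item = {"EmbeddedContent": "ZipTitle", "existing_title_property": "archive.zip"}

# Any other file type: EmbeddedContent is empty, so the next property is used.
docx_item = {"EmbeddedContent": "", "existing_title_property": "Quarterly Report"}

print(resolve_managed_property(TITLE_MAPPING, zip_item))
print(resolve_managed_property(TITLE_MAPPING, docx_item))
```

Under these assumptions the zip file resolves to its SharePoint title, while the other document falls through to the property it used before the change.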