Extracting Microsoft Office Application Properties without automation

Every file created by a Microsoft Office application supports a set of built-in document properties. In addition, you can add your own custom properties to an Office document, either manually or through code. You can use document properties to create, maintain, and track information about an Office document, such as when it was created, who the author is, and where it is stored. The usual way to get or set these properties from code is to automate the Office application.

Take a look at the following links for samples:

http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q303296&

http://msdn2.microsoft.com/en-us/library/4e0tda25.aspx

But what if you are working on a Web-based application and want to avoid automation on the Web server? Microsoft does not recommend server-side automation of Office, so this situation comes up often.

I found a nice workaround to extract Office document properties without using automation: Dsofile, an in-process ActiveX component that lets you read and edit the OLE document properties associated with Microsoft Office files, such as the following:
• Microsoft Excel workbooks
• Microsoft PowerPoint presentations
• Microsoft Word documents
• Microsoft Project projects
• Microsoft Visio drawings
• Other OLE structured storage files

Dsofile does not require those Office products to be installed.

If you are working with a managed application, follow these steps:

  1. Download and install the DSO File control.

  2. Add a reference to InteropDSOfile.dll to your managed Web application.

  3. Create a new Web form and copy the following code.
    <%@ Page Language="C#" %>

    <script runat="server">
        protected void btnLoadFile_Click(object sender, EventArgs e)
        {
            // Nothing to do if the form was posted without a file
            if (!FileUpload1.HasFile)
            {
                return;
            }

            // Define a path to save the file on the server
            string serverTempFilePath = Server.MapPath(@"/yourpath/" + FileUpload1.FileName);
            FileUpload1.PostedFile.SaveAs(serverTempFilePath);

            // Create the DSOFile document and open the saved file read-only
            DSOFile.OleDocumentPropertiesClass oleDocument = new DSOFile.OleDocumentPropertiesClass();
            oleDocument.Open(serverTempFilePath,
                true,
                DSOFile.dsoFileOpenOptions.dsoOptionOpenReadOnlyIfNoWriteAccess);

            try
            {
                // Extract the summary properties
                DSOFile.SummaryProperties summaryProperties = oleDocument.SummaryProperties;
                tbTitle.Text = summaryProperties.Title;
                tbAuthors.Text = summaryProperties.Author;
                tbCompany.Text = summaryProperties.Company;
                tbNumPages.Text = summaryProperties.PageCount.ToString();
                tbWordCount.Text = summaryProperties.WordCount.ToString();
            }
            finally
            {
                // Close the document without saving, even if reading a property fails
                oleDocument.Close(false);
            }
        }
    </script>

    <html xmlns="http://www.w3.org/1999/xhtml">
    <head runat="server">
        <title>DSOFileDemo</title>
    </head>
    <body>
        <form id="form1" runat="server">
            <div>
                <strong>DSOFileDemo</strong><br />
                <br />
                <table border="1">
                    <tr>
                        <td valign="top">File upload:</td>
                        <td>
                            <asp:FileUpload ID="FileUpload1" runat="server" />
                            <asp:Button ID="btnLoadFile" runat="server" OnClick="btnLoadFile_Click" Text="Load File Properties" /><br />
                        </td>
                    </tr>
                    <tr>
                        <td>Title:</td>
                        <td>
                            <asp:TextBox ID="tbTitle" runat="server"></asp:TextBox>
                        </td>
                    </tr>
                    <tr>
                        <td>Author:</td>
                        <td>
                            <asp:TextBox ID="tbAuthors" runat="server"></asp:TextBox>
                        </td>
                    </tr>
                    <tr>
                        <td>Company:</td>
                        <td>
                            <asp:TextBox ID="tbCompany" runat="server"></asp:TextBox>
                        </td>
                    </tr>
                    <tr>
                        <td>Number of Pages:</td>
                        <td>
                            <asp:TextBox ID="tbNumPages" runat="server"></asp:TextBox>
                        </td>
                    </tr>
                    <tr>
                        <td>Word count:</td>
                        <td>
                            <asp:TextBox ID="tbWordCount" runat="server"></asp:TextBox>
                        </td>
                    </tr>
                </table>
            </div>
        </form>
    </body>
    </html>

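  4. Run the Web form, choose an Office document, and click Load File Properties. The text boxes are filled in with the document's title, author, company, page count, and word count.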

You can also extract custom properties using the DSOFile control.
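
As a rough sketch of how that might look, the helper below opens a saved file the same way as the Web form above and collects each custom property as a "name = value" string. It assumes the interop assembly exposes a CustomProperties collection on the document object and a get_Value() accessor on each CustomProperty; check the generated interop in your own project, because the accessor names depend on how the assembly was imported. The class name CustomPropertyReader and the method name Read are made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DSOFile: lists the custom properties
// of an Office file that has already been saved on the server.
public static class CustomPropertyReader
{
    public static List<string> Read(string filePath)
    {
        List<string> result = new List<string>();

        // Open the file read-only, as in the Web form above
        DSOFile.OleDocumentPropertiesClass oleDocument = new DSOFile.OleDocumentPropertiesClass();
        oleDocument.Open(filePath,
            true,
            DSOFile.dsoFileOpenOptions.dsoOptionOpenReadOnlyIfNoWriteAccess);

        try
        {
            // Assumption: CustomProperties is enumerable and each item
            // exposes Name and get_Value() through the interop assembly
            foreach (DSOFile.CustomProperty property in oleDocument.CustomProperties)
            {
                object value = property.get_Value();
                result.Add(property.Name + " = " + Convert.ToString(value));
            }
        }
        finally
        {
            // Close without saving, even if reading a property fails
            oleDocument.Close(false);
        }

        return result;
    }
}
```

From the click handler you could call CustomPropertyReader.Read(serverTempFilePath) and bind the returned list to whatever control you like, for example a ListBox added to the form.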

Have a peek and enjoy!