question

Sunil-9097 asked ·

To be able to search for text within PDF files

Hi,

I have an ASP.NET web application, and I am working on implementing functionality to search for text within PDF files. The database is SQL Server. I tried out the "Adobe PDF iFilter", which appears to be free. My web application is installed on multiple servers. Does Microsoft recommend the Adobe PDF iFilter, or are there any other options?

Thank you.

azure-webapps

1 Answer

markxa answered ·

If your server runs Windows, it is technically possible to use the iFilter to extract text from the PDFs, set up a SQL Server full-text index, and use that for simple queries. In terms of what's "recommended", though, take a look at Azure Cognitive Search, which is much more powerful and flexible. Amongst other things, it will OCR the text in scanned documents and handle most document types, not just PDFs. It is not free, but it will save a lot of development time!
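
For anyone who wants to try the iFilter route first, here is a rough sketch of what it could look like. It assumes the PDFs are stored in a `varbinary(max)` column next to a column holding the file extension (for example `.pdf`), and that the iFilter is installed and registered so that `.pdf` is listed in `sys.fulltext_document_types`. The table, column, index, and catalog names (`dbo.Documents`, `Content`, `FileExtension`, `PK_Documents`, `DocumentCatalog`) are made up for illustration, as is the helper `build_contains_condition`. The `?` in the search query is a placeholder for a bound parameter, and the exact marker depends on the database driver you use.

```python
import re

# One-time setup, shown as T-SQL text. All object names here are hypothetical.
# TYPE COLUMN tells full-text search which filter (iFilter) to use per row.
CREATE_FULLTEXT_INDEX_SQL = """
CREATE FULLTEXT CATALOG DocumentCatalog;
CREATE FULLTEXT INDEX ON dbo.Documents
    (Content TYPE COLUMN FileExtension LANGUAGE 1033)
    KEY INDEX PK_Documents
    ON DocumentCatalog;
"""

# Search query. The search condition is passed as a bound parameter ('?')
# instead of being concatenated into the SQL text.
SEARCH_SQL = """
SELECT DocumentId, FileName
FROM dbo.Documents
WHERE CONTAINS(Content, ?);
"""


def build_contains_condition(user_input: str) -> str:
    """Turn free text into a CONTAINS condition such as '"a" AND "b"'."""
    # Keep only word characters so that quotes and full-text operators
    # typed by the user cannot end up inside the search condition.
    terms = re.findall(r"\w+", user_input)
    if not terms:
        raise ValueError("search text contains no searchable terms")
    # Require every term to be present in the document.
    return " AND ".join(f'"{term}"' for term in terms)
```

Calling `build_contains_condition("invoice 2021")` gives `"invoice" AND "2021"`, which would be passed as the parameter to `SEARCH_SQL`. This only covers simple "all words must match" queries, and it will not find text in scanned PDFs, since the iFilter approach does no OCR.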



Just a FYI:

I configured a test blob and, from what I can see, it's going to cost AUD$12.00 per day, or about AUD$360 per month.
It would be cheaper to spin up a server VM with a SQL Server database and use that with the Adobe PDF iFilter on a varbinary(max) BLOB column.

I am raising a separate question on this, in case there is a cheaper way.


Thank you for the response.
