How Full Text Search and IFilters Works in SQL SERVER

How Full Text Search and IFilters Works in SQL SERVER

Full-Text Search allows us to run full-text queries against text data in tables. We need to create a full-text index on the table before running full-text queries on a table. The full-text index contains single or multiple columns (character-based) in the table. The data types of columns can be: nchar, nvarchar, char, varchar, text, ntext, xml, image or varbinary (max) and FILESTREAM.  Each column can use a specific language.

Full-text queries use linguistic searches against text data by operating on words and phrases that are based on rules of a particular language such as English. It can contain simple words and phrases or multiple forms of a word or phrase. It returns any documents that contain at least one match (also known as a hit).

What Full-Text Search Does?

Full-text search can be used in a wide range of scenarios such as searching for items on web, searching for case histories in a legal-data repository or matching job descriptions with stored resumes. In a given business scenario, full-text index and queries can be used to meet business goals.

Full-Text Search Queries

We can run full-text queries on the text columns after columns have been added to a full-text index. These queries can be used to search for any of the following:

  • Single or multiple specific words or phrases
  • A phrase or a word where the word begins with specified text
  • A phrase or a word close to another phrase or word
  • Synonymous forms of a specific word

These queries are not case-sensitive. For instance, searching for “Vijay” or “vijay” returns the same results.

A small set of Transact-SQL predicates (CONTAINS and FREETEXT) and functions (CONTAINSTABLE and FREETEXTTABLE) are used in these queries.

Example:

Scenario 1—searching for an employee who has experience working with MVC:

SELECT employee_name, ecn
FROM Employees
WHERE FREETEXT (employee_resume,”MVC”) AND employee_division =’ITS’;

Scenario 2 — searching for a product on web:

SELECT product_code
FROM products
WHERE CONTAINS (product_desc,”Shoes Castle” OR FORMSOF (THESAURUS, ’Shoes’) OR ‘Castle’)
AND product_price< 2000;

Filters (iFilters).

Indexing a document in an image, varbinary (max) or xml data type columns require filters to perform extra processing. The filters should be specific to the document types (.doc, .pdf, .xlsx, .xml, and so forth)

When indexing data stored in an image or varbinary(max) column, the filter, which implements the IFilter interface and extracts text based on the file format specified for that data (for example, MS Word). In some cases, the filter components require the image data or varbinary (max) data to be written out to the filter data folder, instead of being pushed into memory.

How to Install Full Text Search and Filters(iFilters)

The Full-Text Search is an optional component the Database Engine and this is not installed by default. During SQL Server Installation, we need to select the Full-Text Search feature as follows:

2

Once this feature is successfully installed on the SQL server we can see SQL Full-text Filter Daemon Launcher service in the SQL Server Configuration Manager. We need to start this service before using Full-Text Search otherwise the queries will fail.

Configure Full-Text Search

Configure Full Text Search on a particular table in a database as follows:
Once we are connected to the SQL Server expand our database node then Storage.

2

Right click on full-text-catalog then select New Full-text-catalog and this open a popup as below

2

In that popup screen, we have to mention the name as My_DATABASE_NAME_FullTextCatalog. Next we have to specify DBO as the owner for that catalog and then save.

Once we have successfully configured a Full-Text Catalog, next step is now to configure a Full-Text Index. For example we will create a Full-Text Index on the File table that has the Columns as shown in below image.

2

Now go to table design view and right click on a column and select full-text-index. Now a popup will open in which we are to select Full Text catalog name and column name and then close and save the table.

If there is already any data in table we are to rebuild the full-text-index. Select the table and expand with the indexes > Expand the Indexes > Right click on the index > select Rebuild

Install iFilters:
To search binary files we need IFilters, here we install for .pdf and MS Office files.

For PDF files
PDF IFilter 9 for 64-bit platforms:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025
PDF IFilter v6.0 32-bit platforms:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611

For MS Office files
MS Office 2010 Filter Packs
http://www.microsoft.com/download/en/details.aspx?id=17062

Download the filters and install, after installing execute the bellow lines on your SQL Server:

Exec sp_fulltext_service ‘load_os_resources’, 1;
Exec sp_fulltext_service ‘verify_signature’, 0;

Restart the SQL server and if IFilters are installed (.pdf, .docx for example) or not by following query Exec sp_help_fulltext_system_components ‘filter’;

This will show you a table in which a column should contain .docx and .pdf extensions. Our Database is now ready to search in binary files.

Tags:
  • Full Text Search,
  • Full Text Search in SQL Server,
  • IFilters,
  • IFilters in SQL Server,
  • SQL Server,
  • Leave a Reply