XML Indexes



SQL Server 2005 (and beyond) has a native data type for XML-based data. This is a rich data type that works more like an object than just a blob of text in that you can query for information within the XML itself.

For example, the following T-SQL query might be used to retrieve the Name from an XML column called "employeeXml" within a table called "employee":

SELECT employeeXml.value('/Employee[1]/Name[1]', 'varchar(100)') as NAME 
FROM employee

What's important to note here is that in order to query for data within an XML field, that SQL Server must first parse that XML. In the case of a SELECT statement that operates against many rows of data, this parsing must take place for each row's XML field. The processing requirements of performing this operation may be negligible for a small number of rows, but what if your table had, say, thousands or even millions of rows in it?


Primary XML Index

In order to prevent the need to parse each XML field every time that it is queried, SQL Server permits you to create a Primary XML Index. A preliminary requirement is that the table must have a primary key, but otherwise, the syntax is similar to creating other indexes:

ON table_Name(xml_Column_Name)

This results in an internal system table that contains the parsed results structured in such a way that SQL Server can quickly find any node within the XML. All of the data from the source XML is stored in this internal table, so SQL Server can even reconstruct the original XML document from this shredded version.

But where does this internal system table exist? It doesn't show up in Management Studio under System Tables.

Well, this it true. This table is truly a "hands-off" table that is accessible only to SQL Server itself. However, we can see some information about it by examining the sys.internal_tables table:

SELECT * FROM sys.internal_tables 

One row of this query's results should include the XML_INDEX_NODES record containing information about the internal system table that was generated at the same time that the Primary XML Index was created. This is your proof of existence.

This table in itself is not the index, though. To see the list of XML Indexes that exist within your database, you can query the sys.xml_indexes table:

SELECT * FROM sys.xml_indexes

The "parent_minor_id" field of sys.internal_tables is related to the "index_id" field of xml_indexes. But, since there's no way for you to consume this data directly, just rest assured that XML Indexes are entirely contained within your database itself (and not within the MASTER database, etc).