ASHIS KUMAR DAS: August 2011

1. Turn on the execution plan, and statistics

The first thing you need to do is to use the tools that help you determine whether a query done one way is better than another. That’s what we’re trying to do. By comparing the original query to a new query that we come up with is the best way to evaluate the benefits of any changes.

To do this, go into Sql Server Management Studio and select the Query menu. Select the “Include Actual Query Plan.” This turns on the graphical Execution Plan when you execute a query, and that can be found in the bottom pane after the execution.

In the Execution Plan, you can mouse over the components of the plan and it provides tooltip information boxes. The box contains Estimated Subtree Cost, which can be used to help determine whether one query is better than another. Of course, it’s not always right, as some query parts are not included in the execution plan, but it helps. It is also helpful to know the estimated number of rows, which is also found in this tooltip box.

Next, turn on statistics. Type the following statement:

SET STATISTICS IO ON;

This causes statistics to be output to the Messages tab in the bottom pane. The information you want here is mainly logical reads and physical reads. Logical reads are page reads from memory. Physical reads are page reads from disk. This stat can be a little deceptive as it doesn’t include CPU in the metric, but in general, the less page reads, the less work done, and so the more performant the query will be.

To counteract the above two you should also compare the actual execution times. To do this, eecute the following statement:

SET STATISTICS TIME ON;

This also has issues, as blocking and contention issues affect the output time. You should execute the query a few times to determine how accurate the time shown is.

2. Use Clustered Indexes

Having the clustered index on the primary key is sometimes not the most efficient place for the clustered index to be. A clustered index is the most performant type of index. The whole table is sorted according to the clustered index. If the table is involved in lots of joins based on the primary key, it is probably the right place for it to be, but if you are continually filtering or grouping on other columns in a table, then you should possibly consider changing the primary key index to Non-Clustered, and putting the clustered index on those filtered or grouped columns.

The following statement removes and existing clustered index on the primary key and replaces it with a non-clustered index:

ALTER TABLE MySchema.SalesOrderHeader

DROP CONSTRAINT PK_SalesOrderHeader

ALTER TABLE MySchema.SalesOrderHeader

ADD CONSTRAINT PK_SalesOrderHeader

PRIMARY KEY NONCLUSTERED(SalesOrderID);

Then the following statement adds a new clustered index to a table.

CREATE CLUSTERED INDEX MyClusteredIndex

ON MySchema.SalesOrderHeader (OrderID)

3. Use Indexed Views

Indexed Views have been around for a while. A view is like a named query, and these days you can add indexes to them. If used correctly, they can cause a massive improvement in execution times, often better than a clustered index with covering columns on the original table. Also, in SQL Server Developer Edition and Enterprise Edition, a view index will also be automatically used if it is the best index even if you don’t actually specify the view in your query!

CREATE VIEW MySchema.SalesByCustomer

WITH SCHEMABINDING

SELECT soh.SalesTerritoryID, soh.CustomerID,

SUM(sod.Quantity * sod.UnitPrice)

FROM MySchema.SalesOrderHeader soh

INNER JOIN MySchema.SalesOrderDetail sod

ON (soh.SalesOrderID = sod.SalesOrderID)

GROUP BY soh.SalesOrderTerritory, soh.CustomerID

Note the use of the schema binding attribute. This prevents you from changing underlying tables while this view exists, and is necessary if you want to add an index. Some people avoid indexed views for this reason, as the maintenance becomes more complicated as further dependencies to the view are created. The following statement adds an index:

CREATE UNIQUE CLUSTERED INDEX IdxSalesOrderView

ON MySchema.SalesByCustomer(

SalesTerritoryID, CustomerID

)

4. Use Covering Indexes

Covering indexes are a feature that was newly added to SQL 2005. Basically, you can create an index optimised for the query itself based on joins, filters and grouping, and then add additional columns that can be retrieved directly from the index for use in select statements, as follows:

CREATE NONCLUSTERED INDEX TestIndex

ON MySchema.SalesOrderDetail(OrderId)

INCLUDE (Quantity, UnitPrice)

The above statement causes a non-clustered index to be created on the SalesOrderDetail table. If queries are executed on the OrderId column, the index will be used, and if the only other columns being retrieved are Quantity and UnitPrice, then the query optimiser doesn’t need to retrieve any extra columns from the underlying table. It can just use the index. Because the query optimiser doesn’t need to query the original table, performance is improved.

5. Keep your clustered index small.

One thing you need to consider when determining where to put your clustered index is how big the key for that index will be. The problem here is that the key to the clustered index is also used as the key for every non-clustered index in the table. So if you have a large clustered index on a table with a decent number of rows, the size could blow out significantly. In the case where there is no clustered index on a table, this could be just as bad, because it will use the row pointer, which is 8 bytes per row.

6. Avoid cursors

A bit of a no-brainer. Cursors are less performant because every FETCH statement executed is equivalent to another SELECT statement execution that returns a single row. The optimiser can’t optimise a CURSOR statement, instead optimising the queries within each execution of the cursor loop, which is undesireable. Given that most CURSOR statements can be re-written using set logic, they should generally be avoided.

7. Archive old data

Another no-brainer, so I won’t say much. If you want to improve query performance, give the optimiser less work to do. If you can cut down the number of rows the query has deal with, then performace will improve. I have no problem with people creating audit triggers to move historical data into other tables for this reason. Alternatively, if you don’t need your data after a certain period of time, back up your database and remove the data.

8. Partition your data correctly

These days, you don’t actually have to move old data out of a table to improve query performance. You can partition your table into a number of data segments based on a partition function. The query optimiser can use the partition function to look at rows only on the most appropriate filegroup. To create partitions, you need a partition function and a partition scheme.

CREATE PARTITION FUNCTION myRangePartitionFunction(int)

AS RANGE RIGHT FOR VALUES (1,100,1000)

Once the partition function is created, you can then apply the function to a partition scheme for a table.

CREATE PARTITION SCHEME myRangePartitionScheme

AS PARTITION myRangePartitionFunction

TO (filegrp1, filegrp2, filegrp3, filegrp4)

Then it’s just a matter of creating the table to use the partition scheme on the column you decided to partition on:

CREATE TABLE mySchema.myPartitionTable(

col1 int,

col2 nvarchar(100)

)

ON myRangePartitionScheme(col1)

9. Remove user-defined inline scalar functions

Inline scalar functions are convenient if you want to return a single value, but at the expense of performance. They look somewhat like stored procedures, and they can be used in SQL statements. The problem is that they are not expanded and therefore not optimised into the query plan by the query optimiser. Bad news, because it turns a Seek into a Scan. Queries like this may appear to be performant in the Execution plans and also in the IO statistics, but when you run the query, it can perform really really badly. No seriously, really bad.

Here’s an example of what I’m talking about:

CREATE FUNCTION dbo.fnGetPostalCode(

@Suburb nvarchar(100),

@State nvarchar(10))

RETURNS int

BEGIN

RETURN ISNULL(

(

SELECT PostalCode

FROM dbo.PostalCode

WHERE Suburb = @Suburb

AND State = @State

), -1 );

END

The following statement will only perform a clustered index scan, not a seek, and on a big table this could seriously affect performance.

SELECT s.SalesPersonID,

s.SuburbName,

s.State,

dbo.fnGetPostalCode(s.SuburbName,s.State)

AS PostalCode

FROM dbo.SalesPerson

You can have a look at the details by clicking on SQL Server Management Studio’s Query menu, and selecting “Include Actual Execution Plan”

One way to get around this is to simply inline the underlying query from the function, as follows:

SELECT s.SalesPersonID, s.SuburbName, s.State,

ISNULL( (SELECT PostalCode

FROM dbo.PostalCode

WHERE Suburb = s.SuburbName

AND State = s.State), -1)

AS PostalCode

FROM dbo.SalesPerson

Inline the SQL statement will perform significantly better.

10. Use APPLY

The apply statement was created for the situation where you put multiple inline nested queries in the one statement. For example, take the following statement:

SELECT soh.SalesOrderID,

Quantity=(SELECT TOP 1 (Quantity)

FROM Sales.SalesOrderDetails

WHERE SalesOrderID = soh.SalesOrderID),

UnitPrice=(SELECT TOP 1 (UnitPrice)

FROM Sales.SalesOrderDetails

WHERE SalesOrderID = soh.SalesOrderID)

FROM Sales.SalesOrderHeader soh

This performs an extra query, retrieving data from another table using the same criterion. This can now be replaced with the following:

SELECT soh.SalesOrderID, soh.OrderDate, a.*

FROM Sales.SalesOrderHeader soh

CROSS APPLY (

SELECT TOP (1) sod.UnitPrice, sod.Quantity

FROM Sales.SalesOrderDetail sod

WHERE sod.SalesOrderId = soh.SalesOrderId

ORDER BY sod.Quantity DESC

) as a

11. Use computed columns

Computed columns are derived from other columns in a table. By creating and indexing a computed column, you can turn what would otherwise be a scan into a seek. For example, if you needed to calculate SalesPrice and you had a Quantity and UnitPrice column, multiplying them in the SQL inline would cause a table scan as it multiplied the two columns together for every single row. Create a computed column called SalesPrice, then index it, and the query optimiser will no longer need to retrieve the UnitPrice and Quantity data and do a calculation – it’s already done.

12. Use the correct transaction isolation level

If there are a lot of rows in your table, multiple concurrent requests to that table could cause contention if the correct transaction isolation level is not set. If requests are repeatedly blocked, it could be time to consider whether to change.

For example, READ UNCOMMITED is equivalent to dirty reads, or NOLOCK. That is, if a transaction is in the middle of processing and you read a row, the data may not be valid, especially if multiple inserts/updates are occuring that require atomicity. This is the most performant and it ignores locking altogether, but is generally not allowed by good design and is a special case.

With READ_COMMITTED_SNAPSHOT, it specifies that any data read by the transaction will be the transactionally consistent version of the data that existed at the start of the transaction. Internally, it makes a versioned copy of the data and this is placed in tempdb until the transaction has competed. Except when the database is being recovered, snapshot transactions do not request locks when reading data, and therefore do not block other transactions from writing data. Transactions writing data also do not block other transactions reading data.

There are various other types of transaction options, including REPEATABLE_READ and SERIALIZABLE amongst others that you can look at to determine whether they are appropriate for your needs.

Steps for Creating CLR Trigger

The following are required steps for creating a CLR-Trigger of DML (After trigger) type for Insert action. This database Object is executed as the result of a user action against a table i.e. for an INSERT statement.

 Creating a .NET class of triggering action

 Making assembly(.DLL) from that Class

 Enabling CLR environment in that database.

 Registering the assembly in SQL Server

 Creating CLR Trigger using that assembly

1. Creating a .NET class

Here we can use any managed language that is supported by .Net Framework such as C++, C#, VB, J#, JScript or XAML, etc. As I am with VB, this managed code is in Visual Basic. Let us discuss the objective of this entity. According to the above example of "tr_User_INSERT" trigger, we have the UserTable for holding the user details. There is a field "Type" which explains the user role (ADMIN, End User, Register User etc.). Our objective is to check the role of the inserted User for ADMIN type and then do the action as we set.

Open the notepad, copy the following codes and save it as MyFirstAssembly.vb.

Listing 2 (.NET Class of Trigger type)

Imports System.Data

Imports System.Data.SqlClient

Imports Microsoft.SqlServer.Server

Partial Public Class MyFirstClrTrigger

<Microsoft.SqlServer.Server.SqlTrigger(Name= "checkUserRole", Target= "UserTable", Event = "FOR INSERT")> _

Public Shared Sub checkUserRole()

Dim triggContext As SqlTriggerContext = SqlContext.TriggerContext()

Dim userType As String = String.Empty

If triggContext.TriggerAction = TriggerAction.Insert Then

Using connection As New SqlConnection("context connection=true")

connection.Open()

Dim sqlComm As New SqlCommand

Dim sqlPipe As SqlPipe = SqlContext.Pipe()

sqlComm.Connection = connection

sqlComm.CommandText = "SELECT Type from INSERTED"

userType = sqlComm.ExecuteScalar.ToString()

If userType.ToUpper = "ADMIN" Then

sqlPipe.Send("Hello !!! You have the Admin role.")

sqlPipe.Send("We can use e-mail codes here to inform.")

End If

End Using

End If

End Sub

End Class

Let us go into the codes. There are two major Namespaces used, System.Data.SqlClient and Microsoft.SqlServer.Server. Microsoft.SqlServer.Server provides the SqlTriggerAttribute Class, which is used to mark a method definition in an assembly as a trigger in SQL Server. The Sqltrigger attribute requires some parameters to set the created trigger properties, such as Name - the name of the Trigger, Target - the table or view to which the trigger applies (in case of DML type) and Event - the event to fire the trigger for (e.g. FOR INSERT, DELETE and/or UPDATE or INSTEAD OF etc.). Again, that method must be a Static (Shared in VB) one; here checkUserRole() is the target method.

The SqlTriggerContext class provides the required triggering properties of the Trigger for doing action. TriggerAction property of this Class is the global enumeration TriggerAction type of Microsoft.SqlServer.Server namespace which indicates what action fired the Trigger.

Here we use the INSERTED table, which is automatically created and managed by SQL Server 2005. This is used to set the conditions of DML trigger action. There is also another called DELETED, used in case of delete action. The CLR triggers can access the Inserted or Deleted tables through SqlCommand object using context connection.

2. Making assembly (.DLL)

Now we have created the MyFirstAssembly.vb file with MyFirstClrTrigger class, so we need to compile the class to create an assembly. Here I use the VB compiler "vbc.exe," found in .Net Framework library. Use the following DOS command for creating assembly.

Listing 3 (compile the .VB file)

C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727>vbc

/t:library /out:F:\CLR_Trigger_Test\CLR_Assembly\MyFirstAssembly

F:\CLR_Trigger_Test\MyFirstClrTrigger.vb

This command creates a DLL called MyFirstAssembly.dll at out path F:\CLR_Trigger_Test\CLR_Assembly\.

3. Enabling CLR environment

By default, the CLR functionality is trued off in SQL Server; so we need to enable it. Use the following T-SQL codes to enable the CLR functionality.

Listing 4 (enable CLR action)

------ Enabling CLR action in database -------

sp_configure 'clr enabled', 1

RECONFIGURE with Override

4. Registering the assembly

Our next step is to create an assembly in the Database, based on the compiled DLL (MyFirstAssembly.dll). The following T-SQL codes are useful regarding this objective.

Listing 5 (registering assembly)

------ Registering an Assembly -------

Create Assembly UserAssembly

From 'F:\CLR_Trigger_Test\CLR_Assembly\MyFirstAssembly.dll'

With Permission_Set=Safe

5. Creating CLR Trigger

Now we will create an extended Trigger using CREATE Trigger statement of T-SQL. There is a new clause named EXTERNAL NAME in SQL Server 2005, which allows us to reference a method of the Registered assembly. By doing so we set the triggering action of our Trigger using that managed code method of the assembly.

Listing 6 (creating a Trigger)

------ Creating a DML trigger in CLR -------

Create Trigger tr_User_CheckRole

on UserTable

For INSERT

External Name UserAssembly.MyFirstClrTrigger.checkUserRole

Here we use the checkUserRole() shared method of our MyFirstClrTrigger class of the registered assembly UserAssembly.

ASHIS KUMAR DAS

Welcome

Wednesday 24 August 2011

Twelve Tips For Optimizing Query Performance in SQL Server

How to create CLR Trigger

Important Links