|
|
[ ABOUT THE AUTHOR ]
Database developers need to write stored procedures which are not only fully functional, but also which perform acceptably. As database servers use permanent storage media heavily (mainly because of ACID properties), which are known for slow performance, optimizing the stored procedures for performance is very important. This article concentrates on some of the counters used to measure performance and analyses methods of capturing these counters. This article is intended for database developers who write stored procedures and optimize for performance.
Currently there are three counters widely used for measuring performance of the system.
-
Execution time
-
CPU Cost
-
IO Cost
Execution Time
The most primitive method is to get the time taken to execute the query from SQL Server Management Studio (SSMS). The status bar displays the time taken in terms of hour, minute and second. This may be a measure used when a query takes a longer time (Usually more than 10 seconds, so that a 10% improvement to the query could be measured) and the time difference in sub seconds is insignificant. When a query executes within a second, SSMS rounds the value. Thus this value could not be used to get the value if the query executes within sub seconds.
Another method is to get the system time before and after the execution of the stored procedure and analyze the difference. SQL Server management studio could be used for this purpose, simply by adding print statements before and after the stored procedure.
PRINT CONVERT(varchar, GETDATE(), 114)
The developer could also use variables to hold the values and calculate the time difference.
The advantage over this method is it could be used within a query too. For a stored procedure to be executed multiple times with different parameter for each time, print statement could be injected between each execution to analyze the time difference. For a multi statement procedure, Developers need to modify the stored procedure adding print statement, but it gives better control over the data to the developers. On the disadvantages side, one of the major points is its disability over getting time taken for compilation. Instead of PRINT statements, the time could be inserted into a table for further analysis is required. As datetime data type as it allows the value to be accurate up to 3 milliseconds the results may have vary with actual up to 3 milliseconds, and better than the previous method.
SQL Server has some other methods too.
SQL Server has a Set option which could be used to display the time taken: SET STATISTICS TIME ON
The setting could be reversed by using SET STATISTICS TIME OFF statement after the query. When this set option is ON. SQL Server will return message which may look similar to:
SQL Server Execution Times: CPU time = 109 ms, elapsed time = 164 ms.
Elapsed time is the execution time of the query.
However, there is another time involved in this query: The time taken to compile the query. To view that, the statement should be issued when SET STATISTICS TIME ON Statement is already executed. When a batch of statements is submitted, SQL Server goes through ALL of them but compiles one by one. As GO is considered as batch separator, inserting a GO statement between the SET command and the query will make SQL Server to consider each statement as a batch and compile and execute them separately. Executing the set command first and executing the query (i.e. in two batches) will also do the trick.
When done the messages may be different.
SQL Server parse and compile time: CPU time = 62 ms, elapsed time = 72 ms.
(1139 row(s) affected)
SQL Server Execution Times: CPU time = 109 ms, elapsed time = 167 ms.
In this case, the time taken by the server for execution is 167 milliseconds. In addition to this, 72 milliseconds have been taken for compilation. The unit of measurement is in milliseconds and smaller figures will be rounded. When a multi statement stored procedure is called, the total time may slightly differ from the calculation. For this reason, SQL Server provides a summary too. Additionally, some of the internal operations like creating worktables, statistics on temporary tables etc. may create additional load but not be captured as individual statements. However, they will be added to the final cost.
- IF OBJECT_ID('dbo.bsp_GetBigSales_ByTerritoryGroup') IS NOT NULL
- DROP PROCEDURE dbo.bsp_GetBigSales_ByTerritoryGroup
- GO
- CREATE PROCEDURE dbo.bsp_GetBigSales_ByTerritoryGroup
- @Group nvarchar(100)
- AS
- CREATE TABLE #Territory
- (
- TerritoryID int NOT NULL,
- TerritoryName nvarchar(100)
- )
- INSERT INTO #Territory (
- TerritoryID,
- TerritoryName)
- SELECT T.TerritoryID,
- T.[Name]
- FROM Sales.SalesTerritory T
- WHERE T.[Group] = @Group
- SELECT T.TerritoryName,
- SH.ShipDate,
- SUM(SD.LineTotal) AS ProductValue
- FROM #Territory T
- INNER JOIN Sales.Customer C
- ON T.TerritoryID = C.TerritoryID
- INNER JOIN Sales.SalesOrderHeader SH
- ON SH.CustomerID = C.CustomerID
- INNER JOIN Sales.SalesOrderDetail SD
- ON SH.SalesOrderID = SD.SalesOrderID
- GROUP BY T.TerritoryName,
- SH.ShipDate
- ORDER BY SH.ShipDate
- DROP TABLE #Territory
- GO
- SET STATISTICS TIME ON
- GO
- EXEC bsp_GetBigSales_ByTerritoryGroup
- @Group = N'Europe'
- SET STATISTICS TIME OFF
- GO
The above stored procedure contains four SQL statements;
-
Creates a Temporary table (Lines 8 - 12)
-
Inserts Territory data into temporary table (Lines 14 - 20)
-
Selects the required data (Lines 22 - 34)
-
Drops the temporary table (Line 36)
When the stored procedure is executed it produces multiple lines of messages. The number of lines and values may vary based on several factors.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms. SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 8 ms.
(3 row(s) affected)
(1139 row(s) affected)
SQL Server Execution Times: CPU time = 93 ms, elapsed time = 549 ms.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 11 ms.
SQL Server Execution Times: CPU time = 93 ms, elapsed time = 568 ms.
The first two sets of messages are compilation related. As we have four SQL statements, the next four statements refer the time taken for execution. The fifth line specifies the total time taken for the execution of the stored procedure. Users can note that the sum of time on each statement exceeds the value on the last line.
When a query with single statement or a few statements is used, this figure may be acceptable to use. When a stored procedure with multiple statements is executed, it will return multiple lines of information and it is difficult to get the values for each statement. This becomes extremely difficult if the stored procedure contains a while loop. Additionally, it is difficult to export the value into a table or spreadsheet for further calculation unless string manipulation is used.
Another method is to use Profiler. In profiler, the events that could be used for getting duration are
Stored Procedures
* RPC: Completed * Sp: Completed * Sp: Stmt Completed
TSQL
* SQL: Batch Completed * SQL: Stmt Completed
The profiler could be set up to capture these events with these events by any user who has ALTER TRACE permission. Alternately, users could use a built-in template named Tuning too. Once set, Until it is stopped (or closed) it will capture the data. text data, start time and end time are additional columns which could be used. These two columns use SQL Server's date time data type and can hold data with 3/10 milliseconds accuracy. Users can include some more columns including CPU, Reads & Writes which is covered later in this article. To get maximum performance, the user should add filters on the application name, database ID (This ID could be obtained by executing the statement SELECT DB_ID() against the database or SELECT DB_ID('<databaseName>') against the server.), login name or host name (client computer which connects to the database server) could be used to get only the exact information needed. However, Profiler does not provide a mechanism to capture the time taken for compilation.
The duration in profiler is by default measured in milliseconds. However, SQL Server 2005 includes an option to measure the duration in microseconds. It is useful when examining statements which do not take much time but may being executed very frequently.
Running profiler from a client computer against a busy server is not recommended. It creates comparatively high volume of network traffic. Server side trace events could be used during the situation. It produces an output file which could be analyzed using profiler. Users can create the script for server side trace using Profiler from File -> Export menu
SQL Server 2005 provides feature rich dynamic management views and functions which displays the information on query execution time. sys.dm_exec_query_stats displays the information on elapsed time. In addition to elapsed time, this view displays time related to CLR code. The following code displays the details of cached executions for all code executed against AdventureWorks:
SELECT query.objectid ,SUBSTRING(text, stat.statement_start_offset/2, (CASE WHEN stat.statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(MAX),query.[text])) * 2 ELSE stat.statement_end_offset END - stat.statement_start_offset)/2) as query_text ,stat.execution_count ,stat.total_worker_time ,stat.last_worker_time ,stat.min_worker_time ,stat.max_worker_time ,stat.total_clr_time ,stat.last_clr_time ,stat.min_clr_time ,stat.max_clr_time ,stat.total_elapsed_time ,stat.last_elapsed_time ,stat.min_elapsed_time ,stat.max_elapsed_time FROM sys.dm_exec_query_stats stat CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS query WHERE [query].dbid = DB_ID('AdventureWorks')
Issues with Elapsed time
Elapsed time of a stored procedure may vary even in the same server at different times based on the number of operations in progress at that time, the resources it uses, number of pages cached and the parameters passed. Let's have a look on these reasons
-
A Change in resources available for query processing. This could happen when another process is running in parallel. It could be either some users running a resource intensive query, or SQL Server Agent may start certain jobs which may take some resources. Further, if the same computer could be used to run many services and a process running on a different service too could be the cause of the problem. However, developers can evaluate the stored when no other process is happening. Most of the jobs could be stopped temporarily. Creating such a situation in development environment is highly possible.
-
The plan would have been compiled during the first execution but reused during the second execution. Also, DBCC FREEPROCCACHE command could be used to remove the cached procedure information from memory. It will ensure the procedure is recompiled again and again and the same execution plan is not re-used. (The execution plan may be the same in terms of content, but the DBCC command ensures that it was recreated and not reused)
-
A check point process may occur in the middle of the executions which unload the data pages from memory, or some of the pages which were not available in the cache during the first execution and loaded into memory for the execution are used by the second execution. The check point process too could be controlled using command CHECKPOINT. DBCC DROPCLEANBUFFERS command could be issued to clean the buffers so that the data will always be read from the disk and not from memory.
-
Based on the values passed for the parameters, the number of data pages to be loaded may differ. They may cause even a recompile at certain statements within a stored procedure, and the execution plan may not be the same.
All the solutions provided above are good if the database is isolated for other users. But, if the developers are working against a centralized database server it may not be possible to run all tests during a time when no one else is working. In this case, running the same stored procedure for multiple times and analyzing the average time taken could be a good solution. Remember the application should issue the command multiple times for RPC: Completed event to capture the counters well.
CPU Cost
CPU cost will show how much of CPU cost is taken to execute the stored procedure. CPU is a valuable resource and taking more from CPU may make not only the database but the entire server system to be non-responsive. Adding a few cycles may not cost much high but, when many processes are doing that, the server may go into high CPU usage.
As seen before, SET STATISTICS TIME ON command displays the CPU time taken. Again, this option could not be used to get the information into a table or spread sheet application for further analysis. However, this refers the actual time taken for execution and not the estimated time calculated by the compiler
Again, Trace events (wither from server side or from client side using Profiler) could be used to capture the CPU usage. This is the actual usage during the execution and not the expected CPU time. The best events to capture CPU time are
Stored Procedures
* RPC: Completed * Sp: Stmt Completed
TSQL
* SQL: Batch Completed * SQL: Stmt Completed
SP: Completed event under stored procedure section cannot be used to capture CPU cost.
@@CPU_BUSY system global variable displays the time (in ticks) the CPU was busy since the last restart. It adds the time from all CPUs. When that number is multiplied by another global variable @@TIMETICKS it results the microseconds the CPU was busy. By selecting the values before and after the execution of the query, The CPU time taken by the query could be calculated. However, @@CPU_BUSY always return a whole number.
The set commands (SET SHOWPLAN_ALL ON, SET SHOWPLAN_XML ON and ET STATISTICS PROFILE ON) display the not the actual CPU cost, but the estimated cost along with other information. When the first two commands are issued, only the estimated execution plan is generated and the expected CPU cost (and expected IO cost) which the (actual) execution plan used will be displayed. Even though the last command does not prevent the code from being executed, it also displays the estimated cost from the plan used for execution.
The estimated costs do play a part when analyzing a query and optimizing it. However, it cannot be used to measure the performance of a query.
Windows Operating system comes with a tool called PerfMon.exe. The display name of the utility differs from operating system to operating system. In Windows XP and Windows 2003 it is named as "Performance". During the Windows 2000 days it had the name as "System Monitor". Even before, it was called as "Performance Monitor". This tool is mainly used to measure the activities happening at the server. Even though it has mechanism to monitor CPU usage Disk activity and Memory usage (and many more), it is not useful to monitor the resource usage as per a single query. Rather, it could be used to monitor the activities and the change when large column of transactions happen. This article excludes this tool from evaluation.
IO Cost
IO cost is one of the critical costs of the system. As discussed in the beginning, IO is one of the critical resources and unless IO cost is kept low, the performance of the stored procedure will go down. Cached pages (Cached page = 8kb of data loaded from hard disk into memory) may increase the performance by not reading the data from hard disk.
SET STATISTICS IO ON is one of the commands that could be used to identify the IO cost. Again, it has the same issue of, SET STATISTICS TIME ON as it prints the statement in the messages tab, and it is difficult to process. A typical output looks like this:
(2750 row(s) affected) Table 'SalesOrderHeader'. Scan count 1, logical reads 1403, physical reads 1, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-
The above lines stats quite a few information against this query which uses Sales.SalesOrderHeader:
-
A total of 1403 pages were read to complete the query (logical reads 1403).
-
Out of the above pages, only one page was read (physical reads 1) from disk and the rest were either already in memory or read into data cache by the read-ahead mechanism.
-
1 page was read into the data cache by the read-ahead mechanism.
-
The pages were read from disk through a single pass scan.
-
The rest refers to large objects and if there is no large object (like nvarchar(MAX) or text) involved in the query, the figures will show zero values.
In this example, even though logical scan shows as 1403, it does not mean the query has processed 14903 distinct pages. If a particular page is scanned twice, the local scan will consider it as two. To get clear picture of actual pages scanned, DBCC DROPCLEANBUFFERS should be executed (to make the pages loaded before to be dropped) and the sum of physical and read-ahead reads to be calculated.
These statistics related commands can be combined with Print statements to identify the cost involved in each statement. These are really useful to identify the IO operations and the cost of them.
Using profiler (or server side trace events) the read and write (of pages) occurred could be captured. It will differentiate the pages read and pages written separately, but not the pages already in memory and read from disk. All the events mentioned for CPU cost could be used to capture reads and writes.
The dynamic management view sys.dm_exec_query_stats also could be used to view IO Cost. Simply adding the following columns in the query specified to measure elapsed time will give the details:
-
total_physical_reads
-
last_physical_reads
-
min_physical_reads
-
max_physical_reads
-
total_logical_writes
-
last_logical_writes
-
min_logical_writes
-
max_logical_writes
-
total_logical_reads
-
last_logical_reads
-
min_logical_reads
-
max_logical_reads
This view does not separate read-ahead reads and LOB related reads/writes. However, this could be used as calculated statistics against a query.
The details of this view will be available until the cache is cleared. If a stored procedure is not used for some time it will be removed from memory. Until such time this view could be used to gather data.
In the development environment, it is better to clear the cache (both data cache and proc cache) before starting with any tests.
Additionally, @@IO_BUSY also be used to measure the IO time. Like the CPU counterpart this system variable displays the number of ticks which the IO was busy. By using the same calculation method used to calculate CPU time, the microseconds of IO time also could be calculated.
What is important
The importance of these counters may differ from environment to environment and based on the nature and time of operation. For example during busy hours for some organizations it may be more important to serve as many requests as possible rather than responding faster. (A job could be done faster if all resources are used; but it may lead to system hanging for some time or no response to other requests.) Additionally, performance is also works with the mindset of the end users (users complain only when their expectations are not met!) and the developers need to understand and meet these expectations. When writing the stored procedures, developers need to analyze the critical area in the system they are working with, and choose the importance to the counters appropriately.
Conclusion
This article describes different counters used for measuring performance and ways of calculating them. This article also described on the issues with those methods and ways of overcoming them. These counters are useful for measuring how a stored procedure behaves and to optimize them.
[ RATE this ARTICLE, and COMMENT on it ] [ ABOUT THE AUTHOR ]
|