Friday, September 4, 2015

Harm of Temporary Statistics and how to get rid of them

Those of you who hit query performance problems probably have heard a solution: "UPDATE STATISTICS".

Yep, sometimes it helps and you are starting to use it as a "Golden Hammer" and run "UPDATE STATISTICS" command on nightly basis.

Then your database grows and nightly time frame became too slow for full statistics update.
You start doing it on weekly basis.
Then you decrease percentage of the statistical sample.
Then you start rotating tables for statistical updates....
And you still have not enough time.

In addition to that, your SAN is working like crazy during that operation.

Will do a demonstration what causes this problem.

At first will run a profiler and capture execution of following command:
USE [AdventureWorks2014]
GO
UPDATE STATISTICS Person.Person;
GO
In my case it looks like this:
13918 IO reads.
Is it a lot for that table or not?


Let's look at list of temporary statistics for it:
USE [AdventureWorks2014]
GO
SELECT s.[name] AS [StatisticName]  
FROM sys.stats AS s  
WHERE s.object_id = object_id('Person.Person') and s.[name] like '[_]WA[_]Sys[_]%'  
ORDER BY s.[name];
GO

In my case I have 8 (eight) temporary statistics for that table!

I use following script to delete these temp stats, in your can you might have little bit different names.
USE [AdventureWorks2014];
GO
DROP STATISTICS Person.Person._WA_Sys_0000000D_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000009_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000008_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000006_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000005_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000004_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000003_693CA210;
DROP STATISTICS Person.Person._WA_Sys_00000002_693CA210;


Now will update statistics for that table and capture it by profiler again:


Isn't it nice? amount of IO, CPU time and duration decreased in about 3.5 times!

Immediate question: Where did I get all these crazy stats?
Immediate answer: Every time somebody runs SQL statement, which requires selection on a column SQL Server automatically creates Statistics on that column in case that column is not already have a statistic and AUTO_CREATE_STATISTICS option is ON for that database.


Do we still need to update all these useless statistics, which were used maybe once couple of years ago and since then we are spending our IO and CPU to update them?

You can definitely review what you use and what you don't, but when you are done, you can use following script to get rid of all these garbage stats you have in your database.

USE [AdventureWorks2014];
GO
DECLARE @SQLQuery NVARCHAR(MIN);
DECLARE SQLQueries CURSOR LOCAL STATIC FORWARD_ONLY
FOR
SELECT 'DROP STATISTICS [' + OBJECT_SCHEMA_NAME (s.object_id) + '].['     + OBJECT_NAME(s.object_id) + '].[' + s.[name] + '];'  
FROM sys.stats AS s
INNER JOIN sys.objects o ON s.object_id = o.object_id  
WHERE STATS_DATE(s.[object_id], s.[stats_id]) Is not Null and o.type = 'U'
         and s.[name] like '[_]WA[_]Sys[_]%';
OPEN SQLQueries;
FETCH NEXT FROM SQLQueries INTO @SQLQuery;  

WHILE (@@fetch_status <> -1)
BEGIN
    EXECUTE (@SQLQuery);
    FETCH NEXT FROM SQLQueries INTO @SQLQuery;
END

CLOSE SQLQueries;  
DEALLOCATE SQLQueries;

In case you want to restore some of these temporary statistics you can use that script:

USE [AdventureWorks2014];
GO
DECLARE @SQLQuery NVARCHAR(MIN);
DECLARE SQLQueries CURSOR LOCAL STATIC FORWARD_ONLY
FOR WITH ColumnList AS (
SELECT
    '[' + s.name + '].[' + t.name + ']' AS TableName,
    '[' + c.name + '] ' + CASE WHEN st.name = 'time' THEN '=''0:0''' WHEN st.name = 'uniqueidentifier' THEN '=NewID()'
    WHEN st.name = 'hierarchyid' THEN '=0x6AC0'
    WHEN st.name in ('geography', 'geometry') THEN '= geometry::STGeomFromText(''POINT(1 1)'', 4326);'
    WHEN st.name in ('datetime2', 'date', 'smalldatetime', 'datetime', 'sysname') THEN '=GetDate()'
    WHEN st.name in ('TEXT', 'varchar', 'nvarchar', 'nchar', 'sysname', 'ntext', 'char', 'sql_variant') THEN '=''0'''
    WHEN st.name in ('int', 'smallint', 'tinyint', 'real', 'money', 'float', 'bit', 'DECIMAL', 'NUMERIC', 'smallmoney', 'bigint', 'varbinary', 'binary', 'timestamp') THEN '=0'
    ELSE 'Is Null '
    END AS Where_Column
FROM sys.all_columns AS c
INNER JOIN sys.tables AS t ON c.object_id = t.object_id
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
INNER JOIN sys.types AS st ON st.user_type_id = c.system_type_id
WHERE st.name NOT IN ('xml')
)
SELECT DISTINCT 'DECLARE @r INT = (SELECT COUNT(*) FROM ' + TableName + ' WHERE' +
    SUBSTRING((SELECT ' AND ' + Where_Column FROM ColumnList AS i WHERE i.TableName = e.TableName FOR XML PATH('')),5,@@TEXTSIZE) + ');'
FROM ColumnList AS e;

OPEN SQLQueries;
FETCH NEXT FROM SQLQueries INTO @SQLQuery;

WHILE (@@fetch_status <> -1)
BEGIN
    PRINT @SQLQuery;
    EXECUTE (@SQLQuery);
    FETCH NEXT FROM SQLQueries INTO @SQLQuery;
END

CLOSE SQLQueries;
DEALLOCATE SQLQueries;
GO







No comments:

Post a Comment