Wednesday, October 3, 2007

Statistics Update

This is an important task in terms of database maintenance. This ensures the statistics are up-to-date which in turn would ensure that the query optimizer doesn’t land up with sub-optimal plans. The two ways to monitor for the need for an update on statistics are:

1. STATS_DATE function ( 2. rowmodctr value in the sysindexes (For SQL 2000) and sys.sysindexes (For SQL 2005) output

The STATS_DATE function would give you the date when the statistics were updated last. The rowmodctr value would tell you how many changes have taken place in terms of update, inserts and deletes in the data of the column the index is associated with.

The frequency of statistics update depends on the following:
1. Do BULK operations happen on the database tables in question?
2. Is there an off-peak period where a statistics update with full scan can be done?
3. How often is the data modified in the database tables in question?

Statistics update can be a resource intensive operation depending on the size of the table. If the data in the tables in question change very rarely, then a statistics update with a full scan can be done during a maintenance window. Statistics update is always an ONLINE operation and doesn’t cause the database to be in an OFFLINE mode.
If there are BULK operations happening in a table, then statistics have a tendency to getting skewed after the BULK operation. The best option is do perform a statistics update on this table if this tables is to be used by an application after the bulk operation.
If the database is updated very frequently and the database is very large in size, then it needs to be determined which tables are most frequently updated. Based on this, a statistics updated can be performed for only those tables and a statistics update can be done for the entire database with a lower sampling rate like 20-30% depending on what is suitable. This can be determined by comparing historical data and finding out what kind of sampling rate is suitable for your needs.
Another option is to enable AUTO UPDATE STATISTICS option for the database. But it needs to be monitored whether this is a boon or a bane. AUTO UPDATE STATISTICS can be good when the statistics are relatively updated and it makes sure that the statistics don’t fall too out of place. This feature has a downside when the statistics become out-dated too frequently, then you would have AUTO UPDATE STATISTICS being fired every time and this would cause all cached plans associated with the rows whose statistics have been updated to be recompiled. This can cause a serious bottleneck if there are too many auto update statistics events fired in short time span.

The following methods can be used to update the statistics:
1. UPDATE STATISTICS WITH FULLSCAN (or a lower sampling rate) Please refer the following article for further details:
2. sp_updatestats (
3. A maintenance plan or SQL Agent job to update the statistics

The following script for SQL Server 2005 would be helpful in determining how badly affected the statistics are for the index associated with it. The rowmodctr counts the total number of inserted, deleted, or updated rows since the last time statistics were updated for the table.

A DBCC DBREINDEX would update the statistics associated with those indexes. But this would not update the AUTO CREATED STATISTICS. The DBCC SHOW_STATISTICS ( command could also help you to determine the statistics condition for the particular index. The “Row Sampled” and “Rows” if equal would indicate that the sampling is currently 100%.

Statistics for INDEX 'pk_customers'.

Updated Rows Rows Sampled Steps Density Average key length -------------------- -------------------- -------------------- ------ ------------------------ ------------------------
Jun 23 2007 5:03PM 91 91 91 1.0989011E-2 10.0

The following articles could be helpful:

Statistics used by Query Optimizer in SQL Server 2005

No comments: