DB2 relies on a cost-based optimizer to choose the access path for each query. Many factors influence the optimizer's decision: system configuration (buffer pool sizes, sort heaps, CPU parallelism, I/O characteristics) as well as indexes and constraints, to name just a few. Catalog statistics play a major role in influencing the optimizer, so it's no surprise that a lack of statistics can be detrimental.
So here are a few things to examine before going down the path of hints or optimization profiles.
I would examine both the type and level of RUNSTATS being executed, and review the registry variables that may affect optimization.
DB2 uses several flavors of statistics when selecting an access path:
basic statistics: number of rows, pages, and active blocks in tables; number of data values and length of data values for a column; data range information.
non-uniform distribution statistics: frequent values and quantiles for a column; important when data values are not evenly distributed.
column group statistics: number of distinct values in a group of columns; important for detecting correlation.
detailed index statistics: clustering details; used to estimate data page fetches and to model I/O against buffer pool size.
user-defined function statistics: UDF statistics that specify I/O and CPU costs.
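As a rough sketch of how the first four flavors map onto RUNSTATS clauses, the commands below could be run from a CLP script (db2 -tvf). The table usera.tablea and the columns col1 and col2 are placeholder names, and the exact clause combinations are worth checking against the RUNSTATS syntax for your release.

```sql
-- Basic table and column statistics only
RUNSTATS ON TABLE usera.tablea ON ALL COLUMNS;

-- Add distribution (frequent value and quantile) statistics
-- and detailed index statistics
RUNSTATS ON TABLE usera.tablea WITH DISTRIBUTION AND DETAILED INDEXES ALL;

-- Everything at once, including a column group on (col1, col2)
RUNSTATS ON TABLE usera.tablea
  ON ALL COLUMNS AND COLUMNS ((col1, col2))
  WITH DISTRIBUTION
  AND DETAILED INDEXES ALL;
```

The more detailed levels take longer to collect, so it is usually worth adding them only for the tables and indexes involved in the problem queries.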
These statistics are stored in the system catalog and can be updated manually if necessary. One reason you may want to update statistics is to replicate production statistics in a non-production environment. The db2look utility makes this easy: its -m (mimic) option extracts the statistics as UPDATE statements that can be run against another database. This article covers the use of that utility in more detail:
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0508kapoor/
So the type and level of statistics play a major role in optimization.
One common reason DB2 does not choose the best access path is simply that we have not supplied it with enough information, or a sufficient level of statistics, to make the right choice.
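A quick way to see what the optimizer currently has to work with is to query the catalog views. This is a minimal sketch; the schema USERA and table TABLEA are placeholder names.

```sql
-- Table-level statistics. CARD = -1 means statistics have never
-- been collected for the table; STATS_TIME shows the last collection.
SELECT TABNAME, CARD, NPAGES, FPAGES, STATS_TIME
  FROM SYSCAT.TABLES
 WHERE TABSCHEMA = 'USERA'
   AND TYPE = 'T'
 ORDER BY TABNAME;

-- Index-level statistics. With detailed index statistics,
-- CLUSTERFACTOR is populated and CLUSTERRATIO is -1;
-- without them it is the other way around.
SELECT INDNAME, NLEAF, NLEVELS, FULLKEYCARD, CLUSTERRATIO, CLUSTERFACTOR
  FROM SYSCAT.INDEXES
 WHERE TABSCHEMA = 'USERA'
   AND TABNAME = 'TABLEA';
```

Tables that show CARD = -1, or a STATS_TIME that predates a large data load, are the first candidates for a fresh RUNSTATS.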
There are some newer options available on the runstats utility to improve performance by providing more specific information. Let's take a look at one I've found to be helpful.
Column group statistics. Column group statistics provide information used to detect correlation.
Without this information DB2 assumes the columns are independent and may underestimate the join cardinality. Column group statistics are particularly useful when a join uses several equi-join predicates. For example:
where tablea.col1 = tableb.col1 and tablea.col2 = tableb.col2
An example RUNSTATS command might be:
RUNSTATS ON TABLE usera.tablea
  ON ALL COLUMNS              <-- gathers basic column information
  AND COLUMNS ((col1, col2))  <-- gathers column group information
Column group information is stored in SYSCOLGROUPS and SYSCOLGROUPSCOLS.
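To check whether a column group actually reveals correlation, you can compare the per-column distinct counts with the distinct count of the combination. The sketch below assumes the catalog views over those tables are SYSCAT.COLGROUPS and SYSCAT.COLGROUPCOLS, and reuses the placeholder names usera.tablea, col1 and col2.

```sql
-- Distinct values per column; under the independence assumption
-- the optimizer effectively multiplies these together.
SELECT COLNAME, COLCARD
  FROM SYSCAT.COLUMNS
 WHERE TABSCHEMA = 'USERA'
   AND TABNAME = 'TABLEA'
   AND COLNAME IN ('COL1', 'COL2');

-- Distinct values of the combination, as gathered by
-- RUNSTATS ... AND COLUMNS ((col1, col2))
SELECT g.COLGROUPID, g.COLGROUPCARD, c.COLNAME, c.ORDINAL
  FROM SYSCAT.COLGROUPS g
  JOIN SYSCAT.COLGROUPCOLS c
    ON c.COLGROUPID = g.COLGROUPID
 WHERE c.TABSCHEMA = 'USERA'
   AND c.TABNAME = 'TABLEA'
 ORDER BY g.COLGROUPID, c.ORDINAL;
```

As an illustration with made-up numbers: if col1 has 100 distinct values and col2 has 1,000, independence implies up to 100,000 combinations. If COLGROUPCARD reports only 1,000, the columns are strongly correlated, and without the column group the join estimate could be too low by roughly a factor of 100.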
Now let's take a look at some of the registry variables that can affect optimization. You can find detailed information on these registry variables in the DB2 Information Center:
http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r0005664.htm
The default query optimization class configuration parameter (DFT_QUERYOPT), as well as several registry variables, affects the access plan chosen. For example, DB2_REDUCED_OPTIMIZATION=UNIQUEINDEX is a registry variable setting that will be the default starting in V9.5. It instructs the optimizer to use a fully qualified unique index to access a table.
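Before changing DFT_QUERYOPT for the whole database, it can be worth trying a different optimization class for a single session and comparing the explain output. A minimal sketch, where mydb is a placeholder database name and class 3 is only an example value:

```sql
-- Show the optimization class in effect for this connection
VALUES CURRENT QUERY OPTIMIZATION;

-- Try a different class for this session only
SET CURRENT QUERY OPTIMIZATION = 3;

-- Change the database-wide default once you are happy with the result
UPDATE DB CFG FOR mydb USING DFT_QUERYOPT 5;
```

The session-level setting affects dynamic SQL prepared after it is issued, so re-run the explain on the troubled statement after each change.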
So, you feel you have exhausted all other options and are wondering how to implement a hint.
1) Set the registry variable DB2_OPTPROFILE=YES (this is needed in V8.2 and is default in V9)
2) Create the SYSTOOLS.OPT_PROFILE table to house the profile
3) Import the profile into the table. For example:
IMPORT FROM file1.del OF DEL MODIFIED BY LOBSINFILE INSERT INTO SYSTOOLS.OPT_PROFILE;
4) Test your troubled SQL and ensure it is now taking the access path you want.
db2 connect to
db2 set current optimization profile="
db2 set current explain mode explain;
db2 -tvf query.txt
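For step 2, the table definition below follows the layout I recall from the DB2 9 documentation; treat it as a sketch and verify the column sizes against your release. The schema, profile and file names shown in the comments are placeholders.

```sql
-- Table that holds one optimization profile per (SCHEMA, NAME) pair
CREATE TABLE SYSTOOLS.OPT_PROFILE (
  SCHEMA  VARCHAR(128) NOT NULL,
  NAME    VARCHAR(128) NOT NULL,
  PROFILE BLOB (2M)    NOT NULL,
  PRIMARY KEY (SCHEMA, NAME)
);

-- The delimited file imported in step 3 has one row per profile:
-- schema, profile name, and the name of the XML file that
-- LOBSINFILE loads into the PROFILE column, for example:
--   "USERA","PROFILE1","profile1.xml"
```

The XML file named in the third field contains the guidelines themselves; its format is covered in the references that follow.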
There are a couple of good places to get detailed information about optimization profiles and statistical views. If you need more detail, I'd suggest the following:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0612chen/index.html
http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/t0024532.htm
Optimization profiles can be used in V8.2 and above. If you are still using DB2 LUW 8.1, where optimization profiles are not available, you may want to investigate the use of the "SELECTIVITY" clause. You can find more information about using selectivity clauses on SQL statements in the following developerWorks article:
http://www.ibm.com/developerworks/db2/library/tips/dm-0312yip/index.html