Selecting column for index

As you must be knowing, indexing the column can boost the performance by leaps and bounds. But you can achieve this only if you index proper column.

The index work as follows:

1. You index a column.

2. Database creates a mapping for each value of that column to where it can find it row.

3. There can be more than one row for given value (index collision,which is generally searched sequentially)

4. When you are looking for a row with particular value, database searches the index to find out where it can get the actual row for that value.

5. Then reads the actual data from physical location.

Now this process can degrade the performance if most column values are similar. In that cases collision occurs frequently and the overhead of the index outperforms its benefits. One way to find out a column for indexing is calculate the 'selectivity' for it. In the simplest term it is the % of distinct keys present in the column.

You can use following query to find out the no of different keys for a table;

SELECT   column_name, num_distinct,
         num_distinct / (SELECT COUNT (*)
                           FROM employee) AS SELECTIVITY
    FROM user_tab_columns
   WHERE LOWER (table_name) = 'employee'
ORDER BY num_distinct DESC

 

COLUMN_NAME NUM_DISTINCT SELECTIVITY
     
EMP_ID 52754 0.999526327
EMP_NAME 52753 0.99950738
EMP_FULLNAME 51964 0.984558252
EMP_FIRSTNAME 29898 0.566475303
EMP_LASTNAME 23749 0.449970632

The closer the selectivity to 1 , better is the column to index.

If you have to index column with low selectivity, try to use composite index ( having more than one column). This might lead to a good selectivity composite index. e.g If Employee table contains Department Id, with

low selective , it can be combined with the project Id to get better selectivity.

If you are trying to index a string column and your search criteria for it is like '%pattern%'; the index may be ignored by the database. In such case if you can change search criteria to '%pattern' (you are searching for the values that 'ends with' given pattern); then you can use reverse index.

 

 

 

 

 

 

No comments: