=============
Index Advisor
=============

Index Advisor helps you determine which columns of your application tables
would benefit from B-tree indexes. Such indexes can reduce the execution
cost of the queries you expect to run against those tables.

Index Advisor comes pre-installed with Postgres Plus(R) Advanced Server.

Index Advisor works with Advanced Server's query planner by creating
"hypothetical indexes" for the query planner to use to calculate execution
costs if such indexes were available.

Index Advisor determines the hypothetical indexes based upon SQL queries
you supply. Generally, when using Index Advisor, you give the query in an
EXPLAIN statement. An EXPLAIN statement displays the query plan and
estimated execution cost for the supplied query, but does not run the query.

Index Advisor compares execution costs with and without the hypothetical
index. If the execution cost using the hypothetical index is less than the
execution cost without it, the plans with and without the hypothetical index
are reported in the EXPLAIN statement output, certain metrics to quantify
the improvement are calculated, and the CREATE INDEX statement needed to
actually create the index is generated.

If no hypothetical index can be found that reduces the execution cost,
only the "normal" query plan output of the EXPLAIN statement is displayed.

Note: Index Advisor does not create actual indexes on the tables. You
      must add any recommended indexes you wish to use with CREATE INDEX
      statements.

A SQL script is provided that you can run to create a table that Index
Advisor uses to log its recommendations so you can view them at any time
during or after the Index Advisor session. The script also creates a
function and a view of the table to simplify the retrieval and
interpretation of the results.

You can choose to forgo running the script, in which case the log table,
the function, and the view of the log table are not created. Index Advisor
then logs its results in a temporary table that is available only for the
duration of the session in which you are using Index Advisor.

There are two ways in which you can use Index Advisor to analyze SQL
queries in order to make indexing recommendations:

  - Run an Index Advisor utility program supplying it with a text file
    containing the SQL queries for which you want Index Advisor to make
    indexing recommendations. Index Advisor generates a text file with
    CREATE INDEX statements for the recommended indexes.

  - Use either the edb-psql or the psql command line terminal program to
    run SQL queries for which you want Index Advisor to make indexing
    recommendations.

Note: The SQL queries for which Index Advisor attempts to make indexing
      recommendations can be INSERT, UPDATE, and DELETE statements as well
      as SELECT statements.


==========
COMPONENTS
==========

The following Index Advisor components reside in your Postgres Plus Advanced 
Server home directory:

  - index_advisor.so (for Linux)
    index_advisor.dll (for Windows)

      Index Advisor plugin that interacts with the query planner to make
      indexing recommendations.

      The Index Advisor plugin is located in the libdir subdirectory
      of the Postgres Plus Advanced Server home directory.  Please note that
      libraries in the libdir directory can only be loaded by a superuser. 

      A database administrator can allow a non-superuser to use Index Advisor 
      by manually copying the index_advisor file from the libdir directory into 
      the libdir/plugins directory (under your Advanced Server home directory). 
      Only a trusted non-superuser should be allowed access to the plugin; this
      is an unsafe practice in a production environment.

  - pg_advise_index

      Utility program that reads a user-supplied input file containing SQL
      queries and produces a text file containing CREATE INDEX statements
      that can be used to create the indexes recommended by the Index
      Advisor.

      The pg_advise_index program is located in the bin subdirectory of the
      Postgres Plus Advanced Server home directory.

  - index_advisor.sql

      Script you can optionally run to create a permanent Index Advisor
      log table along with a function and view to facilitate reporting
      of recommendations from the log table.

      The script is located in the share/contrib subdirectory of the
      Postgres Plus Advanced Server home directory.

The following are the database objects used with Index Advisor. You must
run script index_advisor.sql to create these objects. If you choose to
create and use these database objects, they must be created in a schema
that is accessible by, and included in the search_path of the database
user name that will be used to run Index Advisor.

  - index_advisor_log

      Table in which indexing recommendations are logged. If you used
      script index_advisor.sql to create this table, and the table is
      accessible in the Index Advisor user's search_path, Index Advisor
      logs its recommendations in this table.

      If the index_advisor_log table cannot be found in the Index Advisor
      user's search_path, then Index Advisor creates a temporary table
      with this name to store its recommendations. This temporary table
      exists only for the duration of the current session.

  - show_index_recommendations

      PL/pgSQL function that interprets and displays the recommendations
      made during a specific Index Advisor session identified by its
      backend process ID.

  - index_recommendations

      View on the index_advisor_log table that produces output in the
      same format as the show_index_recommendations function, but for
      recommendations produced in all sessions. Also, it does not include
      the recommendations for tables that no longer exist.


=============
CONFIGURATION
=============

You have two options:

  a) You can begin using Index Advisor without performing any configuration
     steps in which case the Index Advisor recommendations are available
     only for the duration of the Index Advisor session, or

  b) You can perform the following configuration steps in which case the
     Index Advisor recommendations are saved and are available after any
     Index Advisor session.

If you wish to save your Index Advisor recommendations using option b,
perform the following steps:

1. Determine which schema you want to use to store the Index Advisor
   database objects. Whichever schema you choose must have the following
   characteristics:

   a) The schema must have USAGE privilege granted on it to users who will
      be running Index Advisor and querying the results.

   b) The schema must be included in the search_path of these users while
      they are utilizing Index Advisor.

2. Run script index_advisor.sql to create the database objects. The script
   creates them in the first available schema in the search_path of the
   current user, so before running the script, set search_path so that the
   schema chosen in Step 1 comes first.

3. If needed, grant privileges on the Index Advisor database objects to
   Index Advisor users.

     - Grant SELECT and INSERT privileges on table index_advisor_log.

     - Grant DELETE privilege on table index_advisor_log if you want to
       give the user the capability to delete the contents of the
       index_advisor_log table.

     - Grant SELECT privilege on view index_recommendations.

   Note: This step is not necessary if the Index Advisor user is a
         superuser, or if the Index Advisor user is the owner of these
         database objects.

The following example shows the creation of the Index Advisor database
objects in a schema named ia, which will then be accessible to an Index
Advisor user with user name ia_user:

  $ edb-psql -d edb -U enterprisedb
  edb-psql (<meta_installer_version>)
  Type "help" for help.

  edb=# CREATE SCHEMA ia;
  CREATE SCHEMA
  edb=# SET search_path TO ia;
  SET
  edb=# \i /opt/PostgresPlus/<version_number>/share/contrib/index_advisor.sql
  CREATE TABLE
  CREATE INDEX
  CREATE INDEX
  CREATE FUNCTION
  CREATE FUNCTION
  CREATE VIEW
  edb=# GRANT USAGE ON SCHEMA ia TO ia_user;
  GRANT
  edb=# GRANT SELECT, INSERT, DELETE ON index_advisor_log TO ia_user;
  GRANT
  edb=# GRANT SELECT ON index_recommendations TO ia_user;
  GRANT

Where:
   <meta_installer_version> is the version number of the Advanced Server
     installer.
   <version_number> is the Advanced Server version number (e.g., 9.3AS).
    
Schema ia must be included in the search_path of ia_user whenever Index
Advisor is utilized by ia_user.
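
One way to satisfy this requirement is sketched below. It assumes the ia
schema and the ia_user role from the example above; including the public
schema in the path is an assumption about which other schemas the user
needs. SET affects the current session only, while ALTER ROLE ... SET
changes the default for future sessions of that role.

```sql
-- Current session only: put schema ia first in the search path.
SET search_path TO ia, public;

-- Default for future sessions of ia_user (run as a superuser or as ia_user).
ALTER ROLE ia_user SET search_path TO ia, public;

-- Confirm the setting in effect.
SHOW search_path;
```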


=====
USAGE
=====

Index Advisor must be supplied with a workload, which is a set of queries
that is expected to be executed by the application.

After running Index Advisor with the workload, you then analyze the results
generated by Index Advisor.

Finally, you can create the recommended indexes using the CREATE INDEX
statements generated by Index Advisor.


Running Index Advisor
---------------------

There are two ways to run Index Advisor:

  - Using the pg_advise_index Utility

  - Using edb-psql or psql

Each method is discussed in the following sections.

Note: Do not run Index Advisor in read-only transactions. An error is
      thrown upon insert into the index_advisor_log table if run in a
      read-only transaction.
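
Before starting, you can check whether the session would run read-only.
The following sketch uses the standard PostgreSQL settings
transaction_read_only and default_transaction_read_only; both should
report off for Index Advisor to be able to log its recommendations.

```sql
-- Reports "on" inside a read-only transaction (for example, on a standby).
SHOW transaction_read_only;

-- Reports "on" if new transactions in this session default to read-only.
SHOW default_transaction_read_only;
```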

For the following examples, it is assumed that superuser enterprisedb
is the Index Advisor user, and the Index Advisor database objects have
been created in a schema in the search_path of superuser enterprisedb.

The table to be analyzed by Index Advisor is created as shown below:

  CREATE TABLE t( a INT, b INT );
  INSERT INTO t SELECT s, 99999 - s FROM generate_series(0,99999) AS s;
  ANALYZE t;

The table contains rows as follows:

     a   |   b
  -------+-------
       0 | 99999
       1 | 99998
       2 | 99997
       3 | 99996
         .
         .
         .
   99997 |     2
   99998 |     1
   99999 |     0


Using the pg_advise_index Utility
---------------------------------

1. Create a file containing the queries that are expected to be executed
   by the application. Each query must be terminated by a semicolon. The
   queries may be on the same line or on separate lines.

   The following example shows the contents of file workload.sql:

     SELECT * FROM t WHERE a = 500;
     SELECT * FROM t WHERE b < 1000;

   Note: In the file do not start the queries with the EXPLAIN keyword.

2. Run the pg_advise_index program as shown by the following example:

     $ pg_advise_index -d edb -h localhost -U enterprisedb -s 100M -o advisory.sql workload.sql
     poolsize = 102400 KB
     load workload from file 'workload.sql'
     Analyzing queries .. done.
     size = 2184 KB, benefit = 1684.720000
     size = 2184 KB, benefit = 1655.520000
     /* 1. t(a): size=2184 KB, benefit=1684.72 */
     /* 2. t(b): size=2184 KB, benefit=1655.52 */
     /* Total size = 4368KB */

   The information displayed by the pg_advise_index program is also logged
   in the index_advisor_log table if it exists in the user's search_path.

   Note: The options -d, -h, and -U are edb-psql connection options.

         -s is an optional parameter that limits the maximum size of an
         index that Index Advisor can recommend. If the output file is
         empty, -s might be set too low.

3. The recommended indexes are found in the file specified by the -o option.

   The following are the CREATE INDEX statements found in the output file
   advisory.sql:

     create index idx_t_1 on t (a);
     create index idx_t_2 on t (b);

   You can then use this file to create the indexes as shown by the
   following:

     $ edb-psql -d edb -h localhost -U enterprisedb -e -f advisory.sql
     create index idx_t_1 on t (a);
     CREATE INDEX
     create index idx_t_2 on t (b);
     CREATE INDEX


Using edb-psql or psql
----------------------

You can run Index Advisor from within either the edb-psql or psql program
as follows.

1. Connect to the server with edb-psql, and load the Index Advisor plugin:

        $ edb-psql -d edb -U enterprisedb
            .
            .
            .
        edb=# LOAD 'index_advisor';
        LOAD

2. Run the EXPLAIN statement for every query that you think will be
   executed by the application. Index Advisor stores the recommendations
   in the index_advisor_log table. If the table does not exist in the
   user's search_path, a temporary table is created with the same name.
   This temporary table exists only for the duration of the user's
   session.

   The EXPLAIN statement displays the normal query plan. If any
   hypothetical indexes are suggested, another plan using the hypothetical
   index is displayed below the normal plan as shown by the following
   examples:

     edb=# EXPLAIN SELECT * FROM t WHERE a < 10000;
                                               QUERY PLAN
     ----------------------------------------------------------------------------------------------
      Seq Scan on t  (cost=0.00..1693.00 rows=10105 width=8)
        Filter: (a < 10000)
      Result  (cost=0.00..337.10 rows=10105 width=8)
        One-Time Filter: '===[ HYPOTHETICAL PLAN ]==='::text
        ->  Index Scan using "<hypothetical-index>:1" on t  (cost=0.00..337.10 rows=10105 width=8)
              Index Cond: (a < 10000)
     (6 rows)


     edb=# EXPLAIN SELECT * FROM t WHERE a = 100;
                                            QUERY PLAN
     ----------------------------------------------------------------------------------------
      Seq Scan on t  (cost=0.00..1693.00 rows=1 width=8)
        Filter: (a = 100)
      Result  (cost=0.00..8.28 rows=1 width=8)
        One-Time Filter: '===[ HYPOTHETICAL PLAN ]==='::text
        ->  Index Scan using "<hypothetical-index>:3" on t  (cost=0.00..8.28 rows=1 width=8)
              Index Cond: (a = 100)
     (6 rows)

3. Once the Index Advisor plugin is loaded, Index Advisor analyzes all
   queries run in that session and logs the results, whether or not a
   query is prefixed with EXPLAIN.

   You can temporarily turn off Index Advisor while still connected to the
   edb-psql session by turning off the Grand Unified Configuration (GUC)
   parameter named index_advisor.enabled:

     edb=# SET index_advisor.enabled TO off;
     SET

   When this parameter is off, queries are no longer analyzed through the
   Index Advisor plugin, and no indexing recommendations are made nor
   logged. To re-enable Index Advisor set the parameter back on:

     edb=# SET index_advisor.enabled TO on;
     SET

   Whenever you start a new edb-psql session, index_advisor.enabled is on
   by default once the Index Advisor plugin is loaded. The Index Advisor
   plugin must be loaded in order to SET or SHOW index_advisor.enabled.

4. When you are done issuing the queries, and if you are still connected
   within the same edb-psql session, you can query the contents of the
   index_advisor_log table to see the results. If you created a permanent
   index_advisor_log table by running script index_advisor.sql, you can
   query the results after the current session as well.

   If you ran script index_advisor.sql, a few other objects are created for
   your convenience.

   For example, you can see the recommendations of a particular session by
   giving the process ID (pid) of the session in the following PL/pgSQL
   function using any of the forms shown below:

     SELECT show_index_recommendations( pid );

     SELECT show_index_recommendations( pg_backend_pid() ); -- Current session

     SELECT show_index_recommendations( null ); -- Current session

   The following is an example:

     edb=# SELECT show_index_recommendations(null);
                                      show_index_recommendations
     --------------------------------------------------------------------------------------------
      create index idx_t_a on t(a);/* size: 2184 KB, benefit: 3040.62, gain: 1.39222666981456 */
     (1 row)

   "create index ..." is the SQL statement needed to create the recommended
   index.

   For all the queries analyzed in the session, a set of metrics is
   calculated for each recommended index displayed by the fields labeled
   "size", "benefit", and "gain" as shown above.

   The meanings of fields size, benefit, and gain are discussed in the
   following section. Simply put, the larger the value of benefit and
   gain, the better the cost effectiveness of using the recommended index.

   If you do not know the process ID, you can display the results of all
   Index Advisor sessions from the following view:

     SELECT * FROM index_recommendations;

   The output of this statement is illustrated and discussed in the next
   section.
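
Steps 1 through 4 can be combined into a single short session. The
following sketch uses table t from the earlier examples and assumes that
script index_advisor.sql has been run so that show_index_recommendations
exists. The UPDATE statement is only an illustrative DML query; whether
Index Advisor recommends an index for it depends on the planner's cost
estimates, so no output is shown.

```sql
-- Load the plugin for this session.
LOAD 'index_advisor';

-- Confirm that analysis is enabled (on by default after LOAD).
SHOW index_advisor.enabled;

-- Analyze a SELECT and a DML statement without running them.
EXPLAIN SELECT * FROM t WHERE a < 10000;
EXPLAIN UPDATE t SET b = 0 WHERE a = 500;

-- Pause analysis for a query that should not be logged, then resume.
SET index_advisor.enabled TO off;
SELECT count(*) FROM t;
SET index_advisor.enabled TO on;

-- Review the recommendations made in the current session.
SELECT show_index_recommendations(null);
```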


Interpreting the Results
------------------------

There are several ways you can view the results generated by Index Advisor:

  - Query table index_advisor_log

  - Run function show_index_recommendations (illustrated above)

  - Query view index_recommendations

  Note: The above objects are created by running the index_advisor.sql script.

The following sections describe the first and third methods listed above
along with the meaning of the results.

Query Table index_advisor_log
-----------------------------

Index Advisor inserts its recommendations into a table named index_advisor_log.

Each row in the index_advisor_log table contains the result of a query where
Index Advisor determines it can recommend a hypothetical index to reduce
the execution cost of that query.

The table contains the following columns:

  Column       | Type      | Description
  -------------+-----------+-------------------------------------------------
  reloid       | oid       | OID of the base table for the index
  relname      | name      | Name of the base table for the index
  attrs        | integer[] | Column positions within the table of the index
  benefit      | real      | Calculated benefit of the index for this query
  index_size   | integer   | Estimated index size in disk-pages
  backend_pid  | integer   | ID of the process generating this recommendation
  timestamp    | timestamp | Date/Time when the recommendation was generated

The following example shows the index_advisor_log table resulting from two
sessions using Index Advisor, each session resulting in recommendations based
on two queries.

  edb=# SELECT * FROM index_advisor_log;
   reloid | relname | attrs | benefit | index_size | backend_pid |            timestamp
  --------+---------+-------+---------+------------+-------------+----------------------------------
    16651 | t       | {1}   | 1684.72 |       2184 |        3442 | 22-MAR-11 16:44:32.712638 -04:00
    16651 | t       | {2}   | 1655.52 |       2184 |        3442 | 22-MAR-11 16:44:32.759436 -04:00
    16651 | t       | {1}   |  1355.9 |       2184 |        3506 | 22-MAR-11 16:48:29.317016 -04:00
    16651 | t       | {1}   | 1684.72 |       2184 |        3506 | 22-MAR-11 16:51:45.927906 -04:00
  (4 rows)

Index Advisor inserted the first two rows resulting from the following two
queries run using the pg_advise_index utility:

  SELECT * FROM t WHERE a = 500;
  SELECT * FROM t WHERE b < 1000;

Note: The value of 3442 in column backend_pid identifies these results as
      coming from the same session with process ID 3442.

      The value of 1 in column attrs in the first row indicates that the
      hypothetical index is on the first column of the table (column a of
      table t).

      The value of 2 in column attrs in the second row indicates that the
      hypothetical index is on the second column of the table (column b of
      table t).
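
To translate the column positions in attrs into column names, you can join
the log table to the pg_attribute system catalog. The query below is an
illustrative sketch and is not part of index_advisor.sql; it relies only
on the columns listed above and on reloid still matching the current OID
of the table.

```sql
-- One output row per logged recommendation and indexed column.
SELECT l.backend_pid,
       l.relname,
       a.attname,          -- column name for each position in attrs
       l.benefit,
       l.index_size
FROM index_advisor_log AS l
JOIN pg_attribute AS a
  ON a.attrelid = l.reloid
 AND a.attnum = ANY (l.attrs)
ORDER BY l.backend_pid, l."timestamp";
```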

Index Advisor inserted the last two rows resulting from the following two
queries run in an edb-psql session:

   edb=# EXPLAIN SELECT * FROM t WHERE a < 10000;
                                             QUERY PLAN
   ----------------------------------------------------------------------------------------------
    Seq Scan on t  (cost=0.00..1693.00 rows=10105 width=8)
      Filter: (a < 10000)
    Result  (cost=0.00..337.10 rows=10105 width=8)
      One-Time Filter: '===[ HYPOTHETICAL PLAN ]==='::text
      ->  Index Scan using "<hypothetical-index>:1" on t  (cost=0.00..337.10 rows=10105 width=8)
            Index Cond: (a < 10000)
   (6 rows)

   edb=# EXPLAIN SELECT * FROM t WHERE a = 100;
                                          QUERY PLAN
   ----------------------------------------------------------------------------------------
    Seq Scan on t  (cost=0.00..1693.00 rows=1 width=8)
      Filter: (a = 100)
    Result  (cost=0.00..8.28 rows=1 width=8)
      One-Time Filter: '===[ HYPOTHETICAL PLAN ]==='::text
      ->  Index Scan using "<hypothetical-index>:3" on t  (cost=0.00..8.28 rows=1 width=8)
            Index Cond: (a = 100)
   (6 rows)

The values in the benefit column of table index_advisor_log are calculated
according to the following formula:

  benefit = (normal execution cost) - (execution cost with hypothetical index)

So for example, the value of the benefit column for the last row of table
index_advisor_log is calculated from the query plan of
"EXPLAIN SELECT * FROM t WHERE a = 100;" as follows:

  benefit = (Seq Scan on t cost) - (Index Scan using <hypothetical-index> cost)

  benefit = 1693.00 - 8.28

  benefit = 1684.72


Query View index_recommendations
--------------------------------

View index_recommendations shows the calculated metrics and the
CREATE INDEX statements to create the recommended indexes for all sessions
whose results are currently in the index_advisor_log table.

Note: You can delete rows from the index_advisor_log table at any time if
      you no longer have the need to view results of certain queries.
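
For example, the following statements sketch two ways to prune the log.
The process ID 3442 is taken from the sample data shown earlier, and the
30-day cutoff is an arbitrary assumption. The user running them needs the
DELETE privilege on index_advisor_log.

```sql
-- Remove all recommendations logged by one session.
DELETE FROM index_advisor_log WHERE backend_pid = 3442;

-- Remove recommendations older than 30 days.
DELETE FROM index_advisor_log
WHERE "timestamp" < now() - interval '30 days';
```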

Using the same content of the index_advisor_log table shown previously,
the index_recommendations view displays the following:

  edb=# SELECT * FROM index_recommendations;
   backend_pid |                                 show_index_recommendations
  -------------+---------------------------------------------------------------------------------------------
          3442 | create index idx_t_a on t(a);/* size: 2184 KB, benefit: 1684.72, gain: 0.771392654586624 */
          3442 | create index idx_t_b on t(b);/* size: 2184 KB, benefit: 1655.52, gain: 0.758021539820856 */
          3506 | create index idx_t_a on t(a);/* size: 2184 KB, benefit: 3040.62, gain: 1.39222666981456 */
  (3 rows)

Each row shows the metrics for a recommended index determined in a session
whose process ID is displayed in the backend_pid column.

The metrics for each recommended index are displayed by the following fields:

  create index - SQL statement to create the recommended index

  size         - Maximum estimated size of the recommended index

  benefit      - Calculated total benefit of using this index based upon
                 the sum of the individual benefit values across all queries
                 given within the session that benefit from this recommended
                 index

  gain         - Metric quantifying the total benefit to be gained by the
                 index tempered by the maximum possible size of this index
                 calculated across all queries given within the session that
                 benefit from this recommended index

Thus within each session, the results of all queries that benefit from the
same recommended index are combined to produce one set of metrics per
recommended index reflected in the fields named benefit and gain.

The formulas for the fields are as follows:

  size    = MAX(index size of all queries)

  benefit = SUM(benefit of each query)

  gain    = SUM(benefit of each query) / MAX(index size of all queries)

So for example, using the following query results from the process with
backend_pid 3506:

   reloid | relname | attrs | benefit | index_size | backend_pid |            timestamp
  --------+---------+-------+---------+------------+-------------+----------------------------------
    16651 | t       | {1}   |  1355.9 |       2184 |        3506 | 22-MAR-11 16:48:29.317016 -04:00
    16651 | t       | {1}   | 1684.72 |       2184 |        3506 | 22-MAR-11 16:51:45.927906 -04:00


The metrics displayed from view index_recommendations for backend_pid 3506
are the following:

   backend_pid |                                 show_index_recommendations
  -------------+---------------------------------------------------------------------------------------------
          3506 | create index idx_t_a on t(a);/* size: 2184 KB, benefit: 3040.62, gain: 1.39222666981456 */


The metrics from the view are calculated as follows:

  benefit = (benefit from 1st query) + (benefit from 2nd query)  

  benefit = 1355.9 + 1684.72

  benefit = 3040.62

    and

  gain = ((benefit from 1st query) + (benefit from 2nd query)) / MAX(index size of all queries)

  gain = (1355.9 + 1684.72) / MAX(2184, 2184)

  gain = 3040.62 / 2184

  gain = 1.39223

The gain metric is useful for comparing the relative advantage of the
different recommended indexes derived during a given session. The larger
the gain value, the better the cost effectiveness derived from the index
weighed against the possible disk space consumption of the index.
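
The same metrics can be reproduced directly from the log table with an
aggregate query. The following is a sketch based on the formulas above,
not the actual definition of the index_recommendations view; unlike the
view, it does not filter out tables that no longer exist.

```sql
-- One row per session and recommended index, best gain first.
SELECT backend_pid,
       relname,
       attrs,
       MAX(index_size)                AS size,
       SUM(benefit)                   AS benefit,
       SUM(benefit) / MAX(index_size) AS gain
FROM index_advisor_log
GROUP BY backend_pid, relname, attrs
ORDER BY backend_pid, gain DESC;
```

For the sample data, the row for backend_pid 3506 combines the two logged
benefits (1355.9 and 1684.72) over an index size of 2184, matching the
worked calculation above up to floating-point rounding.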


============
RESTRICTIONS
============

1. Index Advisor does not consider Index Only scans; it does consider Index
   scans when making recommendations.

2. If the query's WHERE clause applies a transformation (such as a function
   or an expression) to the column values, the transformation is not
   considered when finding candidate indexes. As a result, a recommended
   index is never an expression index; each indexed field is a simple
   column name.

3. Index Advisor does not consider if the table has child tables resulting
   from the usage of table inheritance. If a query references the parent
   table, Index Advisor does not make any index recommendations on the
   child tables.

4. Restoration of a pg_dump backup file that includes the index_advisor_log
   table or any tables for which indexing recommendations were made and
   stored in the index_advisor_log table, may result in "broken links"
   between the index_advisor_log table and the restored tables referenced
   by rows in the index_advisor_log table.

   This occurs if tables referenced in the index_advisor_log table are
   recreated as part of the restore process (for example, if the backup is
   restored to a new database).

   The recreated tables are assigned new object identifiers (OIDs), but
   the OIDs that reference these tables in the rows of the
   index_advisor_log table are not updated with the new OIDs. As a result,
   the show_index_recommendations function and the index_recommendations
   view return no rows when queried.

   If it is necessary to display the recommendations made prior to the
   backup, you can replace the old OIDs in the reloid column of the
   index_advisor_log table with the new OIDs of the referenced tables
   using the SQL UPDATE statement:

     UPDATE index_advisor_log SET reloid = new_oid WHERE reloid = old_oid;
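
The new OID of a table can be looked up by casting its name to regclass.
The sketch below assumes that the restored tables keep their original
names and are visible in the current search_path; t is the table from the
earlier examples. Consider saving a copy of index_advisor_log before
running a bulk update.

```sql
-- Look up the current OID of one table.
SELECT 't'::regclass::oid;

-- Refresh every stale reloid by matching on the table name.
UPDATE index_advisor_log AS l
SET reloid = c.oid
FROM pg_class AS c
WHERE c.relname = l.relname
  AND c.relkind = 'r'              -- ordinary tables only
  AND pg_table_is_visible(c.oid)   -- resolve names through search_path
  AND l.reloid <> c.oid;
```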


For more information about Index Advisor, please see the Postgres Plus Advanced
Server Performance and Scalability Guide, available from the EnterpriseDB website 
at:

    http://www.enterprisedb.com/products-services-training/products/documentation


-------------------------------------------
Copyright (c) 2011-2013 EnterpriseDB Corporation
