Thursday, November 29, 2012

A Typical Case of SQL Tuning using 11g SQL Baseline



Running Swingbench on TEST Server found performance tanked.
Found this frequent execute SQL had a plan change and taking massive Temp space and IO

SQL> @check_sql_plan_change
Enter value for sql_id: 7hk2m2702ua0g
old   6: where sql_id = '&sql_id'
new   6: where sql_id = '7hk2m2702ua0g'

   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
         7      1 28-NOV-12 04.10.03.425 PM      7hk2m2702ua0g      3283867725       76,049         .005           20.4
         8      1 28-NOV-12 04.20.09.222 PM      7hk2m2702ua0g                      118,691         .005           20.4
         9      1 28-NOV-12 04.37.36.000 PM      7hk2m2702ua0g                       13,106         .014           20.4
        10      1 28-NOV-12 04.48.33.912 PM      7hk2m2702ua0g                       43,318         .007           20.4
        11      1 28-NOV-12 05.00.35.170 PM      7hk2m2702ua0g                      121,633         .004           20.3
        12      1 28-NOV-12 05.10.37.015 PM      7hk2m2702ua0g                       56,663         .004           20.3
       114      1 29-NOV-12 10.10.14.781 AM      7hk2m2702ua0g      2125236239        1,400       69.475        5,146.7
       115      1 29-NOV-12 10.20.15.130 AM      7hk2m2702ua0g                        6,284       65.791        7,295.8
       116      1 29-NOV-12 10.30.16.251 AM      7hk2m2702ua0g                          700       71.816       13,954.5

LOAD Plans to SQLSET from AWR

declare
baseline_ref_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
begin open
baseline_ref_cursor for select VALUE(p) from table(DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(1,132,NULL,NULL,NULL,NULL,NULL,NULL,NULL,'ALL')) p;
DBMS_SQLTUNE.LOAD_SQLSET('SWING', baseline_ref_cursor);
end;

LOAD Plans to Baseline from SQLSET

DECLARE 
  my_10gplans PLS_INTEGER;
BEGIN
  my_10gplans := DBMS_SPM.LOAD_PLANS_FROM_SQLSET( sqlset_name => 'SWING' );
END;

Checking loaded plans

SQL> SELECT sql_handle, plan_name, enabled, accepted , ELAPSED_TIME FROM   dba_sql_plan_baselines where PARSING_SCHEMA_NAME='SOE';

SQL_HANDLE                     PLAN_NAME                      ENA ACC ELAPSED_TIME
------------------------------ ------------------------------ --- --- ------------
SQL_7aa3962c9ae2db34           SQL_PLAN_7p8wq5kdf5qtn641e1b0c YES YES   2249855466
SQL_7aa3962c9ae2db34           SQL_PLAN_7p8wq5kdf5qtnc4b22a15 YES YES   6.5511E+11

Disable unwanted plan, enable and fix desired plan


DECLARE
     i NATURAL;
    BEGIN
      i := dbms_spm.alter_sql_plan_baseline('SQL_7aa3962c9ae2db34', attribute_name => 'enabled', attribute_value=>'NO');
      dbms_output.put_line(i);
    END;


SQL> set serveroutput on
SQL> DECLARE
  2   i NATURAL;
  3  BEGIN
  4    i := dbms_spm.alter_sql_plan_baseline('SQL_7aa3962c9ae2db34', 'SQL_PLAN_7p8wq5kdf5qtn641e1b0c', attribute_name => 'enabled', attribute_value=>'YES');
  5    dbms_output.put_line(i);
  6  END;
  7 
  8 
  9  /
PL/SQL procedure successfully completed.

SQL> SELECT sql_handle, plan_name, enabled, accepted , ELAPSED_TIME, fixed FROM   dba_sql_plan_baselines where PARSING_SCHEMA_NAME='SOE';

SQL_HANDLE                     PLAN_NAME                      ENA ACC ELAPSED_TIME FIX
------------------------------ ------------------------------ --- --- ------------ ---
SQL_7aa3962c9ae2db34           SQL_PLAN_7p8wq5kdf5qtn641e1b0c YES YES   2249855466 NO
SQL_7aa3962c9ae2db34           SQL_PLAN_7p8wq5kdf5qtnc4b22a15 NO  YES   6.5511E+11 NO

Now Swingbench using correct plan and perform as expected.

Thursday, October 25, 2012

AIX ASM Diskgroup reporting wrong Lun size


  We discovered this strange problem while trying to convert one of our big databases from file system to ASM. After we assigned a 1TB lun to AIX and created a diskgroup on top of it. ASM reporting wrong disk size on it. ASM only be able to detect around 100G out of 1TB.
  After a few poke around to make sure we didn't make any mistake provision the storage to the OS. I did some research on Metalink. And fair enough, I found this is a known bug on AIX ASM

Bug 9495887 AIX: ASM does not recognize correct diskgroup size for large disks

-------------------------------------------------------------------------------
Device         Size (GB)  Paths  Vol Name       Vol Id   XIV Id   XIV Host     
-------------------------------------------------------------------------------
/dev/hdisk4    1135.7     5/5    hedata16     113      7825812  heproddb102


NAME            PATH              GROUP_NUMBER   TOTAL_MB    FREE_MB READS WRITES 
--------------- ----------------- ------------ ---------- ---------- ----- ------ 
TEST_0000       /dev/rhdisk4                 5     101920     101800    60     10 

The workaround suggested was to create the disk group with specific size like following

SQL> create diskgroup DATA external redundancy disk '/dev/rhdisk4' size 1135G;

Frankly it's quite surprising we still hit such basis bugs on AIX ASM even after it was released 3 years. 1TB disk is hardly a large disk nowadays, I guess AIX is just such an unpopular OS for Oracle installation and have very little customer base. And I can totally understand why.