czwartek, 2 czerwca 2011

MIssing data in v$backup_async_io and v$backup_sync_io

OS: AIX 5300-06 
Oracle: 9.2.0.8



I hit a strange problem. I had a problem with a performance of RMAN backup and according to Oracle Metalink note
Known RMAN Performance Problems [ID 247611.1]

RMAN incremental backup performance degrades over time in NOCATALOG mode
Version affected : 8.1 9.0.1
Version fixed    : 8174 9014 9201
Affects          : Target database
Diagnosis        : Many entries in X$KSFQP
Workaround    : SQL> alter system set events 'immediate trace name ksfqp_limit level 1';

I executed on RMAN session

RMAN> sql "alter system set events ''immediate trace name ksfqp_limit level 1''";

sql statement: alter system set events ''immediate trace name ksfqp_limit level 1''
After that I wanted to check a backup performance using the views v$backup_async_io and 
v$backup_sync_io but the datas weren`t up to date.

SQL> select max(open_time),max(close_time) from v$backup_async_io;
MAX(OPEN_TIME)      MAX(CLOSE_TIME)
------------------- -------------------
2011-05-31 10:26:39 2011-05-31 10:26:39

SQL> select max(open_time),max(close_time) from v$backup_sync_io;
MAX(OPEN_TIME)      MAX(CLOSE_TIME)
------------------- -------------------
2011-05-31 10:26:39 2011-05-31 10:26:39
Where are the data after 2011-05-31 10:26:39 ?
Let`s check an alert.log

...
Tue May 31 10:26:38 2011
Starting control autobackup
Tue May 31 10:28:01 2011
Control autobackup written to SBT_TAPE device
        comment 'API Version 2.0,MMS Version 5.4.1.0',
        media '2'
        handle 'c-1601303025-20110531-26'
Tue May 31 10:31:19 2011
OS Pid: 2113700 executed alter system set events 'immediate trace name ksfqp_limit level 1'
Tue May 31 10:41:01 2011
Thread 1 advanced to log sequence 59271
  Current log# 3 seq# 59271 mem# 0: /r655/oradata/XXX/redo03.log
  Current log# 3 seq# 59271 mem# 1: /r656/oradata/XXX/redo03.log
Tue May 31 10:41:01 2011
ARCH: Evaluating archive   log 2 thread 1 sequence 59270
ARCH: Beginning to archive log 2 thread 1 sequence 59270
Creating archive destination LOG_ARCHIVE_DEST_1: '/a655/app/oracle/admin/XXX/arch/1_59270.arch'
Tue May 31 10:41:01 2011
ARC0: Evaluating archive   log 2 thread 1 sequence 59270
ARC0: Unable to archive log 2 thread 1 sequence 59270
      Log actively being archived by another process
Tue May 31 10:41:01 2011
ARCH: Completed archiving  log 2 thread 1 sequence 59270
Tue May 31 10:41:58 2011
Starting control autobackup
Tue May 31 10:43:41 2011 
...

Indeed at 10:31:19 I set an event and it was stamped in the alert log.
On the metalink I found the note: Data missing from V$backup_async_io [ID 277605.1]


Symptoms
After taking Rman backup you do not see any data populated in v$backup_aysnc_io.

Cause
You have set this event in init.ora or session level 
alter system set events 'immediate trace name ksfqp_limit level 1

Fix
This Event limits the maximum number of rows in v$backup_[a]sync_io table setting it to 1, 
maximum number of rows is 0; by setting it to 2, 
maximum number of rows is 2 by setting it to 3, 
maximum number of rows is 4 (ie 2 power of (3-1)) and so on.



I fix the problem by executing


SQL> alter system set events 'immediate trace name ksfqp_limit level 20';
System altered.

How to speed up a RMAN backup ?

OS: AIX 5300-06 
Oracle: 9.2.0.8


I had a problem with RMAN backup on one of the database.
The database has 20GB but RMAN backup takes about 1 hour during a night.
It is too long if I compare a size of the database.
RMAN uses Tivoli Data Protection for Oracle and goes to the tape.
I ran the backup manually with the tracing and the debugging options.



orapdvs[XXX](/aXXX/app/oracle/XXX)$ rman target / log=rman1.log debug trace=rman1_debug.trc


RMAN> run {
      allocate channel c1 type 'sbt_tape' DEBUG=5 TRACE=5 parms 
      'ENV=(TDPO_OPTFILE=/a655/app/oracle/etc/tdp/pdvs.opt)';
      backup incremental level 0 cumulative database 
      include current controlfile format 'DB_%u_%p_%c' TAG BARTEK;


In the rman1_debug.trc I found



...
 +3755  DBGRPC:   krmxrpc: xc=4568322960 starting longrunning RPC #28 to target: DBMS_BACKUP_RESTORE.BACKUPPIECECREATE
 +3756  DBGRPC:   krmxr: xc=4568322960 started long running rpc
 +3757  DBGRPC:   krmxpoq: xc=4568322960, action="0000027 FINISHED", col_l=16, ind=0, sid=20
 +3758  DBGRPC:   krmxr: callback returned TRUE, skipping sleep
 +3759  DBGRPC:   krmxpoq: xc=4568322960, action="0000028 STARTED", col_l=15, ind=0, sid=20
 +3760  DBGRPC:   krmxr: sleeping for 1 seconds
 +3761  DBGRPC:   krmxpoq: xc=4568322960, action="0000028 STARTED", col_l=15, ind=0, sid=20
 +3762  DBGRPC:   krmxr: sleeping for 2 seconds
 +3763  DBGRPC:   krmxpoq: xc=4568322960, action="0000028 STARTED", col_l=15, ind=0, sid=20
 +3764  DBGRPC:   krmxr: sleeping for 4 seconds
 +3765  DBGRPC:   krmxpoq: xc=4568322960, action="0000028 STARTED", col_l=15, ind=0, sid=20
 +3766  DBGRPC:   krmxr: sleeping for 8 seconds
 +3767  DBGRPC:   krmxpoq: xc=4568322960, action="0000028 STARTED", col_l=15, ind=0, sid=20
 +3768  DBGRPC:   krmxr: sleeping for 10 seconds
...


On the metalink I found the note: RMAN Hangs Using Tivoli TDP [ID 221212.1]
In this note Oracle writes about parameter called TDPO_NUM_BUFFERS.
In the IBM Redbook "Backing Up Oracle Using Tivoli Storage Management" is written:



Tivoli Data Protection for Oracle now includes an option to transfer Oracle 
data to the TSM server using multiple buffers to help alleviate network 
bottlenecks. 
This option specifies the number of buffers TDP for Oracle uses to support 
multi-buffering. By entering a value, multi buffering is activated.
We measured this options with runtimes of a full database backup with values 
of 2, 4, 8 and without this option. We achieved the best results with only 2 
buffers.


I decided to test using different configurations: without specifying TDPO_NUM_BUFFERS and with values from 2 to 5.
I ran rman incremental level 0 backup on one tape channel.
These results are very interesting.


TDPO_NUM_BUFFERS1st time 2nd time 3rd time
not specified 18 min
16 sec
17 min
25 sec
17 min
30 sec
 2 6 min
36 sec
5 min
26 sec
5 min
46 sec
 3 4 min
45 sec
4 min
36 sec
4 min
35 sec
 4 5 min
45 sec
6 min
46 sec
5 min
5 sec
 5 6 min
56 sec
5 min
16 sec
5 min
5 sec


Time of the backup was decreased about 3 times. Nice improvement.