wtorek, 17 maja 2011

When a database stuck

OS: AIX 5300-11
Oracle 10.2.0.4

I encountered interesting situation on one of the database.
I checked wait events and I was shocked what I saw.



Shit, Scheiße, Merda. But don`t panic. Let`s have a look into the alert.log


Fri May 13 15:02:07 2011
*************************************************************
Unable to allocate flashback log of 9197 blocks from
current recovery area of size 751619276800 bytes.
Recovery Writer (RVWR) is stuck until more space is
available in the recovery area.
Unable to write Flashback database log data because the
recovery area is full, presence of a guaranteed
restore point and no reusable flashback logs.
Use ALTER SYSTEM SET db_recovery_file_dest_size command
to add space. DO NOT manually remove flashback log files
to create space.
*************************************************************
Fri May 13 15:10:02 2011
ALTER SYSTEM ARCHIVE LOG
Fri May 13 15:10:02 2011
Thread 1 cannot allocate new log, sequence 41083
Checkpoint not complete
  Current log# 4 seq# 41082 mem# 0: /r550/oradata/XXX/redo04.log
  Current log# 4 seq# 41082 mem# 1: /r551/oradata/XXX/redo04.log
Fri May 13 17:59:31 2011
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=113
System State dumped to trace file /a550/app/oracle/admin/XXX/udump/XXX_ora_1310918.trc
Fri May 13 17:59:32 2011
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=99
Sat May 14 06:51:19 2011
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=115
System State dumped to trace file /a550/app/oracle/admin/XXX/udump/XXX_ora_1966228.trc

Hmm, it`s not good. It is monday morning and a last entry in the alert log is stamped on Sunday morning.
The database is really stuck.
It is a problem with the flash recovery area because the database cannot allocate a new flashback log.
Let`s check usage of the flash recovery area.


sys@XXX> select * from v$flash_recovery_area_usage;  

FILE_TYPE    PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE                   0                         0               0
ONLINELOG                     0                         0               0
ARCHIVELOG                    0                         0               0
BACKUPPIECE                 .17                         0               8
IMAGECOPY                     0                         0               0
FLASHBACKLOG              99.83                         0            8477


sys@XXX> show parameter db_recovery

NAME                        TYPE     VALUE
--------------------------- ------- -----------------------------
db_recovery_file_dest       string  /flashfs/XXX/XXX/flashback
db_recovery_file_dest_size          big integer 700G

It`s clear that flashback logs are occupying 700G space. But why ? Maybe a restore point ?

sys@XXX> select * from v$RESTORE_POINT;

       SCN DATABASE_INCARNATION# GUA STORAGE_SIZE TIME       NAME
---------- --------------------- --- ------------ ---------------------
 152048985                     3 YES   7.5066E+11 05-APR-11  START_0504

sys@XXX> select STORAGE_SIZE/1024/1024/1024 GB from v$RESTORE_POINT;

       GB
---------
699.53228

Nice, somebody created the guarantee restore point called START_0504 (more than one month ago) and the database has to keep 700G of the flashback logs.

After short discussion we decided to delete this restore point because we don`t need it anymore.

sys@XXX> DROP RESTORE POINT START_0504;

Let`s check the alert log.

Mon May 16 10:08:48 2011
Guaranteed restore point START_0504 dropped 

But the space was not reclaimed.

oraaead[XXX](/flashfs/XXX/XXX/flashback/XXX/flashback)$ du -gs .
699.60 .

oraaead[XXX](/flashfs/XXX/XXX/flashback/XXX/flashback)ll | wc -l          
    8485

Let`s decrease size of the flash recovery area.

sys@XXX> alter system set db_recovery_file_dest_size=10G;

System altered.

And after a short while I checked the file system again.

oraaead[aead2](/flashfs/XXX/XXX/flashback/XXX/flashback)$ du -gs .
8.70    .

oraaead[aead2](/flashfs/XXX/XXX/flashback/XXX/flashback)$ ll | wc -l
      65


700G of the flashback logs files were gone. Bye, bye and never come back.

sys@XXX> select * from v$flash_recovery_area_usage;
  
FILE_TYPE    PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE                   0                         0               0
ONLINELOG                     0                         0               0
ARCHIVELOG                    0                         0               0
BACKUPPIECE                1.16                         0               8
IMAGECOPY                     0                         0               0
FLASHBACKLOG                8.7                      7.86              62

If you create a guarantee restore point don`t forget to delete if you don`t need anymore.

Brak komentarzy:

Prześlij komentarz