restore point and tablespace level flashback
Last week, we had a application level migration rehearsal on billing test database. There would be huge changes on tables. Unfortunately there was no rollback plan on application side. So we decided to use creating restore point before migration. If migration failed we would back to this point. Before we set db_recovery_file_dest,db_recovery_file_dest_size parameter.
SQL> alter system set db_recovery_file_dest_size=4096G scope=both;
System altered.
SQL> alter system set db_recovery_file_dest=+RECO scope=both;
System altered.
SQL> create restore point before_rpe guarantee flashback database;
Restore point created.
So we started the run many sql script on database and after finish, we tried to test rollback scenario.
SQL> shutdown immediate
SQL> startup mount;
SQL> flashback database to restore point before_rpe;
ORA-38753: Cannot flashback data file 314; no flashback log data.
ORA-01110: data file 312: ‘+xxxxDATA00/xxxx/datafile/t_itusers.900.759343549′
ORA-38753: Cannot flashback data file 316; no flashback log data.
ORA-01110: data file 312: ‘+xxxxDATA00/xxxx/datafile/t_itusers.989.759343535′
So it gave the error and it said these datafiles sould be taken offline to run restore.
SQL> alter database datafile ‘+xxxxDATA00/bspr/datafile/t_itusers.900.759343549′ offline;
SQL> alter database datafile ‘+xxxxDATA00/bspr/datafile/t_itusers.989.759343535′ offline;
SQL> flashback database to restore point before_rpe;
Flashback complete.
Elapsed: 00:39:05.84
On alert log:
FLASHBACK DATABASE TO RESTORE POINT BEFORE_RPE
Sat Jan 28 15:14:25 2012
SUCCESS: diskgroup xxxxSSD00 was mounted
Sat Jan 28 15:14:25 2012
NOTE: dependency between database xxxx and diskgroup resource ora.xxxSSD00.dg is established
Sat Jan 28 15:14:35 2012
Flashback Restore Start
Sat Jan 28 15:22:26 2012
db_recovery_file_dest_size of 4194304 MB is 5.77% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Sat Jan 28 15:53:55 2012
Flashback Restore Complete
Flashback Media Recovery Start
started logmerger process
Sat Jan 28 15:54:14 2012
Parallel Media Recovery started with 96 slaves
Sat Jan 28 15:54:14 2012
Warning: Datafile 314 (+xxxxDATA00/xxxx/datafile/t_itusers.900.759343549) is offline during full database recovery and will not be recovered
Warning: Datafile 316 (+xxxxDATA00/xxxx/datafile/t_itusers.989.759343535) is offline during full database recovery and will not be recovered
Sat Jan 28 15:54:14 2012
Flashback Media Recovery Log +RECO/xxxx/archivelog/2012_01_28/thread_1_seq_97550.703.773716497
Sat Jan 28 15:54:15 2012
Incomplete Recovery applied until change 11164480619507 time 01/28/2012 01:08:08
Sat Jan 28 15:54:15 2012
Flashback Media Recovery Complete
Completed: FLASHBACK DATABASE TO RESTORE POINT BEFORE_RPE
SQL> select name from v$tablespace where flashback_on=’NO’;
NAME
——————————
T_ITUSERS
interesting point in here; When the database is open mode you can enable/disable database level flashback but only disable in tablespace level.
SQL> alter database flashback on;
Database altered.
SQL> alter database flashback off;
Database altered.
But you can disable tablespace level flashback but cannot enable it. You can do that on mount mode
SQL> alter tablespace T_ITUSERS flashback off;
Tablespace altered.
SQL> alter tablespace T_ITUSERS flashback on;
alter tablespace T_ITUSERS flashback on
*
ERROR at line 1:
ORA-01126: database must be mounted in this instance and not open in any
instance
Summary:
Be careful when you use restore point to sure that all tablespaceses are on flashback mode.
duplicate target database for standby backup location
We are installing standby database for DR side. One of our database is about 10TB and generating daily 500-700Gb archive logs.
Taking tape backups, duplicating them, sending to another city and then finishing restore takes 3-3.5 days long around.
in that period 1.5 TB archive had been generated. Sending these archive logs or taking incremental backup and send via ftp a little bit headache.
So we decided to take clone copy (we said BCV) of that database, take compresses full backup set, ftp to another city and restore.
On production side:
SQL> alter database begin backup
On clone side:
/SYMCLI/V7.1.2/bin/symclone -sid 78 -f /oracle/BCV/xxxpre0_BCV_XXX_devices recreate -noprompt
/SYMCLI/V7.1.2/bin/symclone -sid 78 -f /oracle/BCV/xxxpre0_BCV_XXX_devices activate -noprompt
On production side:
SQL> alter database end backup
On clone side:
SQL> startup mount;
So we can start taking backup.
[xxx:yyyy]@/oracle/dba/rman$ cat rman.cold.DB.full.sh
#!/usr/local/bin/bash
# TODO: Before running this script
# sqlplus /nolog
# conn / as sysdba
# shutdown immediate
# startup mount
# alter system set large_pool_size=1G;
export ORACLE_HOME=/oracle/product/11.2.0/db
export NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P9
export ORACLE_SID=DB
export NLS_DATE_FORMAT=”DD-MM-YYYY HH24:MI:SS”
rman target=/ nocatalog cmdfile=rman.cold.DB.full.rman log=rman.cold.DB.full.log
[xxx:yyyy]@/oracle/dba/rman$ cat rman.cold.DB.full.rman
configure controlfile autobackup on;
show all;
run{
allocate channel c1 type disk format ‘/fra/rman_backup/db_%p_%s_%T.bkp’;
allocate channel c2 type disk format ‘/fra/rman_backup/db_%p_%s_%T.bkp’;
#64 channel opened here
backup as compressed backupset database SECTION SIZE=48G TAG ‘DB_FULL_20120218′;
backup current controlfile TAG ‘DB_FULL_20120218_CTL’;
backup spfile TAG ‘DB_FULL_20120218_SPF’;
release channel c1;
release channel c2;
#64 channel released here
};
exit;
so backup was finished about 7 hours with 64 channel and 100% cpu usage with total size about 1.5TB
So we can ftp the backup files to DR side with 10 parallel ftp session taking about 8 hours.
Now we can start to restore that backup set for standby. we got initDB.ora parameter from prod and we did some changes on it.
SQL>startup nomount
[oracle@XXXdr00:DB]/oracle/dba/rman$ vi rman.restore.DB.rman
run
{
allocate auxiliary channel c1 type disk ;
allocate auxiliary channel c2 type disk ;
#64 channel was used
duplicate target database for standby
backup location=’/orabackup1/db_full/’
pfile=/oracle/product/11.2.0/db/dbs/initDB.ora
nofilenamecheck;
release channel c1;
release channel c2;
#64 channel was released.
};
exit
Restore operation had finished about 6 hours. So totally 23 hours standby db had been ready for applying archives;
SQL>alter database recover managed standby database disconnect from session;
So our stand by database was ready and sync with production about 6 hours later.
resumable_timeout
Today, one of our test guy said that their process was hang up and waited for a log time. Process is was very simple: parse file, do some calculation and insert oracle database.
when we look up with top command on unix side
Resources PID: 2034, rlh PPID: 29202 euid: 119 User:xxxx00
——————————————————————————–
CPU Usage (util): 0.0 Log Reads : na Wait Reason : JOBCL
User/Nice/RT CPU: 0.0 Log Writes: na Total RSS/VSS : 3.34gb/ 3.49gb
System CPU : 0.0 Phy Reads : 0 Traps / Vfaults: 0/ 0
we look explanation of wait reason JOBCL HP-UX guide says:
Definition of JOBCL as you know is waiting for tracing resume, debug resume, or job control start.
A background process incurs this block when attempting to write to a terminal set with “stty tostop”.
[oracle@server:DB]tusc -p 8723
( Attached to process 8723 (“rlh”) [32-bit] )
[8723] read(0, 0x7fff6948, 1) ………………………. [sleeping]
process is sleeping
it seems that process was stopped and wating for resume command. We looked at what happened on oracle side.
SQL>select s.sid
from v$process p, v$session s
where p.spid= 2034
and s.paddr=p.addr;
–1555
SQL>select event from v$session_wait sid=1555;
EVENT
—————————————————————-
statement suspended wait error to be cleared
“Statement Suspended, Wait Error To Be Cleared” Wait Event (Doc ID 761848.1)
So yes RESUMABLE_TIMEOUT initialization parameter is set to non zero value and when we looked at ASM diskgroup we see it was exhausted and no space on it.
After dropping some files, process could continue without problem.