Wednesday, May 6, 2009

Understanding the SCN(System Change Number) in Oracle

Understanding the SCN

In order to understand how Oracle performs recovery, it’s first necessary to understand Oracle’s SCN in terms of the various places where it can be stored and how it’s used for instance and media recovery.

The SCN is an internal number maintained by the database management system (DBMS) to log changes made to a database. The SCN increases over time as changes are made to the database by Structured Query Language (SQL). By understanding how the SCN is used, you can understand how Oracle recovery works. Oracle9i enables you to examine the current SCN using the following SQL:

SQL> select dbms_flashback.get_system_change_number from dual;

Whenever an application commits a transaction, the log writer process (LGWR) writes records from the redo log buffers in the System Global Area (SGA) to the online redo logs on disk. LGWR also writes the transaction’s SCN to the online redo log file. The success of this atomic write event determines whether your transaction succeeds, and it requires a synchronous (wait-until-completed) write to disk.


Note: The need for a synchronous write upon commit is one of the reasons why the online redo log can become a bottleneck for applications and why you should commit as infrequently as is practical. In general, Oracle writes asynchronously to the database datafiles for performance reasons, but commits require a synchronous write because they must be guaranteed at the time they occur.

SCN and Checkpoints:

A checkpoint occurs when all modified database buffers in the Oracle SGA are written out to datafiles by the database writer (DBWn) process. The checkpoint process (CKPT) updates all datafiles and control files with the SCN at the time of the checkpoint and signals DBWn to write out the blocks. A successful checkpoint guarantees that all database changes up to the checkpoint SCN have been recorded in the datafiles. As a result, only those changes made after the checkpoint need to be applied during recovery. Checkpoints occur automatically as follows:

  • Whenever a redo log switch takes place
  • Whenever the time set by the LOG_CHECKPOINT_TIMEOUT initialization parameter is reached
  • Whenever the amount of redo written reaches the number of bytes associated with the LOG_CHECKPOINT_INTERVAL

Typically, LOG_CHECKPOINT_INTERVAL is chosen so that checkpoints only occur on log switches. Oracle stores the SCN associated with the checkpoint in four places: three of them in the control file and one in the datafile header for each datafile.

The System Checkpoint SCN:

After a checkpoint completes, Oracle stores the system checkpoint SCN in the control file. You can access the checkpoint SCN using the following SQL:

SQL> select checkpoint_change# from v$database;


CHECKPOINT_CHANGE#
--------------------
292767

The Datafile Checkpoint SCN:

After a checkpoint completes, Oracle stores the SCN individually in the control file for each datafile. The following SQL shows the datafile checkpoint SCN for a single datafile in the control file:

SQL> select name,checkpoint_change# from v$datafile where name like '%users01%';

NAME CHECKPOINT_CHANGE#
----------------------------------- --------------------
/u02/oradata/OMFD1/users01.dbf 292767

The Start SCN:

Oracle stores the checkpoint SCN value in the header of each datafile. This is referred to as the start SCN because it is used at instance startup time to check if recovery is required. The following SQL shows the checkpoint SCN in the datafile header for a single datafile:

SQL> select name,checkpoint_change# from v$datafile_header where name like '%users01%';

NAME CHECKPOINT_CHANGE#
----------------------------------- --------------------
/u02/oradata/OMFD1/users01.dbf 292767

The Stop SCN:

The stop SCN is held in the control file for each datafile. The following SQL shows the stop SCN for a single datafile when the database is open for normal use:

SQL> select name,last_change# from v$datafile where name like '%users01%';

NAME LAST_CHANGE#
----------------------------------- ------------
/u02/oradata/OMFD1/users01.dbf

During normal database operation, the stop SCN is NULL for all datafiles that are online in read-write mode. SCN Values while the Database Is Up Following a checkpoint while the database is up and open for use, the system checkpoint in the control file, the datafile checkpoint SCN in the control file, and the start SCN in each datafile header all match. The stop SCN for each datafile in the control file is NULL. SCN after a Clean Shutdown After a clean database shutdown resulting from a SHUTDOWN IMMEDIATE or SHUTDOWN NORMAL of the database, followed by STARTUP MOUNT, the previous queries on v$database and v$datafile return the following:

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
--------------------
293184

SQL> select name,checkpoint_change#,last_change# from v$datafile where name like '%user%';

NAME CHECKPOINT_CHANGE# LAST_CHANGE#
----------------------------------- -------------------- --------------
/u02/oradata/OMFD1/users01.dbf 293184 293184


SQL> select name,checkpoint_change# from v$datafile_header where name like '%users01%';

NAME CHECKPOINT_CHANGE#
----------------------------------- --------------------
/u02/oradata/OMFD1/users01.dbf 293184

During a clean shutdown, a checkpoint is performed and the stop SCN for each datafile is set to the start SCN from the datafile header. Upon startup, Oracle checks the start SCN in the file header with the datafile checkpoint SCN. If they match, Oracle checks the start SCN in the datafile header with the datafile stop SCN in the control file. If they match, the database can be opened because all block changes have been applied, no changes were lost on shutdown, and therefore no recovery is required on startup. After the database is opened, the datafile stop SCN in the control file once again changes to NULL to indicate that the datafile is open for normal use.

SCN after an Instance Crash

The previous example showed the behavior of the SCN after a clean shutdown. To demonstrate the behavior of the checkpoints after an instance crash, the following SQL creates a table (which performs an implicit commit) and inserts a row of data into it without a commit:
create table x(x number) tablespace users;

insert into x values(100);

If the instance is crashed by using SHUTDOWN ABORT, the previous queries on v$database and v$datafile return the following after the database is started up in mount mode:

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
--------------------
293185

SQL> select name,checkpoint_change#,last_change# from v$datafile where name like '%users01%';

NAME CHECKPOINT_CHANGE# LAST_CHANGE#
----------------------------------- -------------------- --------------
/u02/oradata/OMFD1/users01.dbf 293185


SQL> select name,checkpoint_change# from v$datafile_header where name like '%users01%';

NAME CHECKPOINT_CHANGE#
----------------------------------- --------------------
/u02/oradata/OMFD1/users01.dbf 293185

In this case, the stop SCN is not set, which is indicated by the NULL value in the LAST_CHANGE# column. This information enables Oracle, at the time of the next startup, to determine that the instance crashed because the checkpoint on shutdown was not performed. If it had been performed, the LAST_CHANGE# and CHECKPOINT_CHANGE# values would match for each datafile as they did during a clean shutdown. If an instance crashes at shutdown, then instance crash recovery is required the next time the instance starts up.

Recovery from an Instance Crash

Upon the next instance startup that takes place after SHUTDOWN ABORT or a DBMS crash, the Oracle DBMS detects that the stop SCN for datafiles is not set in the control file during startup. Oracle then performs crash recovery. During crash recovery, Oracle applies redo log records from the online redo logs in a process referred to as roll forward to ensure that all transactions committed before the crash are applied to the datafiles. Following roll forward, active transactions that did not commit are identified from the rollback segments and are undone before the blocks involved in the active transactions can be accessed. This process is referred to as roll back. In our example, the following transaction was active but not committed at the time of the SHUTDOWN ABORT, so it needs to be rolled back:

SQL> insert into x values(100);

After instance startup, the X table exists, but remains empty. Instance recovery happens automatically at database startup without database administrator (DBA) intervention. It may take a while because of the need to apply large amounts of outstanding redo changes to data blocks for transactions that completed and those that didn’t complete and require roll back.

Recovery from a Media Failure

Up until this point, the checkpoint start SCN in the datafile header has always matched the datafile checkpoint SCN number held in the control file. This is reasonable because during a checkpoint, the datafile checkpoint SCN in the control file and the start SCN in the datafile header are both updated, along with the system checkpoint SCN. The following SQL shows the start SCN from the datafile header and datafile checkpoint SCN from the control file for the same file:

SQL> select 'controlfile' "SCN location",name,checkpoint_change# from v$datafile where name like '%users01%'
union
select 'file header',name,checkpoint_change# from v$datafile_header where name like '%users01%';

SCN location NAME CHECKPOINT_CHANGE#
-------------- ----------------------------------- --------------------
controlfile /u02/oradata/OMFD1/users01.dbf 293188
file header /u02/oradata/OMFD1/users01.dbf 293188

Unlike the v$datafile view, there is no stop SCN column in the v$datafile_header view because v$datafile_header is not used at instance startup time to indicate that an instance crash occurred. However, the v$datafile_header does provide the Oracle DBMS with the information it requires to perform media recovery. At instance startup, the datafile checkpoint SCN in the control file and the start SCN in the datafile header are checked for equality. If they don’t match, it is a signal that media recovery is
required.

For example, media recovery is required if a media failure has occurred and the original datafile has been replaced with a backup copy. In this case, the start SCN in the backup copy is less than the checkpoint SCN value in the control file, and Oracle requests archived redo logs—generated at the time of previous log switches—in order to reapply the changes required to bring the datafile up to the current point in time.

In order to recover the database from a media failure, you must run the database in ARCHIVELOG mode to ensure that all database changes from the online redo logs are stored permanently in archived redo log files. In order to enable ARCHIVELOG mode, you must run the command ALTERDATABASE ARCHIVELOG when the database is in a mounted state.

You can identify files that need recovery after you have replaced a datafile with an older version by starting the instance in mount mode and running the following SQL:

SQL> select file#,change# from v$recover_file;

FILE# CHANGE#
---------- ----------
4 313401

In this example, file 4 is the datafile in the USERS tablespace. By reexecuting the previous SQL to display the datafile checkpoint SCN in the control file and the start SCN in the datafile header, you can see that the start SCN is older due to the restore of the backup datafile that has taken place:

SQL> select 'controlfile' "SCN location",name,checkpoint_change#
from v$datafile where name like '%users01%'
union
select 'file header',name,checkpoint_change#
from v$datafile_header where name like '%users01%';

SCN location NAME CHECKPOINT_CHANGE#
-------------- ----------------------------------- --------------------
controlfile /u02/oradata/OMFD1/users01.dbf 313551
file header /u02/oradata/OMFD1/users01.dbf 313401
If you were to attempt to open the database, you would receive errors like the following:
ORA-01113: file 4 needs media recovery
ORA-01110: datafile 4: '/u02/oradata/OMFD1/users01.dbf'
You can recover the database by issuing RECOVER DATABASE from SQL*Plus while the database is in a mounted state. If the changes needed to recover the database to the point in time before the crash are in an archived redo log, then you will be prompted to accept the suggested name:
ORA-00279: change 313401 generated at 11/10/2001 18:50:23 needed for thread
ORA-00289: suggestion : /u02/oradata/OMFD1/arch/T0001S0000000072.ARC
ORA-00280: change 313401 for thread 1 is in sequence #72

Specify log: {=suggested | filename | AUTO | CANCEL}

If you respond to the prompt using AUTO, Oracle applies any archived redo logs it needs, followed by any necessary changes in the online redo logs, to bring the database right up to the last committed transaction before the media failure that caused the requirement for the restore.

So far, we’ve considered recovery scenarios where the goal is to recover the database to the most recent transaction. This is known as complete recovery. The RECOVER DATABASE command has several other options that enable you to recover from a backup to a point in time before the most recent transaction by rolling forward and then stopping the application of the redo log changes at a specified point. This is known as incomplete recovery. You can specify a time or an SCN as the recovery point. For example,

recover database until time '2001-11-10:18:52:00';
recover database until change 313459;

Before you perform incomplete recovery, it’s recommended that you restore a complete database backup first. After incomplete recovery, you must open the mounted database with ALTER DATABASE OPEN RESETLOGS. This creates a new incarnation of the database and clears the contents of the existing redo logs to make sure they can’t be applied.

Recovery from a Media Failure Using a Backup Control File

In the previous example, we had access to a current control file at the time of the media failure. This means that none of the start SCN values in the datafile headers exceeded the system checkpoint SCN number in the control file. To recap, the system checkpoint number is given by the following:

SQL> select checkpoint_change# from v$database;

You might be wondering why Oracle needs to maintain the last system checkpoint value in the control file as well as checkpoint SCNs in the control file for each datafile (as used in the previous example). There are two reasons for this. The first is that you might have read-only tablespaces in your database. In this case, the database checkpoint SCN increases, and the checkpoint SCN for the datafiles in the read-only tablespace remains frozen in the control file.

The following SQL report output shows a database with a read-write tablespace (USERS) and read-only tablespace (TEST). The start SCN in the file header and the checkpoint SCN in the control file for TEST are less than the system checkpoint value. Once a tablespace is read only, checkpoints have no effect on the files in it. The other read-write tablespace has checkpoint values that match the system checkpoint:

SCN location NAME CHECKPOINT_CHANGE#
-------------------- ---------------------------------- ----------------
controlfile SYSTEM checkpoint 355390
file header /u02/oradata/OD2/users01.dbf 355390
file in controlfile /u02/oradata/OD2/users01.dbf 355390
file header /u02/oradata/OD2/test01.dbf 355383
file in controlfile /u02/oradata/OD2/test01.dbf 355383

The second reason for the maintenance of multiple checkpoint SCNs in the control file is that you might not have a current control file available at recovery time. In this case, you need to restore an earlier control file before you can perform a recovery. The system checkpoint in the control file may indicate an earlier change than the start SCN in the datafile headers.

The following SQL shows an example where the system checkpoint SCN and datafile checkpoint SCN indicate an earlier change than the start SCN in the datafile header:

SQL> select 'controlfile' "SCN location",'SYSTEM checkpoint' name,checkpoint_change#
from v$database
union
select 'file in controlfile',name,checkpoint_change#
from v$datafile where name like 'users01%'
union
select 'file header',name,checkpoint_change#
from v$datafile_header where name like '%users01%';

SCN location NAME CHECKPOINT_CHANGE#
------------------- ------------------------------ ------------------
controlfile SYSTEM checkpoint 333765
file header /u02/oradata/OD2/users01.dbf 355253
file in controlfile /u02/oradata/OD2/users01.dbf 333765

If try you to recover a database in the usual way in this situation, Oracle detects that the control file is older than some of the datafiles, as indicated by the checkpoint SCN values in the datafile headers, and reports the following message:

SQL> recover database
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done

If you want to proceed with recovery in this situation, you need to indicate to Oracle that a noncurrent control file—possibly containing mismatches in the SCN values identified by the previous error messages—is about to be specified for recovery by using the following command:
recover database using BACKUP CONTROLFILE;


Happy Learning !

16 comments:

  1. Great post , really helped clear lot of doubts

    Krishnan

    ReplyDelete
  2. Thank you! You really have a talent to explain difficult things easy.

    Alexander, junior DBA from Minsk, Belarus

    ReplyDelete
  3. great post ...sir how many scn generated in oracle..when we create table command....

    ReplyDelete
  4. Really a good post with clear explaination

    ReplyDelete
  5. thanks for the post and your time, really helped me a lot

    ReplyDelete
  6. This is the only blog from where i was able to understand the recovery.

    ReplyDelete
  7. Thanks a Lot !! Thanks for your time and this blog

    ReplyDelete
  8. Such a great post ..thank you very much

    ReplyDelete
  9. Great post .. Thank u very much

    ReplyDelete
  10. hi ,

    thanks for the excellent article but still i have one doubt

    why the system change number is different than checkpoint_change# ?

    ReplyDelete
  11. Excellent Work!!
    Have you also written about other oracle concepts??

    ReplyDelete

Thanks for you valuable comments !