Friday, October 21, 2011

Understand Oracle Locking

Overview

In multi-user systems, many users may update the same information at the same time. Locking allows only one user to update a particular data block; another person cannot modify the same data.
The basic idea of locking is that when a user modifies data through a transaction, that data is locked by that transaction until the transaction is committed or rolled back. The lock is held until the transaction is complete - this known as data concurrency.
The second purpose of locking is to ensure that all processes can always access (read) the original data as they were at the time the query began (uncommited modification), This is known as read consistency.
Although locks are vital to enforce database consistency, they can create performance problems. Every time one process issues a lock, another user may be shut out from processing the locked row or table. Oracle allows to lock whatever resources you need - a single row, many rows, an entire table, even many tables. But the larger the scope of the lock, the more processes you potentially shut out.

Oracle provides two different levels of locking: Row Level Lock and Table Level Lock.

Row-Level Locking

With a row-level locking strategy, each row within a table can be locked individually. Locked rows can be updated only by the locking process. All other rows in the table are still available for updating by other processes. Of course, other processes continue to be able to read any row in the table, including the one that is actually being updated. When other processes do read updated rows, they see only the old version of the row prior to update (via a rollback segment) until the changes are actually committed. This is known as a consistent read.
When a process places a row level lock on a record, what really happens?

First, a data manipulation language (DML) lock is placed over the row. This lock prevents other processes from updating (or locking) the row. This lock is released only when the locking process successfully commits the transaction to the database (i.e., makes the updates to that transaction permanent) or when the process is rolled back.

Next, a data dictionary language (DDL) lock is placed over the table to prevent structural alterations to the table. For example, this type of lock keeps the DBA from being able to remove a table by issuing a DROP statement against the table. This lock is released only when the locking process successfully commits the transaction to the database or when the process is rolled back.

Table-Level Locking

With table-level locking, the entire table is locked as an entity. Once a process has locked a table, only that process can update (or lock) any row in the table. None of the rows in the table are available for updating by any other process. Of course, other processes continue to be able to read any row in the table, including the one that is actually being updated.
How does table-level locking work?
The first DML operation that needs to update a row in a table obtains what's called a Row Share Exclusive lock over the entire table. All other query-only processes needing access to the table are informed that they must use the rollback information for the locking process. The lock is released only when the locking process successfully commits the transaction to the database or when the process is rolled back.

Releasing Locks

Many users believe that they are the only users on the system - at least the only ones who count. Unfortunately, this type of attitude is what causes locking problems. We've often observed applications that were completely stalled because one user decided to go to lunch without having committed his or her changes. Remember that all locking (row or table) will prevent other users from updating information. Every application has a handful of central, core tables. Inadvertently locking such tables can affect many other people in a system.
Many users, and some programmers, don't understand that terminating a process does not always release locks. Switching off your workstation before you go home does not always release locks. Locks are released only when changes are committed or rolled back. A user's action is the only thing that distinguishes between committing, aborting, and rolling back changes. Make it a priority to train your users to commit or roll back all outstanding changes before leaving their current screens.

Modes of Locking

Oracle uses two modes of locking in a multi-user database:

Exclusive lock mode (X) prevents the associates resource from being shared. This lock mode is obtained to modify data. The first transaction to lock a resource exclusively is the only transaction that can alter the resource until the exclusive lock is released.

Share lock mode (S) allows the associated resource to be shared, depending on the operations involved. Multiple users reading data can share the data, holding share locks to prevent concurrent access by a writer (who needs an exclusive lock). Several transactions can acquire share locks on the same resource.

Exclusive Locks

SQL Statement

Mode of Lock

SELECT ... FROM table...
No Lock

INSERT INTO table ...
RX

UPDATE table ...
RX

DELETE FROM table ...
RX

LOCK TABLE table IN ROW EXCLUSIVE MODE
RX

LOCK TABLE table IN SHARE ROW EXCLUSIVE MODE
SRX

LOCK TABLE table IN EXCLUSIVE MODE
X

Share Locks

SQL Statement

Mode of Lock

SELECT ... FROM table FOR UPDATE OF ... RS

LOCK TABLE table IN ROW SHARE MODE RS

LOCK TABLE table IN SHARE MODE S

LOCK TABLE table IN SHARE ROW EXCLUSIVE MODE SRX

RS:   Row Share
RX:   Row Exclusive
S:    Share
SRX: Share Row Exclusive
X:    Exclusive

Description of each Lock Mode

The following sections explain each mode of lock, from least restrictive to most restrictive.

Row Share Table Locks (RS)

A row share table lock (also sometimes called a subshare table lock, SS) indicates that the transaction holding the lock on the table has locked rows in the table and intends to update them. A row share table lock is automatically acquired for a table when one of the following SQL statements is executed:

SELECT ... FROM table ... FOR UPDATE OF ... ;
LOCK TABLE table IN ROW SHARE MODE;
A row share table lock is the least restrictive mode of table lock, offering the highest degree of concurrency for a table.

Permitted Operations:

A row share table lock held by a transaction allows other transactions to:

SELECT (query the table)
INSERT, UPDATE, DELETE
or lock rows concurrently in the same table.
Therefore, other transactions can obtain simultaneous row share, row exclusive, share, and share row exclusive table locks for the same table.

Prohibited Operations:

A row share table lock held by a transaction prevents other transactions from exclusive write access to the same table.

When to Lock with ROW SHARE Mode:

Your transaction needs to prevent another transaction from acquiring an intervening share, share row, or exclusive table lock for a table before the table can be updated in your transaction. If another transaction acquires an intervening share, share row, or exclusive table lock, no other transactions can update the table until the locking transaction commits or rolls back.
Your transaction needs to prevent a table from being altered or dropped before the table can be modified later in your transaction.

We use the EMP table for the next examples.

     EMPNO ENAME      JOB
---------- ---------- ---------
      7369 Smith      CLERK
      7499 Allen      SALESMAN
      7521 Ward       SALESMAN
      7566 Jones      MANAGER
      7654 Martin     SALESMAN
      7698 Blake      MANAGER
      7782 Clark      MANAGER
      7788 Scott      ANALYST
      7839 King       PRESIDENT
      7844 Turner     SALESMAN
      7876 Adams      CLERK
      7900 James      TEST
      7902 Ford       ANALYST
      7934 Miller     CLERK

Session 1

Session 2

select job from emp
where job = 'CLERK'
   for update of empno;
OK

select job from emp
where job = 'CLERK'
   for update of empno;
Waiting ....

select job from emp
where job = 'MANAGER'
   for update of empno;
OK

lock table emp in share mode;
OK

lock table emp in exclusive mode;
Waiting ....

insert into emp (empno,ename)
values (9999,'Test');
OK

delete from emp where empno = 9999;
OK

delete from emp where empno = 7876;
Waiting .... (Blocked by Session 1)

update emp set job = 'CLIMBER'
where empno = 7876;
Waiting .... (Blocked by Session 1)

A first look about the locking situation can be found in DBA_LOCKS

SELECT SUBSTR(TO_CHAR(session_id),1,5) "SID",
       SUBSTR(lock_type,1,15) "Lock Type",
       SUBSTR(mode_held,1,15) "Mode Held",
       SUBSTR(blocking_others,1,15) "Blocking?"
FROM dba_locks
/

SID   Lock Type       Mode Held       Blocking?
----- --------------- --------------- ---------------
95    Transaction     Exclusive       Blocking      <- This is Session 1
95    DML             Row-S (SS)      Not Blocking
98    DML             Row-X (SX)      Not Blocking <- This is Session 2
98    Transaction     None            Not Blocking
110   Temp Segment    Row-X (SX)      Not Blocking
111   RS              Row-S (SS)      Not Blocking
111   Control File    Row-S (SS)      Not Blocking
111   XR              Null            Not Blocking
112   Redo Thread     Exclusive       Not Blocking
113   Media Recovery Share           Not Blocking
113   Media Recovery Share           Not Blocking
113   Media Recovery Share           Not Blocking
113   Media Recovery Share           Not Blocking
113   Media Recovery Share           Not Blocking
113   Media Recovery Share           Not Blocking
113   Media Recovery Share           Not Blocking
113   PW              Row-X (SX)      Not Blocking

Row Exclusive Table Locks (RX)

A row exclusive table lock (also called a subexclusive table lock, SX) generally indicates that the transaction holding the lock has made one or more updates to rows in the table. A row exclusive table lock is acquired automatically for a table modified by the following types of statements:

INSERT INTO table ... ;
UPDATE table ... ;
DELETE FROM table ... ;
LOCK TABLE table IN ROW EXCLUSIVE MODE;
A row exclusive table lock is slightly more restrictive than a row share table lock.

Permitted Operations:

A row exclusive table lock held by a transaction allows other transactions to
SELECT (query the table)
INSERT, UPDATE, DELETE
or lock rows concurrently in the same table.
Therefore, row exclusive table locks allow multiple transactions to obtain simultaneous row exclusive and row share table locks for the same table.

Prohibited Operations:

A row exclusive table lock held by a transaction prevents other transactions from manually locking the table for exclusive reading or writing. Therefore, other transactions cannot concurrently lock the table using the following statements:
LOCK TABLE table IN SHARE MODE;
LOCK TABLE table IN EXCLUSIVE MODE;

When to Lock with ROW EXCLUSIVE Mode:

This is the Default Locking Behaviour of Oracle.

Session 1

Session 2

update emp
set ename = 'Zahn';
OK

lock table emp in exclusive mode;
Waiting ....

Share Table Locks (S)

A share table lock is acquired automatically for the table specified in the following statement:
LOCK TABLE table IN SHARE MODE;

Permitted Operations:

A share table lock held by a transaction allows other transactions only to

to SELECT (query the table)
to lock specific rows with SELECT ... FOR UPDATE
or to execute LOCK TABLE ... IN SHARE MODE
statements successfully. No updates are allowed by other transactions. Multiple transactions can hold share table locks for the same table concurrently. In this case, no transaction can update the table (even if a transaction holds row locks as the result of a SELECT statement with the FOR UPDATE clause). Therefore, a transaction that has a share table lock can update the table only if no other transactions also have a share table lock on the same table.

Prohibited Operations:

A share table lock held by a transaction prevents other transactions from modifying the same table and from executing the following statements:

LOCK TABLE table IN SHARE ROW EXCLUSIVE MODE;
LOCK TABLE table IN EXCLUSIVE MODE;
LOCK TABLE table IN ROW EXCLUSIVE MODE;

When to Lock with SHARE Mode

Your transaction only queries the table, and requires a consistent set of the table's data for the duration of the transaction.
You can hold up other transactions that try to update the locked table, until all transactions that hold SHARE locks on the table either commit or roll back.
Other transactions may acquire concurrent SHARE table locks on the same table, also allowing them the option of transaction-level read consistency.
Caution:
Your transaction may or may not update the table later in the same transaction. However, if multiple transactions concurrently hold share table locks for the same table, no transaction can update the table (even if row locks are held as the result of a SELECT... FOR UPDATE statement). Therefore, if concurrent share table locks on the same table are common, updates cannot proceed and deadlocks are common. In this case, use share row exclusive or exclusive table locks instead.

Session 1

Session 2

lock table emp
in share mode;
OK
update emp
set ename = 'Zahn'
where empno = 7900;
commit;
OK
lock table emp
in share mode;
OK
This and other Transactions have to wait until Session 2 commits the Transaction.
update emp
set ename = 'MÃ¼ller'
where empno = 7900;
Waiting ....

select * from emp;
OK

This and other Transactions have to wait until Session 1 commits the Transaction.

This and other Transactions can
get a Share Lock (Lock Switch).

lock table emp
in share mode;
OK

For example, assume that two tables, emp and budget, require a consistent set of data in a third table, dept. For a given department number, you want to update the information in both of these tables, and ensure that no new members are added to the department between these two transactions.
Although this scenario is quite rare, it can be accommodated by locking the dept table in SHARE MODE, as shown in the following example. Because the dept table is rarely updated, locking it probably does not cause many other transactions to wait long.
LOCK TABLE dept IN SHARE MODE; /* Other Transactions have to wait */
UPDATE emp
SET sal = sal * 1.1
WHERE deptno IN
(SELECT deptno FROM dept WHERE loc = 'DALLAS');

UPDATE budget
SET Totsal = Totsal * 1.1
WHERE deptno IN
(SELECT deptno FROM dept WHERE Loc = 'DALLAS');

COMMIT; /* This releases the lock */

Exclusive Table Locks (X)

An exclusive table lock is the most restrictive mode of table lock, allowing the transaction that holds the lock exclusive write access to the table. An exclusive table lock is acquired for a table as follows:
LOCK TABLE table IN EXCLUSIVE MODE;

Permitted Operations:

Only one transaction can obtain an exclusive table lock for a table. An exclusive table lock permits other transactions only to query the table.

Prohibited Operations:

An exclusive table lock held by a transaction prohibits other transactions from performing any type of DML statement or placing any type of lock on the table.

Be careful to use an EXCLUSIVE lock!

Your transaction requires immediate update access to the locked table. When your transaction holds an exclusive table lock, other transactions cannot lock specific rows in the locked table.
Your transaction also ensures transaction-level read consistency for the locked table until the transaction is committed or rolled back.
You are not concerned about low levels of data concurrency, making transactions that request exclusive table locks wait in line to update the table sequentially.

Session 1

Session 2

lock table emp
in exclusive mode;
OK
update emp
set ename = 'Zahn'
where empno = 7900;
commit;
OK
lock table emp
in exclusive mode;
OK

select * from emp;
OK

This and other Transactions have to wait until Session 1 commits the Transaction.

This and other Transactions cannot
get any other Lock (No Lock Switch).
lock table emp
in share mode;
Waiting ....
lock table emp
in exclusive mode; Waiting ....
update emp
set ename = 'Zahn'
where empno = 7900; Waiting ....

Data Lock Conversion Versus Lock Escalation

A transaction holds exclusive row locks for all rows inserted, updated, or deleted within the transaction. Because row locks are acquired at the highest degree of restrictiveness, no lock conversion is required or performed.
Oracle automatically converts a table lock of lower restrictiveness to one of higher restrictiveness as appropriate. For example, assume that a transaction uses a SELECT statement with the FOR UPDATE clause to lock rows of a table. As a result, it acquires the exclusive row locks and a row share table lock for the table. If the transaction later updates one or more of the locked rows, the row share table lock is automatically converted to a row exclusive table lock.
Lock escalation occurs when numerous locks are held at one level of granularity (for example, rows) and a database raises the locks to a higher level of granularity (for example, table). For example, if a single user locks many rows in a table, some databases automatically escalate the user's row locks to a single table. The number of locks is reduced, but the restrictiveness of what is being locked is increased.
Oracle never escalates locks. Lock escalation greatly increases the likelihood of deadlocks.

Deadlocks

A deadlock can occur when two or more users are waiting for data locked by each other. Deadlocks prevent some transactions from continuing to work. The next example illustrates two transactions in a deadlock.

Example

Session 1

Session 2
Time

update emp set
sal = sal * 1.1
where empno = 7369;

1 row updated.
update emp set
sal = sal * 1.1
where empno = 7934;

ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

update emp set
mgr = 1342
where empno = 7934;

1 row updated.
update emp set
mgr = 1342
where empno = 7369;

Waiting ...

A

B

C

In the example, no problem exists at time point A, as each transaction has a row lock on the row it attempts to update. Each transaction proceeds without being terminated. However, each tries next to update the row currently held by the other transaction. Therefore, a deadlock results at time point B, because neither transaction can obtain the resource it needs to proceed or terminate. It is a deadlock because no matter how long each transaction waits, the conflicting locks are held.

Automatic Deadlock Detection

Oracle performs automatic deadlock detection for enqueue locking deadlocks. Deadlock detection is initiated whenever an enqueue wait times out, if the resource type required is regarded as deadlock sensitive, and if the lock state for the resource has not changed. If any session that is holding a lock on the required resource in an incompatible mode is waiting directly or indirectly for a resource that is held by the current session in an incompatible mode, then a deadlock exists.
If a deadlock is detected, the session that was unlucky enough to find it aborts its lock request and rolls back its current statement in order to break the deadlock. Note that this is a rollback of the current statement only, not necessarily the entire transaction. Oracle places an implicit savepoint at the beginning of each statement, called the default savepoint, and it is to this savepoint that the transaction is rolled back in the first case. This is enough to resolve the technical deadlock. However, the interacting sessions may well remain blocked.

ORA-60 error in ALERT.LOG

An ORA-60 error is returned to the session that found the deadlock, and if this exception is not handled, then depending on the rules of the application development tool, the entire transaction is normally rolled back, and a deadlock state dump written to the user dump destination directory. This, of course, resolves the deadlock entirely. The enqueue deadlocks statistic in V$SYSSTAT records the number of times that an enqueue deadlock has been detected.
select name, value
from v$sysstat
where name = 'enqueue deadlocks';

NAME VALUE
------------------------------------------------------------ ----------
enqueue deadlocks 1

How to avoid Deadlocks

Application developers can eliminate all risk of enqueue deadlocks by ensuring that transactions requiring multiple resources always lock them in the same order. However, in complex applications, this is easier said than done, particularly if an ad hoc query tool is used. To be safe, you should adopt a strict locking order, but you must also handle the ORA-60 exception appropriately. In some cases it may be sufficient to pause for three seconds, and then retry the statement. However, in general, it is safest to roll back the transaction entirely, before pausing and retrying.

Referential Integrity Locks (RI Locks)

With the introduction of automated referential integrity (RI) came a whole new suite of locking problems. What seems at first to be a DBA's blessing can turn out to be an absolute nightmare when the DBA doesn't fully understand the implications of this feature. Why is this so?
RI constraints are validated by the database via a simple SELECT from the dependent (parent) table in question-very simple, very straightforward. If a row is deleted or a primary key is modified within the parent table, all associated child tables need to be scanned to make sure no orphaned records will result. If a row is inserted or the foreign key is modified, the parent table is scanned to ensure that the new foreign key value(s) is valid. If a DELETE CASCADE clause is included, all associated child table records are deleted. Problems begin to arise when we look at how the referential integrity is enforced.
Oracle assumes the existence of an index over every foreign key within a table. This assumption is valid for a primary key constraint or even a unique key constraint but a little presumptuous for every foreign key.

Index or no Index on Foreign Key's ?

If an index exists on the foreign key column of the child table, no DML locks, other than a lock over the rows being modified, are required.
If the index is not created, a share lock is taken out on the child table for the duration of the transaction.
The referential integrity validation could take several minutes or even hours to resolve. The share lock over the child table will allow other users to simultaneously read from the table, while restricting certain types of modification. The share lock over the table can actually block other normal, everyday modification of other rows in that table.
You can use the script: show_missing_fk_index.sql to check unindexed foreign keys:

SQL> start show_missing_fk_index.sql

Please enter Owner Name and Table Name. Wildcards allowed (DEFAULT: %)

eg.: SCOTT, S% OR %
eg.: EMP, E% OR %

Owner <%>: SCOTT
Tables <%>:

Unindexed Foreign Keys owned by Owner: SCOTT

Table Name 1. Column Constraint Name
------------------------ ------------------------ ---------------
EMP DEPTNO FK_EMP_DEPT

What is so dangerous about a Cascading Delete ?

Oracle allows to enhance a referential integrity definition to included cascading deletion. If a row is deleted from a parent table, all of the associated children will be automatically purged. This behavior obviously will affect an application's locking strategy, again circumnavigating normal object locking, removing control from the programmer.
What is so dangerous about a cascading delete? A deleted child table might, in turn, have its own child tables. Even worse, the child tables could have table-level triggers that begin to fire. What starts out as a simple, single-record delete from a harmless table could turn into an uncontrollable torrent of cascading deletes and stored database triggers.
DELETE CASCADE constraints can be found with the following script:

SQL> SELECT OWNER,
       CONSTRAINT_NAME,
       CONSTRAINT_TYPE,
       TABLE_NAME,
       DELETE_RULE
FROM USER_CONSTRAINTS
WHERE DELETE_RULE IS NOT NULL;
CONSTRAINT_NAME                C TABLE_NAME        DELETE_RU
------------------------------ - ----------------- ---------
FK_EMP_DEPT                    R EMP               CASCADE

Blocking Locks

Oracle resolves true enqueue deadlocks so quickly that overall system activity is scarcely affected. However, blocking locks can bring application processing to a standstill. For example, if a long-running transaction takes a shared mode lock on a key application table, then all updates to that table must wait.
There are numerous ways of attempting to diagnose blocking lock situations, normally with the intention of killing the offending session.
Blocking locks are almost always TX (transaction) locks or TM (table) locks . When a session waits on a TX lock, it is waiting for that transaction to either commit or roll back. The reason for waiting is that the transaction has modified a data block, and the waiting session needs to modify the same part of that block. In such cases, the row wait columns of V$SESSION can be useful in identifying the database object, file, and block numbers concerned, and even the row number in the case of row locks. V$LOCKED_OBJECT can then be used to obtain session information for the sessions holding DML locks on the crucial database object. This is based on the fact that sessions with blocking TX enqueue locks always hold a DML lock as well, unless DML locks have been disabled.
It may not be adequate, however, to identify a single blocking session, because it may, in turn, be blocked by another session. To address this requirement, Oracle's UTLLOCKT.SQL script gives a tree-structured report showing the relationship between blocking and waiting sessions. Some DBAs are loath to use this script because it creates a temporary table, which will block if another space management transaction is caught behind the blocking lock. Although this is extremely unlikely, the same information can be obtained from the DBA_WAITERS view if necessary. The DBA_WAITERS view is created by Oracle's catblock.sql script.
Some application developers attempt to evade blocking locks by preceding all updates with a SELECT FOR UPDATE NOWAIT or SELECT FOR UPDATE SKIP LOCKED statement. However, if they allow user interaction between taking a sub-exclusive lock in this way and releasing it, then a more subtle blocking lock situation can still occur. If a user goes out to lunch while holding a sub-exclusive lock on a table, then any shared lock request on the whole table will block at the head of the request queue, and all other lock requests will queue behind it.
Diagnosing such situations and working out which session to kill is not easy, because the diagnosis depends on the order of the waiters. Most blocking lock detection utilities do not show the request order, and do not consider that a waiter can block other sessions even when it is not actually holding any locks.

Lock Detection Scripts

The following scripts can be used to track and identify blocking locks. The scripts shows the following lock situation.

Session 1 Session 2

select empno
from emp for update of empno;
update emp set ename = 'MÃ¼ller'
where empno = 7369;

This script shows actual DML-Locks (incl. Table-Name), WAIT = YES means
that users are waiting for a lock.

WAI OSUSER PROCESS LOCKER T_OWNER OBJECT_NAME   PROGRAM
--- ------- -------- ------- -------- ------------- --------------
NO zahn    8935     SCOTT   -        Record(s)     sqlplus@akira
YES zahn    8944     SCOTT   -        Record(s)     sqlplus@akira
NO zahn    8935     SCOTT   SCOTT    EMP           sqlplus@akira
NO zahn    8944     SCOTT   SCOTT    EMP           sqlplus@akira

This script show users waiting for a lock, the locker and the SQL-Command they are waiting for a lock, the osuser, schema and PIDs are shown as well.

Current Lock-Waits

OS_LOCKER LOCKER_SCHEMA LOCKER_PID OS_WAITER WAITER_SCHEMA WAITER_PID
---------- -------------- ---------- ----------- --------------- ----------
zahn SCOTT 8935 zahn SCOTT 8944

SQL_TEXT_WAITER
--------------------------------------------------------------------------
TX: update emp set ename = 'MÃ¼ller' where empno = 7369

This is the original Oracle script to print out the lock wait-for graph in a tree structured fashion. This script prints the sessions in the system that are waiting for locks, and the locks that they are waiting for. The printout is tree structured. If a sessionid is printed immediately below and to the right of another session, then it is waiting for that session. The session ids printed at the left hand side of the page are the ones that everyone is waiting for (Session 96 is waiting for session 88 to complete):
WAITING_SESSION   LOCK_TYPE    MODE_REQUESTED MODE_HELD  LOCK_ID1  LOCK_ID2
----------------- ------------ -------------- ---------- --------- --------
88                None
   96             Transaction  Exclusive      Exclusive  262144    3206
The lock information to the right of the session id describes the lock that the session is waiting for (not the lock it is holding). Note that this is a script and not a set of view definitions because connect-by is used in the implementation and therefore a temporary table is created and dropped since you cannot do a join in a connect-by.
This script has two small disadvantages. One, a table is created when this script is run. To create a table a number of locks must be acquired. This might cause the session running the script to get caught in the lock problem it is trying to diagnose. Two, if a session waits on a lock held by more than one session (share lock) then the wait-for graph is no longer a tree and the conenct-by will show the session (and any sessions waiting on it) several times.

Distributed Transactions

For distributed transactions, Oracle is unable to distinguish blocking locks and deadlocks, because not all of the lock information is available locally. To prevent distributed transaction deadlocks, Oracle times out any call in a distributed transaction if it has not received any response within the number of seconds specified by the _DISTRIBUTED_LOCK_TIMEOUT parameter. This timeout defaults to 60 seconds. If a distributed transaction times out, an ORA-2049 error is returned to the controlling session. Robust applications should handle this exception in the same way as local enqueue deadlocks.
select name,value
from v$parameter
where name = 'distributed_lock_timeout';

NAME VALUE
----------------------------- ------
distributed_lock_timeout 60

ITL Entry Shortages

There is an interested transaction list (ITL) in the variable header of each Oracle data block. When a new block is formatted for a segment, the initial number of entries in the ITL is set by the INITRANS parameter for the segment. Free space permitting, the ITL can grow dynamically if required, up to the limit imposed by the database block size, or the MAXTRANS parameter for the segment, whichever is less.
Every transaction that modifies a data block must record its transaction identifier and the rollback segment address for its changes to that block in an ITL entry. (However, for discrete transactions, there is no rollback segment address for the changes.) Oracle searches the ITL for a reusable or free entry. If all the entries in the ITL are occupied by uncommitted transactions, then a new entry will be dynamically created, if possible.
If the block does not have enough internal free space (24 bytes) to dynamically create an additional ITL entry, then the transaction must wait for a transaction using one of the existing ITL entries to either commit or roll back. The blocked transaction waits in shared mode on the TX enqueue for one of the existing transactions, chosen pseudo-randomly. The row wait columns in V$SESSION show the object, file, and block numbers of the target block. However, the ROW_WAIT_ROW# column remains unset, indicating that the transaction is not waiting on a row-level lock, but is probably waiting for a free ITL entry.
The most common cause of ITL entry shortages is a zero PCTFREE setting. Think twice before setting PCTFREE to zero on a segment that might be subject to multiple concurrent updates to a single block, even though those updates may not increase the total row length. The degree of concurrency that a block can support is dependent on the size of its ITL, and failing that, the amount of internal free space. Do not, however, let this warning scare you into using unnecessarily large INITRANS or PCTFREE settings. Large PCTFREE settings compromise data density and degrade table scan performance, and non-default INITRANS settings are seldom warranted.
One case in which a non-default INITRANS setting is warranted is for segments subject to parallel DML. If a child transaction of a PDML transaction encounters an ITL entry shortage, it will check whether the other ITL entries in the block are all occupied by its sibling transactions and, if so, the transaction will roll back with an ORA-12829 error, in order to avoid self-deadlock. The solution in this case is to be content with a lower degree of parallelism, or to rebuild the segment with a higher INITRANS setting. A higher INITRANS value is also needed if multiple serializable transactions may have concurrent interest in any one block.

Check ITL Waits

The following SQL-Statement shows the number of ITL-Waits per table (Interested Transaction List). INITRANS and/or PCTFREE for those tables is to small (could also be that MAXTRANS is too small). Note that STATISTICS_LEVEL must be set to TYPICAL or ALL, MAXTRANS has been desupported in Oracle 10g and now is always 255 (maximum).
select name,value
from v$parameter
where name = 'statistics_level';
NAME                                 VALUE
------------------------------------ -----------
statistics_level                     TYPICAL
TTITLE "ITL-Waits per table (INITRANS to small)"
set pages 1000
col owner format a15 trunc
col object_name format a30 word_wrap
col value format 999,999,999 heading "NBR. ITL WAITS"
--
select owner,
       object_name||' '||subobject_name object_name,
       value
from v$segment_statistics
where statistic_name = 'ITL waits'
and value > 0
order by 3,1,2;
--
col owner clear
col object_name clear
col value clear
ttitle off
/

Conclusion

Exclusive Locks lock a resource exclusively, Share Locks can be acquired by more than one Session as long as the other Session holding a Share Lock have no open Transaction. A Share Lock can be "switched" from one Session to another.
Application developers can eliminate the risk of deadlocks by ensuring that transactions requiring multiple resources always lock them in the same order.
A DELETE CASCADE can start out as a simple, single-record delete from a harmless table could turn into an uncontrollable torrent of cascading deletes and stored database triggers.
Blocking locks are almost always TX (transaction) locks or TM (table) locks. Oracle always performs locking automatically to ensure data concurrency, data integrity, and statement-level read consistency. Usually, the default locking mechanisms should not be overriden.
For distributed transactions, Oracle is unable to distinguish blocking locks and deadlocks.
The most common cause of ITL entry shortages is a zero PCTFREE setting.

HAPPY LEARNING!
AJITHPATHIYIL

Rollback Of Single Transaction

SQL> select local_tran_id from dba_2pc_pending; 

LOCAL_TRAN_ID

----------------------

1.1306666

29.1398766

21.996344

27.1267023

30.1222781

4.1280743

13.928667

1.469932

33.554319

2.502494

8.263605

LOCAL_TRAN_ID

----------------------

56.19.234600

58.12.278189

66.9.233351

7.32.277783

15 rows selected.

SQL> Rollback Force '26.30.1222781';

Rollback complete.

SQL> select local_tran_id from dba_2pc_pending where  local_tran_id='26.30.1222781';

LOCAL_TRAN_ID

----------------------

26.30.1222781

SQL> execute dbms_transaction.purge_lost_db_entry('26.30.1222781');

PL/SQL procedure successfully completed.

SQL> select local_tran_id from dba_2pc_pending where  local_tran_id='26.30.1222781';

no rows selected

SQL> select name from v$database;

NAME

---------

AJITHDB

Thursday, October 13, 2011

ORACLE PROCESSES, SESSIONS AND CONNECTIONS

In linux/unix systems a process is a physical process or thread which can be seen with "ps" command for example, There are many types of processes in Oracle, Background processes like

ajithpathiyil:/export/home/oracle>>ps -ef|grep ora_
oracle 103 1 0 Oct 02 ? 45:01 ora_lmon_AJITHDB
oracle 89 1 0 Oct 02 ? 27:30 ora_pmon_AJITHDB
oracle 105 1 0 Oct 02 ? 41:55 ora_lmd0_AJITHDB
oracle 159 1 0 Oct 02 ? 29:27 ora_lgwr_AJITHDB
oracle 161 1 0 Oct 02 ? 9:59 ora_ckpt_AJITHDB
oracle 163 1 0 Oct 02 ? 1:56 ora_smon_AJITHDB
oracle 165 1 0 Oct 02 ? 0:02 ora_reco_AJITHDB
oracle 791 1 0 Oct 02 ? 7:00 ora_arc0_AJITHDB
..
..
..
ajithpathiyil:/export/home/oracle>>

etc..... Meanwhil user processes are like dedicated servers or shared server (multi-threaded server -- aka MTS -- configuration)

A connection is a "physical circuit", a pathway to a database. You can be connected to a database yet have 0 or 1 or MORE sessions going on that connection. We can see that with sqlplus, consider (single user system here, its all about me)

ajithpathiyil:/export/home/oracle>>$ ps -auxww |grep oracleora920

ajithpathiyil:/export/home/oracle>>$ sqlplus /nolog

SQL*Plus: Release 9.2.0.1.0 - Production on Sat Sep 28 10:36:03 2002

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

SQL> !ps -auxww |grep oracleora920
ajithpathiyil 19971 0.0 0.1 2196 916 pts/1 S 10:36 0:00 /bin/bash -c ps -auxww
grep oracleora920
ajithpathiyil 19973 0.0 0.1 1736 564 pts/1 S 10:36 0:00 grep oracleora920
SQL>

no process, no nothing

SQL> connect /
Connected.

SQL> !ps -auxww
grep oracleora920
ora920 19974 1.5 2.2 230976 11752 ? S 10:36 0:00 oracleora920 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
ajithpathiyil 19975 0.0 0.1 2196 916 pts/1 S 10:36 0:00 /bin/bash -c ps -auxww
grep oracleora920
ajithpathiyil 19977 0.0 0.1 1736 564 pts/1 S 10:36 0:00 grep oracleora920

got my process now...

SQL> disconnect
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production

With the Partitioning, OLAP and Oracle Data Mining options

JServer Release 9.2.0.1.0 - Production

SQL> !ps -auxww |grep oracleora920

ora920 19974 0.6 2.3 230976 11876 ? S 10:36 0:00 oracleora920 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
ajithpathiyil 19978 0.0 0.1 2196 916 pts/1 S 10:36 0:00 /bin/bash -c ps -auxww
grep oracleora920
ajithpathiyil 19980 0.0 0.1 1736 564 pts/1 S 10:36 0:00 grep oracleora920

SQL> select * from dual;

SP2-0640: Not connected

still have my process, but no session, the message is a little "misleading". Technically -- I have a connection, I don't have a session further, autotrace in sqlplus can be used to show that you can have

a) a connection
b) that uses a single process
c) to service two sessions:

SQL> select username from v$session where username is not null;

USERNAME
------------------------------
AJITHPATHIYIL

SQL>

one session, ME

SQL> select username, program from v$process;

USERNAME PROGRAM
--------------- ------------------------------------------------
PSEUDO
ora920 oracle@ajithpathiyil.home.com (PMON)
ora920 oracle@ajithpathiyil.home.com (DBW0)
ora920 oracle@ajithpathiyil.home.com (LGWR)
ora920 oracle@ajithpathiyil.home.com (CKPT)
ora920 oracle@ajithpathiyil.home.com (SMON)
ora920 oracle@ajithpathiyil.home.com (RECO)
ora920 oracle@ajithpathiyil.home.com (CJQ0)
ora920 oracle@ajithpathiyil.home.com (QMN0)
ora920 oracle@ajithpathiyil.home.com (S000)
ora920 oracle@ajithpathiyil.home.com (D000)
ora920 oracle@ajithpathiyil.home.com (ARC0)
ora920 oracle@ajithpathiyil.home.com (ARC1)
ajithpathiyil oracle@ajithpathiyil.home.com (TNS V1-V3)

14 rows selected.

SQL>

you can see all of the backgrounds and my dedicated server...

SQL> set autotrace on statistics;

Autotrace for statistics uses ANOTHER session so it can query up the stats for your CURRENT session without impacting the STATS for that session!

ajithpathiyil:/export/home/oracle> select username from v$session where username is not null;

USERNAME
------------------------------
AJITHPATHIYIL
AJITHPATHIYIL

see, two sessions but....

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
418 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

SQL> select username, program from v$process;

USERNAME PROGRAM
--------------- ------------------------------------------------
PSEUDO
ora920 oracle@ajithpathiyil.home.com (PMON)
ora920 oracle@ajithpathiyil.home.com (DBW0)
ora920 oracle@ajithpathiyil.home.com (LGWR)
ora920 oracle@ajithpathiyil.home.com (CKPT)
ora920 oracle@ajithpathiyil.home.com (SMON)
ora920 oracle@ajithpathiyil.home.com (RECO)
ora920 oracle@ajithpathiyil.home.com (CJQ0)
ora920 oracle@ajithpathiyil.home.com (QMN0)
ora920 oracle@ajithpathiyil.home.com (S000)
ora920 oracle@ajithpathiyil.home.com (D000)
ora920 oracle@ajithpathiyil.home.com (ARC0)
ora920 oracle@ajithpathiyil.home.com (ARC1)
ajithpathiyil oracle@ajithpathiyil.home.com (TNS V1-V3)

14 rows selected.

SQL>
same 14 processes...

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
1095 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed

SQL>

A connection is a physical circuit between you and the database.

A connection might be one of many types -- most popular begin DEDICATED server and SHARED server.

Zero, one or more sessions may be established over a given connection to the database as show above with sqlplus.

A process will be used by a session to execute statements.
Sometimes there is a one to one relationship between CONNECTION->SESSION->PROCESS (eg: a normal dedicated server connection).
Sometimes there is a one to many from connection to sessions (eg: like autotrace, one connection, two sessions, one process).

A process does not have to be dedicated to a specific connection or session however, for example when

Using shared server (MTS), your SESSION will grab a process from a pool of processes in order to execute a statement.

When the call is over, that process is released back to the pool of processes.

Happy Learning!

Ajithpathiyil

Friday, September 30, 2011

RAC Cheatsheet-2

Opatch in RAC

===============

. All-Node Patch

. Shutdown all Oracle instances on all nodes

. Apply the patch to all nodes

. Bring all nodes up

. Minimum downtime

. Shutdown the Oracle instance on node 1

. Apply the patch to the Oracle instance on node 1

. Shutdown the Oracle instance on node 2

. Apply the patch to the Oracle instance on node 2

. Shutdown the Oracle instance on node 3

. At this point, instances on nodes 1 and 2 can be brought up

. Apply the patch to the Oracle instance on node 3

. Startup the Oracle instance on node 3

. Rolling patch (no downtime)

. Shutdown the Oracle instance on node 1

. Apply the patch to the Oracle instance on node 1

. Start the Oracle instance on node 1

. Shutdown the Oracle instance on node 2

. Apply the patch to the Oracle instance on node 2

. Start the Oracle instance on node 2

. Shutdown the Oracle instance on node 3

. Apply the patch to the Oracle instance on node 3

. Start the Oracle instance on node 3

5 - How to determine if a patch is a "rolling patch" or not?

- 9i or 10gR1: (run)

$ opatch query -is_rolling

Opatch will ask the patch location and then will inform if the patch is or

not a "rolling patch"

- 10gR2: (run)

$ opatch query -all | grep rolling

Patching one node at time

The Opatch strategies discussed above (All-Node, Min. Down-Time, and Rolling)

presumes that all nodes will be patched at the same time. Additionally,

each node can be patched individually, at different times, using the "-local"

key word, which will patch only the local node.

opatch apply -local

What is OCR and Voting disk

===========================

The voting disk is a file that contains and manages information of all the node memberships

OCR is a file that manages the cluster and RAC configuration.

Version of CRS

---------------

crsctl query crs activeversion

command gives the details about voting disk

-------------------------------------------

crsctl query css votedisk

to backup voting disk

---------------------

dd if=voting_disk_name of=backup_file_name

alpgadbq02sec:/export/users/crsqa $ dd if=/ocrvoteqa/crsqa/oracle/admin/vote_file of=/export/users/crsqa/vote_file2

20000+0 records in

20000+0 records out

alpgadbq02sec:/export/users/crsqa $ ls -lrt /export/users/crsqa/vote_file2

-rw-r--r-- 1 crsqa dbaqa 10240000 Mar 25 09:38 /export/users/crsqa/vote_file2

alpgadbq02sec:/export/users/crsqa $ ls -lrt /ocrvoteqa/crsqa/oracle/admin/vote_file

-rw-r--r-- 1 crsqa dbaqa 10240000 Mar 25 09:39 /ocrvoteqa/crsqa/oracle/admin/vote_file

alpgadbq02sec:/export/users/crsqa $ file /export/users/crsqa/vote_file2

Recover voting disk from file

-----------------------------

Take a backup of all voting disk:

dd if=voting_disk_name of=backup_file_name

The following can be used to restore the voting disk from the backup file created.

dd if=backup_file_name of=voting_disk_name

As CRS User

cinohapdbp01sec:/crslab1/oracle/product/10.2.0/crslab1/cdata/crslab1 $ dd if=/ocrvote/crslab1/oracle/admin/Vote_File of=/crslab1/oracle/product/10.2.0/crslab1/log/vote

crsctl command to add and delete the voting disks

-------------------------------------------------

In 11g online voting disk removal

In 10g it's offline

cinohapdbp01sec # /crslab1/oracle/product/10.2.0/crslab1/bin/crsctl delete css votedisk /ocrvote/crslab1/oracle/admin/Vote_File2 -force

successful deletion of votedisk /ocrvote/crslab1/oracle/admin/Vote_File2.

cinohapdbp01sec #

crsctl delete css votedisk path

crsctl add css votedisk path

use -force option when cluster is down

crsctl add css votedisk path -force

Location of OCR

==================

/cdata/

ocrconfig -showbackup

atlease 2 backup in 4 hr gap

one daily back

one weekly back

ocrconfig -help

Manual backup of CRS

---------------------

As root

Backup=> ocrconfig -export

Bringdown CRS

Restore=> ocrconfig -import

Bring up CRS

cinohapdbp01sec # /crslab1/oracle/product/10.2.0/crslab1/bin/ocrconfig -export /crslab1/oracle/product/10.2.0/crslab1/log/ocr_bak

cinohapdbp01sec # /crslab1/oracle/product/10.2.0/crslab1/bin/ocrconfig -Import /crslab1/oracle/product/10.2.0/crslab1/log/ocr_bak

To change loc

==============

ocrconfig -backuploc

Voting disk details

====================

crsctl query css votedisk

ocrcheck

To check No of instances

==========================

SELECT * FROM V$ACTIVE_INSTANCES;

pfile parameters

=================

ebizdb1.__db_cache_size=377487360

ebizdb2.__db_cache_size=394264576..etc

*.cluster_database_instances=2

*.cluster_database=true

ebizdb2.instance_number=2

ebizdb1.instance_number=1

ebizdb1.undo_tablespace='UNDOTBS1'

ebizdb2.undo_tablespace='UNDOTBS2'

To stop the instance and Db

============================

export ORACLE_SID=ebizdb2

emctl stop dbconsole

srvctl stop instance -d ebizdb -i ebizdb1

srvctl stop asm -n ebizdb1

srvctl stop nodeapps -n ebizdb1

srvctl status database -d ebizdb

srvctl status instance -d ebizdb -i ebizdb1

no of node

===========

$ORA_CRS_HOME/bin/olsnodes -n

cihsnsddb001.sensing.ge.com

To stop clusterware

====================

crsctl check crs

crsctl stop crs

/etc/init.d/init.crs stop

crs_stat -t

pa55word

crs_start -all

crsctl query css votedisk

can i try to delete a node from RAC and add it again

fdisk -l

iscsi-ls

/etc/init.d/o2cb status

OCFS

======

create

mkfs.ocfs2 -b 4k -C 32K -N 4 -L ebizdata_2 /dev/sdd1

mkfs.ocfs2 -b 4K -C 32K -N 4 -L ebizdata_3 /dev/sdc1

mkfs.ocfs2 -b 4K -C 32K -N 4 -L R12data_1 /dev/hda1

mount (cat /etc/fstab)

mount -t ocfs2 -o datavolume,nointr -L "crs" /d01/crsdata/ebizdb

mount -t ocfs2 -o datavolume,nointr -L "ebizdata_3" /d02/oradata/ebizdb

ASM

create

/etc/init.d/oracleasm createdisk VOL2 /dev/sde1

check

openfiler

============

service iscsi-target restart

rac node

========

service iscsi restart

/etc/init.d/oracleasm listdisks

after mount add value in

/etc/fstab

dmesg | sort | grep '^Attached scsi disk'

add "DiscoveryAddress=openfiler-priv:3260" in /etc/iscsi.conf

DiscoveryAddress=openfiler-priv:3260

rpm -Uvh ocfs2-2.6.9-42.ELsmp-1.2.9-1.el4.i686.rpm ocfs2console-1.2.7-1.el4.i386.rpm ocfs2-tools-1.2.7-1.el4.i386.rpm ocfs2-tools-1.2.7-1.el4.i386.rpm

ebizdb1:/root $ ./iscsi-ls-map.sh

Host / SCSI ID SCSI Device Name iSCSI Target Name

---------------- ----------------------- -----------------

2 /dev/sdc ebizdata_3

3 /dev/sdb ebizdata_2

4 /dev/sdd ebizdata_1

5 /dev/sde asm3

6 /dev/sdf asm2

7 /dev/sdh asm1

8 /dev/sdg crs

ebizdb1:/root $

ebizdb1:/root $ fdisk /dev/sdc

Command (m for help): n

Command action

e extended

p primary partition (1-4)

Partition number (1-4): 1

First cylinder (1-1019, default 1):

Using default value 1

Last cylinder or +size or +sizeM or +sizeK (1-1019, default 1019):

Using default value 1019

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table.

Syncing disks.

ebizdb1:/root $ ./iscsi-ls-map.sh

Host / SCSI ID SCSI Device Name iSCSI Target Name

---------------- ----------------------- -----------------

2 /dev/sdc ebizdata_3

3 /dev/sdb ebizdata_2

4 /dev/sdd ebizdata_1

5 /dev/sde asm3

6 /dev/sdf asm2

7 /dev/sdh asm1

8 /dev/sdg crs

ebizdb1:/root $ mkfs.ocfs2 -b 4K -C 32K -N 4 -L ebizdata_3 /dev/sdc1

COLUMN instance_name FORMAT a13

COLUMN host_name FORMAT a9

COLUMN failover_method FORMAT a15

COLUMN failed_over FORMAT a11

SELECT instance_name , host_name , NULL AS failover_type , NULL AS failover_method , NULL AS failed_over FROM v$instance

UNION

SELECT NULL , NULL , failover_type , failover_method , failed_over

FROM v$session WHERE username = 'SYSTEM';

388577.1-convert to RAC

[root@ebizdb1 ~]# service iscsi restart

Searching for iscsi-based multipath maps

Found 0 maps

Stopping iscsid: [ OK ]

Removing iscsi driver: ERROR: Module iscsi_sfnet is in use

[FAILED]

[root@ebizdb1 ~]#

[root@ebizdb1 ~]# service iscsi restart

Searching for iscsi-based multipath maps

Found 0 maps

Stopping iscsid: iscsid not running

Removing iscsi driver: ERROR: Module iscsi_sfnet is in use

[FAILED]

to install

===========

rpm -Uvh compat*

check

=====

rpm -q compa*

15920685

/dev/hdc6

[root@ebizapp apps_12]# rpm -Uvh compat*

warning: compat-gcc-c++-7.3-2.96.128.i386.rpm: V3 DSA signature: NOKEY, key ID 73307de6

Preparing... ########################################### [100%]

file /usr/lib/libstdc++-2-libc6.1-1-2.9.0.so from install of compat-libstdc++-7.3-2.96.128 conflicts with file from package compat-libstdc++-296-2.96-132.7.2

file /usr/lib/libstdc++-3-libc6.2-2-2.10.0.so from install of compat-libstdc++-7.3-2.96.128 conflicts with file from package compat-libstdc++-296-2.96-132.7.2

[root@ebizapp apps_12]#

runcluvfy.sh stage -post crs -n ebizdb1,ebizdb2 -verbose

./runcluvfy.sh stage -pre dbinst -n ebizdb1,ebizdb2 -r 10gR2 -verbose

Manually deleting the services and clean-up

Removing a Node from a 10g RAC Cluster

======================================

Doc ID: Note:269320.1

=======================

/etc/init.d/init.evmd stop

/etc/init.d/init.evmd disable

/etc/init.d/init.cssd stop

/etc/init.d/init.cssd disable

/etc/init.d/init.crsd stop

/etc/init.d/init.crsd disable

/etc/init.d/init.crs stop

/etc/init.d/init.crs disable

rm -rf /etc/oracle /etc/oraInst.loc /etc/oratab

rm -rf /etc/init.d/init.crsd /etc/init.d/init.crs /etc/init.d/init.cssd /etc/init.d/init.evmd

rm -rf /etc/rc2.d/K96init.crs /etc/rc2.d/S96init.crs etc/rc3.d/K96init.crs \

rm -rf /etc/rc3.d/S96init.crs /etc/rc4.d/K96init.crs /etc/rc4.d/S96init.crs \

rm -rf /etc/rc5.d/K96init.crs /etc/rc5.d/S96init.crs /etc/rc.d/rc0.d/K96init.crs \

rm -rf /etc/rc.d/rc1.d/K96init.crs /etc/rc.d/rc6.d/K96init.crs /etc/rc.d/rc4.d/K96init.crs

cp /etc/inittab.orig /etc/inittab

rm -rf /etc/inittab.crs /etc/inittab.no_crs

rm -rf /tmp/*

rm -rf /tmp/.oracle

rm -rf /usr/local/bin/dbhome /usr/local/bin/oraenv /usr/local/bin/coraenv

rm -rf /var/tmp/.oracle

rm -rf /opt/oracle/*

rm -rf /u03/oracrs/

./runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1 CLUSTER_NODES=ebizdb1,ebizdb2

runInstaller -updateNodeList -noClusterEnabled ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1 CLUSTER_NODES=ebizdb1,ebizdb2 CRS=false "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=.

[root@ebizdb3 install]# ./rootdelete.sh local nosharedvar nosharedhome

[root@ebizdb1 install]# ./rootdeletenode.sh ebizdb3,3

runInstaller -updateNodeList ORACLE_HOME=/u01/app/crs CLUSTER_NODES=ebizdb1,ebizdb2 CRS=TRUE

Node addition

===============

prepare the hardware similar to existing nodes

Verify Remote Access / User Equivalence

run $ORA_CRS_HOME/oui/bin/addNode.sh

it will ask to run as root for below script

/ORA_CRS_HOME/install/rootaddnode.sh from running node

/ORA_CRS_HOME/root.sh from new node

oraInstRoot.sh creates inventory location if it's not available.

Although Oracle CRS replaces the Oracle Cluster Manager (ORACM) in Oracle9i RAC, it does continue support for the Global Services Daemon (GSD), which in Oracle9i is responsible for communicating with the Oracle RAC database. In Oracle 10g, GSD's sole purpose is to serve Oracle9i clients (such as SRVCTL, Database Configuration Assistant, and Oracle Enterprise Manager). Financially, this is a very positive benefit since one is not bound to buy new client licenses and hardware to support an Oracle 10g database.

To check interfaces between the RAC nodes

----------------------------------------

$ORA_CRS_HOME/bin/oifcfg iflist

cinohapdbp01sec:/export/users/crslab1 $ $ORA_CRS_HOME/bin/oifcfg iflist

ce0 3.112.44.0

ce2 3.112.44.0

ce3 192.168.22.0

ce3 192.168.21.0

ce5 172.17.0.0

ce7 192.168.24.0

ce7 192.168.23.0

ce7 192.168.25.0

ce9 3.24.138.0

cinohapdbp01sec:/export/users/crslab1 $

3.112.45.199 cinohapdbp01sec cinohapdbp01sec.security.ge.com loghost

3.112.45.205 cinohapdbp01sec-vip cinohapdbp01sec-vip.security.ge.com

192.168.21.1 cinohapdbp01sec-priv

3.112.45.202 cinohapdbp02sec cinohapdbp02sec.security.ge.com

3.112.45.206 cinohapdbp02sec-vip cinohapdbp02sec-vip.security.ge.com

192.168.21.2 cinohapdbp02sec-priv

/usr/bin/netstat -f inet

Logfile Location

-----------------

cd $ORA_CRS_HOME/log/node_name

alert.log

evmd/evmd.log

cssd/cssd.log

crsd/crsd.log

cd racg

ora..vip.log

ora..ons.log

ora..gsd.log

To relocate the crs services

----------------------------

crs_relocate

crs_relocate ora.cinohapdbp02sec.vip

Enable debug for particulare RAC services

-----------------------------------------

crsctl debug log res ':5'

to diable

----------

crsctl debug log res 'ora.cinohapdbp02sec.vip'

cinohapdbp02sec:/export/users/crslab1 $ crsctl help

Usage: crsctl check crs - checks the viability of the Oracle Clusterware

crsctl check cssd - checks the viability of Cluster Synchronization Services

crsctl check crsd - checks the viability of Cluster Ready Services

crsctl check evmd - checks the viability of Event Manager

crsctl check cluster [-node ] - checks the viability of CSS across nodes

crsctl set css - sets a parameter override

crsctl get css - gets the value of a Cluster Synchronization Services parameter

crsctl unset css - sets the Cluster Synchronization Services parameter to its default

crsctl query css votedisk - lists the voting disks used by Cluster Synchronization Services

crsctl add css votedisk - adds a new voting disk

crsctl delete css votedisk - removes a voting disk

crsctl enable crs - enables startup for all Oracle Clusterware daemons

crsctl disable crs - disables startup for all Oracle Clusterware daemons

crsctl start crs [-wait] - starts all Oracle Clusterware daemons

crsctl stop crs [-wait] - stops all Oracle Clusterware daemons. Stops Oracle Clusterware managed resources in case of cluster.

crsctl start resources - starts Oracle Clusterware managed resources

crsctl stop resources - stops Oracle Clusterware managed resources

crsctl debug statedump css - dumps state info for Cluster Synchronization Services objects

crsctl debug statedump crs - dumps state info for Cluster Ready Services objects

crsctl debug statedump evm - dumps state info for Event Manager objects

crsctl debug log css [module:level] {,module:level} ... - turns on debugging for Cluster Synchronization Services

crsctl debug log crs [module:level] {,module:level} ... - turns on debugging for Cluster Ready Services

crsctl debug log evm [module:level] {,module:level} ... - turns on debugging for Event Manager

crsctl debug log res [resname:level] ... - turns on debugging for Event Manager

crsctl debug trace css [module:level] {,module:level} ... - turns on debugging for Cluster Synchronization Services

crsctl debug trace crs [module:level] {,module:level} ... - turns on debugging for Cluster Ready Services

crsctl debug trace evm [module:level] {,module:level} ... - turns on debugging for Event Manager

crsctl query crs softwareversion [] - lists the version of Oracle Clusterware software installed

crsctl query crs activeversion - lists the Oracle Clusterware operating version

crsctl lsmodules css - lists the Cluster Synchronization Services modules that can be used for debugging

crsctl lsmodules crs - lists the Cluster Ready Services modules that can be used for debugging

crsctl lsmodules evm - lists the Event Manager modules that can be used for debugging

If necessary any of these commands can be run with additional tracing by adding a 'trace'

argument at the very front. Example: crsctl trace check css

cinohapdbp02sec:/export/users/crslab1

export _USR_ORA_VIP=3.112.45.205

export _USR_ORA_NETMASK=255.255.252.0

export _USR_ORA_IF=ce2

export _CAA_NAME=cinohapdbp01sec

export _USR_ORA_VIP=3.112.45.206

export _USR_ORA_NETMASK=255.255.252.0

export _USR_ORA_IF=ce2

export _CAA_NAME=cinohapdbp02sec

cinohapdbp02sec # /crslab1/oracle/product/10.2.0/crslab1/bin/racgvip check

cinohapdbp02sec # /crslab1/oracle/product/10.2.0/crslab1/bin/racgvip start

checking the cluster configuration

===================================

cinohapdbp02sec:/crslab1/oracle/product/10.2.0/crslab1/log/cinohapdbp02sec/evmd $ srvctl config nodeapps -n cinohapdbp02sec

VIP exists.: /cinohapdbp02sec-vip/3.112.45.206/255.255.252.0/ce2

GSD exists.

ONS daemon exists.

Listener does not exist.

cinohapdbp02sec:/crslab1/oracle/product/10.2.0/crslab1/log/cinohapdbp02sec/evmd $

Doc ID: 283107.1

-----------------

bash-3.00$ ifconfig -a

lo0: flags=2001000849 mtu 8232 index 1

inet 127.0.0.1 netmask ffffff00

ce0: flags=19040843 mtu 1500 index 2

inet 3.112.45.207 netmask fffffc00 broadcast 3.112.47.255

groupname IPMP0

ce2: flags=29040843 mtu 1500 index 3

inet 3.112.45.208 netmask fffffc00 broadcast 3.112.47.255

groupname IPMP0

ce2:1: flags=21000843 mtu 1500 index 3

inet 3.112.45.199 netmask fffffc00 broadcast 3.112.47.255

ce2:2: flags=21000843 mtu 1500 index 3

inet 3.112.45.201 netmask fffffc00 broadcast 3.112.47.255

ce2:3: flags=21040843 mtu 1500 index 3

inet 3.112.45.205 netmask fffffc00 broadcast 3.112.47.255

ce3: flags=1000843 mtu 1500 index 7

inet 192.168.22.1 netmask ffffff00 broadcast 192.168.22.255

ce3:1: flags=1000843 mtu 1500 index 7

inet 192.168.21.1 netmask ffffff00 broadcast 192.168.21.255

ce5: flags=1000843 mtu 1500 index 4

inet 172.17.7.185 netmask ffffc000 broadcast 172.17.63.255

ce7: flags=1000843 mtu 1500 index 8

inet 192.168.24.1 netmask ffffff00 broadcast 192.168.24.255

ce7:1: flags=1000843 mtu 1500 index 8

inet 192.168.23.1 netmask ffffff00 broadcast 192.168.23.255

ce7:2: flags=1000843 mtu 1500 index 8

inet 192.168.25.1 netmask ffffff00 broadcast 192.168.25.255

ce9: flags=1000843 mtu 1500 index 5

inet 3.24.138.216 netmask fffffe00 broadcast 3.24.139.255

bash-3.00$ hostname

cinohapdbp01sec

cinohapdbp02sec # ifconfig -a

lo0: flags=2001000849 mtu 8232 index 1

inet 127.0.0.1 netmask ff000000

ce0: flags=9040843 mtu 1500 index 2

inet 3.112.45.210 netmask fffffc00 broadcast 3.112.47.255

groupname IPMP0

ether 0:14:4f:74:53:64

ce0:1: flags=1000843 mtu 1500 index 2

inet 3.112.45.202 netmask fffffc00 broadcast 3.112.47.255

ce0:2: flags=1040843 mtu 1500 index 2

inet 3.112.45.206 netmask fffffc00 broadcast 3.112.47.255

ce2: flags=69040843 mtu 1500 index 3

inet 3.112.45.211 netmask fffffc00 broadcast 3.112.47.255

groupname IPMP0

ether 0:14:4f:1f:3d:f8

ce3: flags=1000843 mtu 1500 index 6

inet 192.168.22.2 netmask ffffff00 broadcast 192.168.22.255

ether 0:14:4f:1f:3d:f9

ce3:1: flags=1000843 mtu 1500 index 6

inet 192.168.21.2 netmask ffffff00 broadcast 192.168.21.255

ce5: flags=1000843 mtu 1500 index 4

inet 172.17.7.186 netmask ffffc000 broadcast 172.17.63.255

ether 0:14:4f:1f:3d:fb

ce7: flags=1000843 mtu 1500 index 8

inet 192.168.23.2 netmask ffffff00 broadcast 192.168.23.255

ether 0:14:4f:1f:1c:9d

ce7:1: flags=1000843 mtu 1500 index 8

inet 192.168.25.2 netmask ffffff00 broadcast 192.168.25.255

ce7:2: flags=1000843 mtu 1500 index 8

inet 192.168.24.2 netmask ffffff00 broadcast 192.168.24.255

ce9: flags=1000843 mtu 1500 index 5

inet 3.24.138.214 netmask fffffe00 broadcast 3.24.139.255

ether 0:14:4f:1f:1c:9f

cinohapdbp02sec #

ce2 is standby IP Multipathing (IPMP)

What is the difference between VIP and IPMP ?

================================================

IPMP can failover an address to another interface, but not failover to the other node.

Oracle VIP can failover to another interface on the same node or to another host in the cluster.

cinohapdbp02sec # srvctl modify nodeapps -n cinohapdbp02sec -A cinohapdbp02sec-vip/255.255.252.0/ce2\|ce0

cinohapdbp02sec # srvctl config nodeapps -n cinohapdbp02sec

VIP exists.: /cinohapdbp02sec-vip/3.112.45.206/255.255.252.0/ce2:ce0

GSD exists.

ONS daemon exists.

Listener does not exist.

cinohapdbp02sec # srvctl modify nodeapps -n cinohapdbp01sec -A cinohapdbp01sec-vip/255.255.252.0/ce2\|ce0

cinohapdbp02sec # srvctl start nodeapps -n cinohapdbp02sec

cinohapdbp02sec # srvctl status nodeapps -n cinohapdbp02sec

VIP is running on node: cinohapdbp02sec

GSD is running on node: cinohapdbp02sec

PRKO-2016 : Error in checking condition of listener on node: cinohapdbp02sec

ONS daemon is running on node: cinohapdbp02sec

cinohapdbp02sec #

Srvctl tools to add new db instance

------------------------------------

srvctl add database -d lab1 -o /lab1/oracle/product/10.2.0/lab1

srvctl add instance -d lab1 -i lab11 -n cinohapdbp01sec

srvctl add instance -d lab1 -i lab12 -n cinohapdbp02sec

srvctl setenv instance -d lab1 -i lab11 -t TNS_ADMIN=/lab1/oracle/product/10.2.0/lab1/network/admin/lab11_cinohapdbp01sec

srvctl setenv instance -d lab1 -i lab12 -t TNS_ADMIN=/lab1/oracle/product/10.2.0/lab1/network/admin/lab12_cinohapdbp02sec

3.112.45.199 cinohapdbp01sec cinohapdbp01sec.security.ge.com loghost

3.112.45.205 cinohapdbp01sec-vip cinohapdbp01sec-vip.security.ge.com

192.168.21.1 cinohapdbp01sec-priv

3.112.45.202 cinohapdbp02sec cinohapdbp02sec.security.ge.com loghost

3.112.45.206 cinohapdbp02sec-vip cinohapdbp02sec-vip.security.ge.com

192.168.21.2 cinohapdbp02sec-priv

$ORA_CRS_HOME/bin/oifcfg getif

cinohapdbp01sec $ ./oifcfg getif

ce2 3.112.44.0 global public

ce3 192.168.21.0 global cluster_interconnect

cinohapdbp02sec $ ./oifcfg getif

ce2 3.112.44.0 global public

ce3 192.168.21.0 global cluster_interconnect

from 10201

before cluster instll

cd /lab1/oradata/recovery/arch01/lab11/software/10201_cluster/cluvfy

./runcluvfy.sh comp nodecon -n cinohapdbp01sec,cinohapdbp02sec -verbose

./runcluvfy.sh stage -pre crsinst -n all -verbose

After Cluster install

$ORA_CRS_HOME/bin/cluvfy stage -post crsinst -n cinohapdbp01sec,cinohapdbp02sec

from 11.1.0.6

cd /lab1/oradata/recovery/arch01/lab11/software/11106_software/clusterware

./runcluvfy.sh stage -pre crsinst -n cinohapdbp01sec,cinohapdbp02sec -verbose

before db binray install

./runcluvfy.sh stage -pre dbinst -n cinohapdbp02sec,cinohapdbp01sec -verbose

Before dbca

./runcluvfy.sh stage -pre dbcfg -n cinohapdbp02sec,cinohapdbp01sec -d /lab1/oracle/product/11.1.0/lab1 -verbose

From CRS_HOME

220863/Target!18

TAF

---

Add the below entry in client and db tnsnames.ora

LAB1_TAF =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = cinohapdbp01sec-vip.security.ge.com)(PORT = 1529))

(ADDRESS = (PROTOCOL = TCP)(HOST = cinohapdbp02sec-vip.security.ge.com)(PORT = 1529))

(LOAD_BALANCE = yes)

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = LAB1_TAF)

(FAILOVER_MODE =

(TYPE = SELECT)

(METHOD = BASIC)

(RETRIES = 180)

(DELAY = 5)

)

Check the failover Information

===============================

COLUMN instance_name FORMAT a13

COLUMN host_name FORMAT a9

COLUMN failover_method FORMAT a15

COLUMN failed_over FORMAT a11

SELECT

instance_name

, host_name

, NULL AS failover_type

, NULL AS failover_method

, NULL AS failed_over

FROM v$instance

UNION

SELECT

NULL

, NULL

, failover_type

, failover_method

, failed_over

FROM v$session

WHERE username = 'SYSTEM';

Check Veritas cluster file system installation

modinfo | grep vxfs

cinohapdbp01sec # hastatus -summary

-- SYSTEM STATE

-- System State Frozen

A cinohapdbp01sec RUNNING 0

A cinohapdbp02sec RUNNING 0

-- GROUP STATE

-- Group System Probed AutoDisabled State

B clustmon cinohapdbp01sec Y N ONLINE

B clustmon cinohapdbp02sec Y N OFFLINE

B cvm cinohapdbp01sec Y N ONLINE

B cvm cinohapdbp02sec Y N ONLINE

Check Veritas mount in the system

bash-3.00$ df -F vxfs

/crslab1/oracle (/dev/vx/dsk/cinohapdbp01secdg/crsprod_oracle): 9530832 blocks 148914 files

/lab1/oradata/system01(/dev/vx/dsk/lab1system_dg/system01):82405744 blocks 1287565 files

/lab1/oradata/recovery/redo04/lab12(/dev/vx/dsk/lab1redo_dg/redo04_2):26761600 blocks 418136 files

/lab1/oradata/recovery/redo04/lab11(/dev/vx/dsk/lab1redo_dg/redo04):26462288 blocks 413470 files

/backup/data (/dev/vx/dsk/bkpdata_dg/bkpdata):459034640 blocks 7172416 files

/lab1/oradata/custom01(/dev/vx/dsk/lab1custom_dg/custom01):47658544 blocks 744638 files

/lab1/oradata/data01(/dev/vx/dsk/lab1data_dg/data01):59864944 blocks 935387 files

/lab1/oradata/recovery/redo01/lab11(/dev/vx/dsk/lab1redo_dg/redo01):26446368 blocks 413216 files

/lab1/oradata/recovery/redo01/lab12(/dev/vx/dsk/lab1redo_dg/redo01_2):26705264 blocks 417243 files

/lab1/oradata/data05(/dev/vx/dsk/lab1data_dg/data05):102242272 blocks 1597506 files

/lab1/oradata/data04(/dev/vx/dsk/lab1data_dg/data04):96421840 blocks 1506561 files

/lab1/oradata/recovery/redo02/lab12(/dev/vx/dsk/lab1redo_dg/redo02_2):26705584 blocks 417271 files

/lab1/oradata/recovery/redo02/lab11(/dev/vx/dsk/lab1redo_dg/redo02):24309104 blocks 379828 files

/lab1/oradata/data02(/dev/vx/dsk/lab1data_dg/data02):81311632 blocks 1270493 files

/lab1/oradata/data03(/dev/vx/dsk/lab1data_dg/data03):90363840 blocks 1411904 files

/lab1/oradata/recovery/redo03/lab11(/dev/vx/dsk/lab1redo_dg/redo03):24426528 blocks 381658 files

/lab1/oradata/recovery/redo03/lab12(/dev/vx/dsk/lab1redo_dg/redo03_2):26761600 blocks 418136 files

/ocrvote (/dev/vx/dsk/ocrvotedg1/ocrvotevol): 1956522 blocks 244564 files

/lab1/oracle (/dev/vx/dsk/lab1bin_dg/oracle):14202576 blocks 221888 files

/etc/VRTSvcs/conf/config/main.cf

Online redo log file

--------------------

select l.group#, thread#, member

from v$log l, v$logfile lf

where l.group#=lf.group#;

select THREAD#,GROUPS,INSTANCE,SEQUENCE# from V$thread;

HAPPY LEARNING!

Ajith Pathiyil

Announcement!

Dear Patron(s),

Friday, October 21, 2011

Understand Oracle Locking

Overview

Modes of Locking

Description of each Lock Mode

Data Lock Conversion Versus Lock Escalation

Deadlocks

Referential Integrity Locks (RI Locks)

Blocking Locks

Distributed Transactions

ITL Entry Shortages

Conclusion

Rollback Of Single Transaction

Thursday, October 13, 2011

ORACLE PROCESSES, SESSIONS AND CONNECTIONS

Friday, September 30, 2011

RAC Cheatsheet-2

Blog Folllowing