“ORA-20110: set stamp set count conflict” after upgrading to 12.2 Recovery Catalog

This is the second time we have hit this error after upgrading the recovery catalog to 12.2.

The first is now documented by Oracle as Document 2291791.1, and affects 12.1 multi-tenant instances with DBBP 12.1.0.2.161018 or later applied.
If you have applied that patchset, you can see the problem by checking the contents of file $ORACLE_HOME/rdbms/admin/recover.bsq

  cursor bs(low_recid number, high_recid number) is
    select bs.recid, bs.stamp, bs.set_stamp, bs.set_count, bs.backup_type,
           bs.incremental_level,
           bs.pieces, start_time, completion_time, controlfile_included,
           bs.input_file_scan_only, keep_until,
           decode (bs.keep_options, 'LOGS'      , KEEP_LOGS
                               , 'NOLOGS'       , KEEP_NOLOGS
                               , 'BACKUP_LOGS'  , KEEP_CONSIST
                                                , 0) keep_options,
           bs.block_size, bs.multi_section, bs.guid,
           decode(bs.con_id, invalid_pdbid, 1, 0) dropped_pdb
    from v$backup_set bs, v$containers pdb
    where bs.recid between low_recid and high_recid
      and (bs.stamp >= kccdivts OR bs.recid = high_recid)
      and bs.stamp >= resyncstamp
      and bs.for_xtts != 'YES'
    order by bs.recid;

Note that the join has been missed between v$backup_set and v$containers(!). The next cursor in the source, bp, also has the same ommission.
You can work-around the issue by adding the following condition to the two cursors, but you should probably check with Oracle Support first.

    
 and bs.con_id = pdb.con_id(+) 

The latest issue is on a 11g (hence single tenant obviously) on a dataguard standby site (in this case we take backups from both primary and standby instances).
Tracing the resync of the catalog from the standby site, the issue begins when inserting to table bs (backup sets) in the recovery catalog.
The insert fails because of a violation the unique key BS_U2 on columns db_key, set_stamp and set_count.
Checking the recovery catalog there already is record for this combination of set_stamp and set_count, but from the primary instance.

My suspicion is that the unique key also needs to include site_key field, however I don’t have enough of an understanding of what the set_stamp, set_count combination represents to be sure.

The only reference to the fields I can find come quotes in the documentation such as the following:

The SET_STAMP value from V$BACKUP_SET. SET_STAMP and SET_COUNT form a concatenated key that uniquely identifies this record in the target database control file.

What does seem apparent is that the 12.2 recovery catalog is stricter in enforcing this uniqueness, thus revealing some bugs elsewhere in the codebase.

Instance crash when not restarted through clusterware after patching

We had some issues, fortunately on a test instance, after applying a database
patch. With the help of a colleague (thanks Kei) I was able to identify what had
gone wrong and setup a testcase to demonstrate the issue. I think it is worth
sharing as it reveals some of the complexities of the relationships between
the components of the Oracle stack.

My setup consists of two database instances on the same host configured in Oracle Restart. To start the patching process I stop both instances. Note in particular the group owner of oracle binary is asmadmin.

[oracle@hkexdb01 ~]$ srvctl stop database -db PVJA -o abort
[oracle@hkexdb01 ~]$ srvctl stop database -db PVJB -o abort
[oracle@hkexdb01 ~]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 327575894 Jun 13 14:52 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle

Next I apply database patch (I have omitted some details from the
patch session as they are not relevant). Observe that after patch completes,
oracle binary has been re-linked, and owner is now oinstall.

[oracle@hkexdb01 ~]$ cd ~/25397136
[oracle@hkexdb01 25397136]$ $ORACLE_HOME/OPatch/opatch apply
Oracle Interim Patch Installer version 12.2.0.1.9
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.9
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-06-13_14-58-15PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   24732088  25397136  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying sub-patch '24732088' to OH '/u01/app/oracle/product/12.1.0.2/dbhome_2'

--DETAILS OMITTED

Patching component oracle.rdbms.install.plugins, 12.1.0.2.0...
Composite patch 25397136 successfully applied.
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-06-13_14-58-15PM_1.log

OPatch succeeded.
[oracle@hkexdb01 25397136]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle oinstall 327813870 Jun 13 14:59 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle

After patching is complete, I now restart PVJA outside of
clusterware (note the group owner of oracle binary is still oinstall).

[oracle@hkexdb01 25397136]$ export ORACLE_SID=PVJA
[oracle@hkexdb01 25397136]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Tue Jun 13 15:00:54 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SYS@PVJA> startup
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size                  2926472 bytes
Variable Size            1564297336 bytes
Database Buffers          570425344 bytes
Redo Buffers                9834496 bytes
Database mounted.
Database opened.
SYS@PVJA> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
[oracle@hkexdb01 25397136]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle oinstall 327813870 Jun 13 14:59 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle

I then restart PVJB instance through clusterware. Note that clusterware
updates the group owner of the oracle binary.

srvctl start database -db PVJB 
ls -alrt $ORACLE_HOME/bin/oracle

[oracle@hkexdb01 25397136]$ srvctl start database -db PVJB 
[oracle@hkexdb01 25397136]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 327813870 Jun 13 14:59 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle
[oracle@hkexdb01 25397136]$ 

This change in group ownership ‘breaks’ the PVJA instance. There are lots of errors in the alert log.

Errors in file /u01/app/oracle/diag/rdbms/pvja/PVJA/trace/PVJA_j000_64968.trc:
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1001 (oinstall), current egid = 1006 (asmadmin)

Note the clusterware log seems to indicate the point at which this action took
place.

2017-06-13 14:55:47.233925 :CLSDYNAM:4045649664: [ora.pvjb.db]{1:31716:24755} [start] Utils:execCmd action = 1 flags = 6 ohome = /u01/app/12.1.0.2/grid cmdname = setasmgidwrap

This behaiour seems to match Bug 9784037 : SETASMGID CAUSING ORA-27303, which
was closed as “Not a Bug”. The provided workaround is “always remember to execute setasmgidwrap after doing relink all / patch apply in RDBMS Oracle Home”.
However I think just making sure that all instances are restarted through
clusterware is probably a better way to handle this.

Update, seems Frits Hoogland has run into this issue too, I now feel a bit better about my ‘mistake’.

OPatch 12.2.0.1.9 change in behaviour patching RAC nodes

We’ve had issues in the past patching the Oracle Homes on our Exadata nodes due to the fact that they were defined as part of a RAC cluster.
We don’t use RAC, so we have made sure we don’t encounter such problems in the past by unlinking them as described in the documentation.

However I have maintained a separate pair of linked Oracle homes for testing, and I have a reproducible test case of OPatch getting into a mess under certain conditions.
Running this testcase, I have noticed some change in behavior in this area with latest version of OPatch 12.2.0.1.9 compared to previous version 12.2.0.1.8.

Review the following testcase using OPatch 12.2.0.1.8. A few things to note.
First by default patches are applied to both nodes (this can be overridden by specifying -local flag).
Second the confirmation prompt regarding the remote node takes place after the local node has been patched.

[oracle@hkexdb01 22652097]$ $ORACLE_HOME/OPatch.backup/opatch apply
Oracle Interim Patch Installer version 12.2.0.1.8
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.8
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-36-03PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   22652097  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

This node is part of an Oracle Real Application Cluster.
Remote nodes: 'hkexdb02' 
Local node: 'hkexdb01'
Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '22652097' to OH '/u01/app/oracle/product/12.1.0.2/dbhome_2'

Patching component oracle.rdbms, 12.1.0.2.0...

Patching component oracle.rdbms.dbscripts, 12.1.0.2.0...
Skip copying to "/u01/app/oracle/product/12.1.0.2/dbhome_2/rdbms/admin/catfusrg.sql" because it is the same as 
the file in incoming patch "/home/oracle/patchtest/22652097/files/rdbms/admin/catfusrg.sql"

Patching in rolling mode.


The node 'hkexdb02' will be patched next.


Please shutdown Oracle instances running out of this ORACLE_HOME on 'hkexdb02'.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')

Is the node ready for patching? [y|n]
y
User Responded with: Y
Updating nodes 'hkexdb02' 
   Apply-related files are:
     FP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_files.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_files.txt" with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_dirs.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_dirs.txt" with actual path.
Propagating directories to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/make_cmds.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/make_cmds.txt" with actual path.
Running command on remote node 'hkexdb02': 
cd /u01/app/oracle/product/12.1.0.2/dbhome_2/rdbms/lib; /usr/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_2 || echo REMOTE_MAKE_FAILED::>&2 


The node 'hkexdb02' has been patched.  You can restart Oracle instances on it.


Finished patching in rolling mode.

Patch 22652097 successfully applied.
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-36-03PM_1.log

OPatch succeeded.
[oracle@hkexdb01 22652097]$ 

Compare that with an attempt to apply same patch using OPatch 12.2.0.1.9 using the same command; by default, only the local node will be patched.

[oracle@hkexdb01 22652097]$ $ORACLE_HOME/OPatch/opatch apply       
Oracle Interim Patch Installer version 12.2.0.1.9
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.9
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-39-06PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   22652097  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Is the local system ready for patching? [y|n]
n
User Responded with: N
NApply exits on request
UtilSession: NApply exits on user's request.
UtilSession system modification phase did not start: NApply exits on request
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-39-06PM_1.log

OPatch stopped on request.
[oracle@hkexdb01 22652097]$ 

To get 12.2.0.1.9 to patch remote nodes as well, -all_nodes switch must be provided, as demonstrated below.  Observe also that confirmation regards patching all nodes takes place before patching any of them.

[oracle@hkexdb01 22652097]$ $ORACLE_HOME/OPatch/opatch apply -all_nodes
Oracle Interim Patch Installer version 12.2.0.1.9
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.9
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-40-05PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   22652097  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

This node is part of an Oracle Real Application Cluster.
Remote nodes: 'hkexdb02' 
Local node: 'hkexdb01'
Please shut down Oracle instances running out of this ORACLE_HOME on all the nodes.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Are all the nodes ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '22652097' to OH '/u01/app/oracle/product/12.1.0.2/dbhome_2'

Patching component oracle.rdbms, 12.1.0.2.0...

Patching component oracle.rdbms.dbscripts, 12.1.0.2.0...

Patching in all-node mode.

Updating nodes 'hkexdb02' 
   Apply-related files are:
     FP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_files.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_files.txt" with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_dirs.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_dirs.txt" with actual path.
Propagating directories to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/make_cmds.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/make_cmds.txt" with actual path.
Running command on remote node 'hkexdb02': 
cd /u01/app/oracle/product/12.1.0.2/dbhome_2/rdbms/lib; /usr/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_2 || echo REMOTE_MAKE_FAILED::>&2 


All nodes have been patched.  You may start Oracle instances on the local system and nodes 'hkexdb02' 

Patch 22652097 successfully applied.
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-40-05PM_1.log

OPatch succeeded.
[oracle@hkexdb01 22652097]$ 

I think the change to prompt for confirmation before patching any nodes is an improvement.  Regarding the change of defaults for patching the local node vs patching all nodes, this doesn’t affect us (as we’ve unlinked our homes) but the change might cause some issues for some.

Update 2017-05-26, this change in behaviour is documented in the following MOS note:

OPatch: Behavior Changes starting in OPatch 12.2.0.1.9 and 11.2.0.3.16 releases (Doc ID 2232156.1)

Changed READ ANY TABLE audit behavior in 12.2

We have noticed a change in the behavior of READ ANY TABLE auditing in 12.2.

The following initial setup is performed as SYS:

SQL> AUDIT
  2      SELECT ANY TABLE;

Audit succeeded.

SQL>
SQL> AUDIT
  2      READ ANY TABLE;

Audit succeeded.

SQL>
SQL> CREATE USER tst_table_owner IDENTIFIED BY tst_table_owner
  2      ACCOUNT LOCK;

User TST_TABLE_OWNER created.

SQL>
SQL> CREATE USER tst_view_owner IDENTIFIED BY tst_view_owner
  2      ACCOUNT LOCK;

User TST_VIEW_OWNER created.

SQL>
SQL> GRANT
  2      READ ANY TABLE
  3  TO tst_view_owner WITH ADMIN OPTION;

Grant succeeded.

SQL>
SQL> GRANT
  2      SELECT ANY TABLE
  3  TO tst_view_owner WITH ADMIN OPTION;

Grant succeeded.

SQL>
SQL> CREATE USER tst_login IDENTIFIED BY tst_login;

User TST_LOGIN created.

SQL>
SQL> GRANT
  2      CREATE SESSION
  3  TO tst_login;

Grant succeeded.

SQL>
SQL> CREATE TABLE tst_table_owner.tst_table
  2      AS
  3          SELECT
  4              *
  5          FROM
  6              dual
  7          WHERE
  8              1 = 0;

Table TST_TABLE_OWNER.TST_TABLE created.

SQL>
SQL> CREATE VIEW tst_view_owner.tst_view AS
  2      SELECT
  3          *
  4      FROM
  5          tst_table_owner.tst_table;

View TST_VIEW_OWNER.TST_VIEW created.

SQL>
SQL> GRANT SELECT ON tst_view_owner.tst_view TO tst_login;

Grant succeeded.

SQL>
SQL> GRANT SELECT ON dba_capture TO tst_login;

Grant succeeded.

The following select statements are then run as newly created tst_login user:

SELECT
    COUNT(*)
FROM
    tst_view_owner.tst_view;

SELECT
    COUNT(*)
FROM
    dba_capture;

Querying the audit trail in 12.1 shows no entries, however in 12.2:

SQL> SELECT
  2      db_user,
  3      object_schema,
  4      object_name,
  5      sql_text,
  6      priv_used
  7  FROM
  8      dba_common_audit_trail
  9  WHERE
 10          extended_timestamp > SYSDATE - 2 / 24 / 60
 11      AND
 12          statement_type = 'SELECT';

DB_USER    OBJECT_SCHEMA    OBJECT_NAME               SQL_TEXT  PRIV_USED

TST_LOGIN  TST_TABLE_OWNER  TST_TABLE                           READ ANY TABLE
TST_LOGIN  TST_VIEW_OWNER   TST_VIEW
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              STREAMS$_APPLY_PROCESS
TST_LOGIN  SYS              STREAMS$_APPLY_MILESTONE
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              STREAMS$_APPLY_PROCESS
TST_LOGIN  SYS              STREAMS$_APPLY_MILESTONE
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              STREAMS$_CAPTURE_PROCESS
TST_LOGIN  SYS              _DBA_CAPTURE
TST_LOGIN  SYSTEM           LOGMNR_SESSION$                     READ ANY TABLE
TST_LOGIN  SYS              DBA_LOGMNR_SESSION
TST_LOGIN  SYS              USER$
TST_LOGIN  SYS              _SXGG_DBA_CAPTURE
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              STREAMS$_CAPTURE_PROCESS
TST_LOGIN  SYS              _DBA_CAPTURE
TST_LOGIN  SYSTEM           LOGMNR_SESSION$                     READ ANY TABLE
TST_LOGIN  SYS              DBA_LOGMNR_SESSION
TST_LOGIN  SYS              USER$
TST_LOGIN  SYS              _SXGG_DBA_CAPTURE
TST_LOGIN  SYS              DBA_CAPTURE

34 rows selected.

It appears that READ ANY TABLE is audited when querying a view, if the privilege is required by  the view owner to access the base table.

Oracle support have now confirmed that this is not the expected behavior, and have created Bug 26035911 : AUDIT RECORDS GENERATED EVEN WHEN THE SYSTEM PRIVILEGE IS NOT EXERCISED IN 12.2

12cR2 Incrementally Updated Backups and dropped datafiles

We have just noticed a difference in behavior in 12cR2 with regards to image copies being marked as obsolete after the backup is updated past the drop of a datafile.
I won’t describe the feature itself, if necessary you can read up at oracle-base or the Oracle documentation.

First review output from the testcase on a 12.1 instance, note that after dropping the datafile, and updating the backup past this point, the datafile copy is marked as obsolete:

RMAN> show all;

RMAN configuration parameters for database with db_unique_name ADAPTEST are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE RMAN OUTPUT TO KEEP FOR 7 DAYS; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/snapcf_ADAPTEST.f'; # default

RMAN> create tablespace test datafile size 1m autoextend off;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 2017-04-19 12:47:09
using channel ORA_DISK_1
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 6 found
no parent backup or copy of datafile 5 found
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=+DATAC1/ADAPTEST/DATAFILE/system.401.941128155
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSTEM_FNO-1_0hs2302d tag=TEST_OBSOLETE RECID=19 STAMP=941719632
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=+DATAC1/ADAPTEST/DATAFILE/sysaux.397.941128123
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSAUX_FNO-3_0is2302k tag=TEST_OBSOLETE RECID=20 STAMP=941719639
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+DATAC1/ADAPTEST/DATAFILE/undotbs1.403.941128201
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-UNDOTBS1_FNO-4_0js2302r tag=TEST_OBSOLETE RECID=21 STAMP=941719644
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile file number=00006 name=+DATAC1/ADAPTEST/DATAFILE/users.399.941128199
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-USERS_FNO-6_0ks2302s tag=TEST_OBSOLETE RECID=22 STAMP=941719644
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile file number=00005 name=+DATAC1/ADAPTEST/DATAFILE/test.1051.941719625
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-TEST_FNO-5_0ls2302t tag=TEST_OBSOLETE RECID=23 STAMP=941719645
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 2017-04-19 12:47:26

Starting Control File and SPFILE Autobackup at 2017-04-19 12:47:26
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-00 comment=NONE
Finished Control File and SPFILE Autobackup at 2017-04-19 12:47:27

RMAN> drop tablespace test including contents and datafiles;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 2017-04-19 12:47:41
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATAC1/ADAPTEST/DATAFILE/system.401.941128155
input datafile file number=00003 name=+DATAC1/ADAPTEST/DATAFILE/sysaux.397.941128123
input datafile file number=00004 name=+DATAC1/ADAPTEST/DATAFILE/undotbs1.403.941128201
input datafile file number=00006 name=+DATAC1/ADAPTEST/DATAFILE/users.399.941128199
channel ORA_DISK_1: starting piece 1 at 2017-04-19 12:47:41
channel ORA_DISK_1: finished piece 1 at 2017-04-19 12:47:42
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1 tag=TEST_OBSOLETE comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2017-04-19 12:47:42

Starting Control File and SPFILE Autobackup at 2017-04-19 12:47:42
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-01 comment=NONE
Finished Control File and SPFILE Autobackup at 2017-04-19 12:47:43

RMAN> recover copy of database with tag 'test_obsolete';

Starting recover at 2017-04-19 12:47:55
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile copies to recover
recovering datafile copy file number=00001 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSTEM_FNO-1_0hs2302d
recovering datafile copy file number=00003 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSAUX_FNO-3_0is2302k
recovering datafile copy file number=00004 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-UNDOTBS1_FNO-4_0js2302r
recovering datafile copy file number=00006 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-USERS_FNO-6_0ks2302s
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1
channel ORA_DISK_1: piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1 tag=TEST_OBSOLETE
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished recover at 2017-04-19 12:47:56

Starting Control File and SPFILE Autobackup at 2017-04-19 12:47:56
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-02 comment=NONE
Finished Control File and SPFILE Autobackup at 2017-04-19 12:47:57

RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of obsolete backups and copies
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Datafile Copy        23     2017-04-19 12:47:25 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-TEST_FNO-5_0ls2302t
Backup Set           7      2017-04-19 12:47:26
  Backup Piece       7      2017-04-19 12:47:26 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-00
Backup Set           8      2017-04-19 12:47:42
  Backup Piece       8      2017-04-19 12:47:42 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1
Backup Set           9      2017-04-19 12:47:42
  Backup Piece       9      2017-04-19 12:47:42 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-01

Compare this with output from the same testcase on 12.2 instance, the datafile copy is not marked as obsolete:

RMAN> show all;

RMAN configuration parameters for database with db_unique_name PVJTEST are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE RMAN OUTPUT TO KEEP FOR 7 DAYS; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/12.2.0.1/dbhome/dbs/snapcf_PVJTEST.f'; # default

RMAN> create tablespace test datafile size 1m autoextend off;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 19-APR-17
using channel ORA_DISK_1
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 7 found
no parent backup or copy of datafile 5 found
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/u02/oradata/PVJTEST/sysaux01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSAUX_FNO-3_0vs2304k tag=TEST_OBSOLETE RECID=36 STAMP=941719708
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/u02/oradata/PVJTEST/system01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSTEM_FNO-1_10s23053 tag=TEST_OBSOLETE RECID=37 STAMP=941719720
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/u02/oradata/PVJTEST/undotbs01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-UNDOTBS1_FNO-4_11s2305a tag=TEST_OBSOLETE RECID=38 STAMP=941719724
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting datafile copy
input datafile file number=00007 name=/u02/oradata/PVJTEST/users01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-USERS_FNO-7_12s2305d tag=TEST_OBSOLETE RECID=39 STAMP=941719726
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile file number=00005 name=/u02/oradata/PVJTEST/datafile/o1_mf_test_dhfv0gk3_.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-TEST_FNO-5_13s2305f tag=TEST_OBSOLETE RECID=40 STAMP=941719727
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 19-APR-17

Starting Control File and SPFILE Autobackup at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/c-2122366327-20170419-0e comment=NONE
Finished Control File and SPFILE Autobackup at 19-APR-17

RMAN> drop tablespace test including contents and datafiles;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 19-APR-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=/u02/oradata/PVJTEST/sysaux01.dbf
input datafile file number=00001 name=/u02/oradata/PVJTEST/system01.dbf
input datafile file number=00004 name=/u02/oradata/PVJTEST/undotbs01.dbf
input datafile file number=00007 name=/u02/oradata/PVJTEST/users01.dbf
channel ORA_DISK_1: starting piece 1 at 19-APR-17
channel ORA_DISK_1: finished piece 1 at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1 tag=TEST_OBSOLETE comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:15
Finished backup at 19-APR-17

Starting Control File and SPFILE Autobackup at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/c-2122366327-20170419-0f comment=NONE
Finished Control File and SPFILE Autobackup at 19-APR-17

RMAN> recover copy of database with tag 'test_obsolete';

Starting recover at 19-APR-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile copies to recover
recovering datafile copy file number=00001 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSTEM_FNO-1_10s23053
recovering datafile copy file number=00003 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSAUX_FNO-3_0vs2304k
recovering datafile copy file number=00004 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-UNDOTBS1_FNO-4_11s2305a
recovering datafile copy file number=00007 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-USERS_FNO-7_12s2305d
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1
channel ORA_DISK_1: piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1 tag=TEST_OBSOLETE
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
Finished recover at 19-APR-17

Starting Control File and SPFILE Autobackup at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/c-2122366327-20170419-10 comment=NONE
Finished Control File and SPFILE Autobackup at 19-APR-17

RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of obsolete backups and copies
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set           22     19-APR-17         
  Backup Piece       22     19-APR-17          /u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1

Bind Sensitivity and PL/SQL cursor caching

I have been investigating a PL/SQL process which was failing because a non-optimal plan was being used due to a combination of Bind Variable Peeking and data skew.
I put together a reproducible test case executing the problem SQL from SQLPlus, and found that addition of a /*+bind_aware*/ hint to inform CBO that this query was bind sensitive seemed to resolve the issue.

However retesting via a PL/SQL showed that the plan being used was still that of the first bind variables encountered.

I have simplified the testcase to more easily reproduce the behavior I was seeing (database version 12.1.0.2):

create table skew_table nologging as
   select rownum id,  
        case mod(rownum,  10000) when 0 then 0 else 1 end c10000,
        rpad('X', 255, 'X') padding
        from dual
        connect by level <= 1e6; create index skew_index on skew_table(c10000); exec dbms_stats.gather_table_stats(NULL, 'SKEW_TABLE', METHOD_OPT=>'FOR ALL COLUMNS SIZE SKEWONLY');

create or replace procedure get_skew(p in number)
is 
   dummy number;
begin
   select /*+ bind_aware sktest */ count(*) INTO dummy FROM skew_table where c10000 = p;
end;
/

declare
   dummy number;
begin
   get_skew(0);
   get_skew(1);
end;   
/

Checking v$sql we can see only one plan was generated:

select child_number, executions, parse_calls, is_bind_sensitive, is_bind_aware  from v$sql where sql_id = '1rg2w46daksr4';  

CHILD_NUMBER EXECUTIONS PARSE_CALLS I I                                                                                                                             
------------ ---------- ----------- - -                                                                                                                             
           0          2           1 Y Y    

I guessed that PL/SQL cursor caching was resulting in even the soft parse being skipped, so disabled this optimization in the calling code:

declare
   dummy number;
begin
   execute immediate 'ALTER SESSION SET session_cached_cursors = 0';
   get_skew(0);
   get_skew(1);
   execute immediate 'ALTER SESSION RESET session_cached_cursors';
end;   
/

You can see now that the cursor is re-parsed, and that appropriate plans based on bind variables is now taking place correctly:

SQL> select child_number, executions, parse_calls, is_bind_sensitive, is_bind_aware  from v$sql where sql_id = '1rg2w46daksr4';                                     
                                                                                                                                                                    
CHILD_NUMBER EXECUTIONS PARSE_CALLS I I                                                                                                                             
------------ ---------- ----------- - -                                                                                                                             
           0          3           3 Y Y                                                                                                                             
           1          1           0 Y Y                                                                                                                             

Update: while this was interesting to investigate, for the live issue I have simply locked in a plan that is acceptable (if not optimal) for all values.

Oracle Connection Manager (cman) registration_invited_nodes and registration_excluded_nodes

We make use Oracle Connection Manager to control client access to our database servers, it’s pretty cool technology that I’m always surprised how few people are aware of its existence.

Our installed version is 12.1, and this is unable to communicate with our newly upgraded 12.2 instance, so I have been working through installing the latest version.  On starting the instance up, I noticed that no services were being registered, and the log contained multiple entries like the following:

Listener(VNCR option 1) rejected Registration request from destination
23-MAR-2017 14:37:37 * service_register_NSGR * 1182
TNS-01182: Listener rejected registration of service ""

This functionality is described here:

Starting with this release, valid node checking for registration is enabled by default in Oracle Connection Manager. By default, only the databases on local node are allowed to register with Oracle Connection Manager. The databases on remote node can be permitted or denied to register with Oracle Connection Manager through the use of parameters REGISTRATION_INVITED_NODES and REGISTRATION_EXCLUDED_NODES.

Check out the examples for these parameters in the documentation:

REGISTRATION_EXCLUDED_NODES = (10.1.26.*, 10.16.40.0/24, \
                                       2001:DB8:3eff:fe38, node2)
REGISTRATION_INVITED_NODES = (10.1.35.*, 10.1.34.0/24, \
                                       2001:DB8:fe38:7303, node1)

Based on that example I updated the cman.ora configuration to include a valid node. Note that 255.255.255.255 is just an example for testing. Note also that I have configured cman to listen on a non-default port, 999, rather than the default of 1521.

cman_vmhklftdbocmdv2.alpha.lftltd.net =
(configuration=
  (address=(protocol=tcp)(host=vmhklftdbocmdv2.alpha.lftltd.net)(port=999))
  (parameter_list =
    (registration_invited_nodes = (255.255.255.255))
    (connection_statistics=yes)
    (log_level=user)
    (trace_level=user)
    (trace_filelen=1000)
    (trace_fileno=3)
    (idle_timeout=14400)
  )
  (rule_list=
    (rule= (src=vmhklftdbocmdv2.alpha.lftltd.net)(dst=::1)(srv=cmon)(act=accept))
     )
)

Note when I run connection manager, it has picked up the default port of 1521, rather than 999 which I specified:

-bash-4.1$ cmctl

CMCTL for Linux: Version 12.2.0.1.0 - Production on 23-MAR-2017 16:08:03

Copyright (c) 1996, 2016, Oracle.  All rights reserved.

Welcome to CMCTL, type "help" for information.

CMCTL> administer
Current instance CMAN_vmhklftdbocmdv2.alpha.lftltd.net is not yet started
Connections refer to (ADDRESS=(PROTOCOL=TCP)(HOST=vmhklftdbocmdv2)(PORT=1521)).
The command completed successfully.
CMCTL:CMAN_vmhklftdbocmdv2.alpha.lftltd.net> 

Basically what has happened is that cman has parsed the cman.ora file, and it doesn’t like the invited_nodes list to be in parentheses. Rather than throwing out an error it just continues with default values.
I remove the brackets, and adding an extra IP address just for fun:

    (registration_invited_nodes = 1.1.1.1, 255.255.255.255)

When connecting now, observe that the settings from the parameter file are being used (note the non-default port, 999):

-bash-4.1$ cmctl

CMCTL for Linux: Version 12.2.0.1.0 - Production on 23-MAR-2017 16:12:16

Copyright (c) 1996, 2016, Oracle.  All rights reserved.

Welcome to CMCTL, type "help" for information.

CMCTL> administer
Current instance CMAN_vmhklftdbocmdv2.alpha.lftltd.net is not yet started
Connections refer to (DESCRIPTION=(address=(protocol=tcp)(host=vmhklftdbocmdv2.alpha.lftltd.net)(port=999))).
The command completed successfully.
CMCTL:CMAN_vmhklftdbocmdv2.alpha.lftltd.net> 

I observed the same behavior with REGISTRATION_EXCLUDED_NODES.
While I was in the area, I also noticed that the described default value for VALID_NODE_CHECKING_REGISTRATION appears to be wrong. The documentation specified that the default value is off, which contradicts the previously highlighted section of the documentation, and my own testing.

I’m off to file a couple of documentation bugs…

Example of using VPD with an Application Context

SQL> CREATE OR REPLACE PROCEDURE vpdtest_context_procedure(object_type IN VARCHAR2)                                                                                                           
  2  IS                                                                                                                                                                                       
  3  BEGIN                                                                                                                                                                                    
  4     DBMS_SESSION.SET_CONTEXT('vpdtest_context', 'object_type', object_type);                                                                                                              
  5  END;                                                                                                                                                                                     
  6  /                                                                                                                                                                                        
                                                                                                                                                                                              
Procedure created.                                                                                                                                                                            
                                                                                                                                                                                              
SQL>                                                                                                                                                                                          
SQL> CREATE CONTEXT vpdtest_context using vpdtest_context_procedure;                                                                                                                          
                                                                                                                                                                                              
Context created.                                                                                                                                                                              
                                                                                                                                                                                              
SQL>                                                                                                                                                                                          
SQL> CREATE TABLE vpdtest_table AS SELECT * FROM dba_objects;                                                                                                                                 
                                                                                                                                                                                              
Table created.                                                                                                                                                                                
                                                                                                                                                                                              
SQL>                                                                                                                                                                                          
SQL> BEGIN                                                                                                                                                                                    
  2     DBMS_RLS.ADD_POLICY(object_name => 'vpdtest_table',                                                                                                                                   
  3                         policy_name => 'vpdtest_policy',                                                                                                                                  
  4                         policy_function => 'vpdtest_policy_function');                                                                                                                    
  5  END;                                                                                                                                                                                     
  6  /                                                                                                                                                                                        
                                                                                                                                                                                              
PL/SQL procedure successfully completed.                                                                                                                                                      
                                                                                                                                                                                              
SQL>                                                                                                                                                                                          
SQL> CREATE OR REPLACE FUNCTION vpdtest_policy_function(schema VARCHAR2,                                                                                                                      
  2                                                     tab VARCHAR2) RETURN VARCHAR2                                                                                                         
  3  IS                                                                                                                                                                                       
  4  BEGIN                                                                                                                                                                                    
  5     RETURN 'OBJECT_TYPE = sys_context( ''vpdtest_context'', ''object_type'' )';                                                                                                           
  6  END;                                                                                                                                                                                     
  7  /                                                                                                                                                                                        
                                                                                                                                                                                              
Function created.                                                                                                                                                                             
                                                                                                                                                                                              
SQL>                                                                                                                                                                                          
SQL> EXEC vpdtest_context_procedure('TABLE');                                                                                                                                                 
                                                                                                                                                                                              
PL/SQL procedure successfully completed.                                                                                                                                                      
                                                                                                                                                                                              
SQL> SELECT DISTINCT object_type FROM vpdtest_table;                                                                                                                                          
                                                                                                                                                                                              
OBJECT_TYPE                                                                                                                                                                                   
-----------------------                                                                                                                                                                       
TABLE                                                                                                                                                                                         
                                                                                                                                                                                              
SQL>                                                                                                                                                                                          
SQL> EXEC vpdtest_context_procedure('VIEW');                                                                                                                                                  
                                                                                                                                                                                              
PL/SQL procedure successfully completed.                                                                                                                                                      
                                                                                                                                                                                              
SQL> SELECT DISTINCT object_type FROM vpdtest_table;                                                                                                                                          
                                                                                                                                                                                              
OBJECT_TYPE                                                                                                                                                                                   
-----------------------                                                                                                                                                                       
VIEW                                                                                                                                                                                          

“enq: TX – allocate ITL entry” wait event on index block

Over the weekend, some processes on one of our databases locked up.  Support killed some sessions to get things back running, but on Monday I wanted to try to drill into what had occurred.

Querying dba_hist_active_sess_history I identified most problem sessions were waiting on the event “enq: TX – allocate ITL entry”.

For more details about what this event is, and why it happens, Arup Nanda has a good blog post (with pictures) about why it occurs.  In summary if a session wants to update a block, but there is no  remaining space in the block to create a new ITL entry, then it has to wait for one of the sessions currently holding an existing entry to commit (or rollback).

dba_hist_active_sess_history also identified, via the current_obj# column, that the object involved was  an index on a STATUS field, and I used the current_file#, and current_block# from one of the records to identify one of the blocks involved.  Below are some extracts from the block dump:

 seg/obj: 0x4bf2e  csc: 0x49.7a33a31d  itc: 169  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x70490 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0069.004.00000004  0x00c01ce8.0001.41  C---    0  scn 0x0049.78de4f73
0x02   0x012a.000.00000004  0x00c0f996.0002.22  --U-    1  fsc 0x000d.7a33aa37
0x03   0x016c.003.0000000b  0x00c0584a.0010.07  --U-    1  fsc 0x000d.7a35cd79
0x04   0x0144.00e.0000000b  0x00c0e3fa.0004.2e  --U-    1  fsc 0x000d.7a33aa24
0x05   0x0130.00c.00000005  0x00c033cf.0000.34  C---    0  scn 0x0049.7a2ec617
0x06   0x0005.009.0006b8f0  0x00c0f190.8b4b.1c  --U-    1  fsc 0x000d.7a33a967
0x07   0x0160.011.00000004  0x00c04e80.0001.0c  C---    0  scn 0x0049.7a2ec71a
0x08   0x0115.002.00000005  0x00c03082.0001.02  --U-    1  fsc 0x000d.7a33a9f8
0x09   0x0028.001.0000000e  0x00c0e93b.0010.01  C---    0  scn 0x0049.7a2eca94
0x0a   0x00d4.005.00000005  0x00c0275d.0001.1e  --U-    1  fsc 0x000d.7a33aa49
0x0b   0x014c.005.00000007  0x00c0ee84.0003.25  C---    0  scn 0x0049.7a2eb3fb
0x0c   0x00a4.010.0000000b  0x00c0e6ad.0010.40  --U-    1  fsc 0x000d.7a35ce41
...
0xa6   0x000f.015.0006bd01  0x00c25946.7676.21  --U-    1  fsc 0x000d.7a33a763
0xa7   0x0025.002.00000011  0x00c051b2.0011.17  --U-    1  fsc 0x000d.7a33a711
0xa8   0x0151.004.00000004  0x00c04afe.0000.09  --U-    1  fsc 0x000d.7a33a8c4
0xa9   0x00fc.01d.00000003  0x00c02b51.0001.0e  --U-    1  fsc 0x000d.7a33a785
...
row#0[1109] flag: ---D---, lock: 169, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 8b 00 5a
row#1[1098] flag: ---D---, lock: 164, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 8b 00 5b
row#2[1087] flag: ---D---, lock: 75, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 8b 00 5c
row#3[3969] flag: ---DS--, lock: 0, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 93 00 05
row#4[1307] flag: ---D---, lock: 155, len=11
col 0; len 1; (1):  30
col 1; len 6; (6):  00 09 c8 9e 00 51
...
row#173[2330] flag: -------, lock: 0, len=11
col 0; len 1; (1):  35
col 1; len 6; (6):  00 07 04 8c 00 14
row#174[2319] flag: -------, lock: 0, len=11
col 0; len 1; (1):  35
col 1; len 6; (6):  00 07 04 8c 00 15
row#175[2308] flag: -------, lock: 0, len=11
col 0; len 1; (1):  35
col 1; len 6; (6):  00 07 04 8c 00 16

In particular note that the block contains 175 index entries and 169 ITL entries.  This makes me suspect that 168 transactions were accessing the block simultaneously (one ITL entry is reserved by Oracle for leaf node splits).

Time to look at the process and understand what is occurring.  A highly simplified version of the process is provided below:

INSERT RECORD INTO EVENT_TABLE STATUS 0
DO SOME STUFF
UPDATE RECORD SET STATUS = 9
DO SOME MORE STUFF
COMMIT

As this is process is provided via a web-service, many sessions can be performing this task at the same time.  At this point I had a pretty good idea about what was occurring:

  • Many sessions each insert a row into EVENT_TABLE at the same time
  • Index entries for all these new records are likely to be in the same block (the index is on status, all these rows have the same value for status).
  • Each session will need to grab an ITL entry from this same index block to perform the update (note index entry update is really done as a delete and then an insert)
  • Other sessions will need to wait until ITL entries are released

Based on this hypothesis, time to build a test case:

CREATE TABLE ITL_TEST_TABLE (ID NUMBER PRIMARY KEY,
                             STATUS VARCHAR2(2 CHAR) DEFAULT '0');
CREATE SEQUENCE IT_TEST_SEQUENCE;
CREATE INDEX ITL_TEST_STATUS ON ITL_TEST_TABLE(STATUS);

CREATE OR REPLACE PROCEDURE ITL_TEST_PROC(sleep number) 
AS
   INSERTED_ID NUMBER;
BEGIN
   INSERT INTO ITL_TEST_TABLE (ID, STATUS) VALUES (IT_TEST_SEQUENCE.NEXTVAL, NULL) RETURNING ID INTO INSERTED_ID;
   UPDATE ITL_TEST_TABLE SET STATUS = '9' WHERE ID = INSERTED_ID;
   DBMS_LOCK.SLEEP(sleep);
   COMMIT;
END;   
/
>cat test_itl.sql
exec itl_test_proc(60);
exit;
>cat test_itl.sh:
#!/bin/bash
for i in {1..200}; do 
   nohup sqlplus / as sysdba @test_itl &
done;

When I execute test_itl.sh and query v$session I can see there are 32 sessions waiting on ‘enq: TX – allocate ITL entry’.  Exactly the same as the real incident, 168 of these sessions can run this code simultaneously, the rest are blocked.

I am following up with the developers about resolving the issue, some of my ideas are:

  • Initially create record with status NULL, so initial index entry will not be created.
  • Optimize the logic in process so that transactions do not hold uncommitted data for so long
  • Hold off on creating row until end of process until status is known, and then just inserting rather than inserting and then updating.

Creating SQL Plan Baseline from a different SQL statement

Really just documenting this for myself because I couldn’t easily find this on the web.  Most of the steps are taken from this page:

http://askdba.org/weblog/2011/12/plan-stability-using-sql-profiles-and-sql-plan-management/

Observe the following query doing a full table scan:

select * from table(dbms_xplan.display_cursor('8jng4anj19km4',
                                               format=>'peeked_binds'));

SELECT COUNT(1) FROM TOTS_QUE WHERE OTQUE_MODULE = :B3 AND OTQUE_DOCTYP 
= :B2 AND OTQUE_DOCRUNNUM = :B1 AND NVL(OTQUE_ERROR,'N') = 'N'

SQL_ID  8jng4anj19km4, child number 0
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| TOTS_QUE |     1 |    16 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
   1 - :B3 (VARCHAR2(30), CSID=871): 'PR'
   2 - :B2 (VARCHAR2(30), CSID=871): 'PR'
   3 - :B1 (VARCHAR2(30), CSID=871): '8631426

Check the efficiency of the plan with the following query:

SELECT EXECUTIONS, ROUND(BUFFER_GETS/EXECUTIONS) FROM V$SQL 
   WHERE SQL_ID = '8jng4anj19km4';   
19808    29348

Executed 20,000 times since the last parse, average of 29,000 buffer gets per execution.  Note I am not going to go into the reasons why this inefficient plan is being chosen.

If we use the peeked bind variables to test, and force dynamic sampling, we can produce a more suitable plan:

SELECT /*+GATHER_PLAN_STATISTICS DYNAMIC_SAMPLING(11) */ COUNT(1) FROM TOTS_QUE 
WHERE OTQUE_MODULE = 'PR'  AND 
      OTQUE_DOCTYP = 'PR' AND 
      OTQUE_DOCRUNNUM = '8631426' AND 
      NVL(OTQUE_ERROR,'N') = 'N';

select * from table(dbms_xplan.display_cursor(FORMAT=>'ALLSTATS LAST'));

SQL_ID  87fcdy0wm5xhd, child number 0
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name       | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |            |      1 |        |      1 |00:00:00.01 |      59 |
|   1 |  SORT AGGREGATE                      |            |      1 |      1 |      1 |00:00:00.01 |      59 |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TOTS_QUE   |      1 |      1 |      0 |00:00:00.01 |      59 |
|*  3 |    INDEX RANGE SCAN                  | OTQUE_IDX1 |      1 |     13 |      0 |00:00:00.01 |      59 |
-------------------------------------------------------------------------------------------------------------

Note the SQL_ID here.  Observe that with the index access, only 59 buffer gets are required, compared to 29,000 using full table scan.

Lets create a sql plan baseline for the inefficient plan.  Note it is created as not enabled, so it won’t actually be used.

DECLARE
   dummy pls_integer;
BEGIN
   dummy := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
               sql_id => '8jng4anj19km4',
               enabled =>'NO');   
END;
/
SELECT sql_handle FROM DBA_SQL_PLAN_BASELINES ORDER BY CREATED DESC;
SQL_6d2f47d2acbf3c9b
..

The next bit it the important step, we can load the plan from the efficient plan (even though it has slightly different sql text, and a different sql_id) and associate it with the SQL_HANDLE  for the baseline we have just created.

DECLARE
   dummy pls_integer;
BEGIN
   dummy := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
               sql_id => '87fcdy0wm5xhd', 
               sql_handle =>'SQL_6d2f47d2acbf3c9b', 
               enabled => 'YES');   
END;
/
DECLARE
   dummy pls_integer;
BEGIN
   dummy := DBMS_SPM.ALTER_SQL_PLAN_BASELINE (sql_HANDLE => 'SQL_6d2f47d2acbf3c9b',
                                              ATTRIBUTE_NAME=>'AUTOPURGE', 
                                              ATTRIBUTE_VALUE=>'NO');   
END;
/
Note the AUTO_PURGE attribute change is just personal preference to prevent these records getting deleted.

SELECT child_number, 
       SQL_PLAN_BASELINE, 
       EXECUTIONS, 
       ROUND(BUFFER_GETS/EXECUTIONS) FROM V$SQL 
WHERE SQL_ID = '8jng4anj19km4';
CHILD_NUMBER SQL_PLAN_BASELINE              EXECUTIONS ROUND(BUFFER_GETS/EXECUTIONS)                                                                
------------ ------------------------------ ---------- -----------------------------                                                                
           0                                     20648                         29371                                                                
           1 SQL_PLAN_6ubu7uaqbyg4v1b716894       2252                            12                                                                                                                                                                                                                      


Note a new sql child cursor has been created, associated with the baseline, and that it is much more efficient than the previous one.