Instance crash when not restarted through clusterware after patching

We had some issues, fortunately on a test instance, after applying a database
patch. With the help of a colleague (thanks Kei) I was able to identify what had
gone wrong and setup a testcase to demonstrate the issue. I think it is worth
sharing as it reveals some of the complexities of the relationships between
the components of the Oracle stack.

My setup consists of two database instances on the same host configured in Oracle Restart. To start the patching process I stop both instances. Note in particular the group owner of oracle binary is asmadmin.

[oracle@hkexdb01 ~]$ srvctl stop database -db PVJA -o abort
[oracle@hkexdb01 ~]$ srvctl stop database -db PVJB -o abort
[oracle@hkexdb01 ~]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 327575894 Jun 13 14:52 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle

Next I apply database patch (I have omitted some details from the
patch session as they are not relevant). Observe that after patch completes,
oracle binary has been re-linked, and owner is now oinstall.

[oracle@hkexdb01 ~]$ cd ~/25397136
[oracle@hkexdb01 25397136]$ $ORACLE_HOME/OPatch/opatch apply
Oracle Interim Patch Installer version 12.2.0.1.9
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.9
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-06-13_14-58-15PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   24732088  25397136  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying sub-patch '24732088' to OH '/u01/app/oracle/product/12.1.0.2/dbhome_2'

--DETAILS OMITTED

Patching component oracle.rdbms.install.plugins, 12.1.0.2.0...
Composite patch 25397136 successfully applied.
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-06-13_14-58-15PM_1.log

OPatch succeeded.
[oracle@hkexdb01 25397136]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle oinstall 327813870 Jun 13 14:59 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle

After patching is complete, I now restart PVJA outside of
clusterware (note the group owner of oracle binary is still oinstall).

[oracle@hkexdb01 25397136]$ export ORACLE_SID=PVJA
[oracle@hkexdb01 25397136]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Tue Jun 13 15:00:54 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SYS@PVJA> startup
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size                  2926472 bytes
Variable Size            1564297336 bytes
Database Buffers          570425344 bytes
Redo Buffers                9834496 bytes
Database mounted.
Database opened.
SYS@PVJA> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
[oracle@hkexdb01 25397136]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle oinstall 327813870 Jun 13 14:59 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle

I then restart PVJB instance through clusterware. Note that clusterware
updates the group owner of the oracle binary.

srvctl start database -db PVJB 
ls -alrt $ORACLE_HOME/bin/oracle

[oracle@hkexdb01 25397136]$ srvctl start database -db PVJB 
[oracle@hkexdb01 25397136]$ ls -alrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 327813870 Jun 13 14:59 /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/oracle
[oracle@hkexdb01 25397136]$ 

This change in group ownership ‘breaks’ the PVJA instance. There are lots of errors in the alert log.

Errors in file /u01/app/oracle/diag/rdbms/pvja/PVJA/trace/PVJA_j000_64968.trc:
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1001 (oinstall), current egid = 1006 (asmadmin)

Note the clusterware log seems to indicate the point at which this action took
place.

2017-06-13 14:55:47.233925 :CLSDYNAM:4045649664: [ora.pvjb.db]{1:31716:24755} [start] Utils:execCmd action = 1 flags = 6 ohome = /u01/app/12.1.0.2/grid cmdname = setasmgidwrap

This behaiour seems to match Bug 9784037 : SETASMGID CAUSING ORA-27303, which
was closed as “Not a Bug”. The provided workaround is “always remember to execute setasmgidwrap after doing relink all / patch apply in RDBMS Oracle Home”.
However I think just making sure that all instances are restarted through
clusterware is probably a better way to handle this.

Update, seems Frits Hoogland has run into this issue too, I now feel a bit better about my ‘mistake’.

OPatch 12.2.0.1.9 change in behaviour patching RAC nodes

We’ve had issues in the past patching the Oracle Homes on our Exadata nodes due to the fact that they were defined as part of a RAC cluster.
We don’t use RAC, so we have made sure we don’t encounter such problems in the past by unlinking them as described in the documentation.

However I have maintained a separate pair of linked Oracle homes for testing, and I have a reproducible test case of OPatch getting into a mess under certain conditions.
Running this testcase, I have noticed some change in behavior in this area with latest version of OPatch 12.2.0.1.9 compared to previous version 12.2.0.1.8.

Review the following testcase using OPatch 12.2.0.1.8. A few things to note.
First by default patches are applied to both nodes (this can be overridden by specifying -local flag).
Second the confirmation prompt regarding the remote node takes place after the local node has been patched.

[oracle@hkexdb01 22652097]$ $ORACLE_HOME/OPatch.backup/opatch apply
Oracle Interim Patch Installer version 12.2.0.1.8
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.8
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-36-03PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   22652097  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

This node is part of an Oracle Real Application Cluster.
Remote nodes: 'hkexdb02' 
Local node: 'hkexdb01'
Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '22652097' to OH '/u01/app/oracle/product/12.1.0.2/dbhome_2'

Patching component oracle.rdbms, 12.1.0.2.0...

Patching component oracle.rdbms.dbscripts, 12.1.0.2.0...
Skip copying to "/u01/app/oracle/product/12.1.0.2/dbhome_2/rdbms/admin/catfusrg.sql" because it is the same as 
the file in incoming patch "/home/oracle/patchtest/22652097/files/rdbms/admin/catfusrg.sql"

Patching in rolling mode.


The node 'hkexdb02' will be patched next.


Please shutdown Oracle instances running out of this ORACLE_HOME on 'hkexdb02'.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')

Is the node ready for patching? [y|n]
y
User Responded with: Y
Updating nodes 'hkexdb02' 
   Apply-related files are:
     FP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_files.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_files.txt" with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_dirs.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/copy_dirs.txt" with actual path.
Propagating directories to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/make_cmds.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-36-03PM/rac/make_cmds.txt" with actual path.
Running command on remote node 'hkexdb02': 
cd /u01/app/oracle/product/12.1.0.2/dbhome_2/rdbms/lib; /usr/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_2 || echo REMOTE_MAKE_FAILED::>&2 


The node 'hkexdb02' has been patched.  You can restart Oracle instances on it.


Finished patching in rolling mode.

Patch 22652097 successfully applied.
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-36-03PM_1.log

OPatch succeeded.
[oracle@hkexdb01 22652097]$ 

Compare that with an attempt to apply same patch using OPatch 12.2.0.1.9 using the same command; by default, only the local node will be patched.

[oracle@hkexdb01 22652097]$ $ORACLE_HOME/OPatch/opatch apply       
Oracle Interim Patch Installer version 12.2.0.1.9
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.9
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-39-06PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   22652097  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Is the local system ready for patching? [y|n]
n
User Responded with: N
NApply exits on request
UtilSession: NApply exits on user's request.
UtilSession system modification phase did not start: NApply exits on request
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-39-06PM_1.log

OPatch stopped on request.
[oracle@hkexdb01 22652097]$ 

To get 12.2.0.1.9 to patch remote nodes as well, -all_nodes switch must be provided, as demonstrated below.  Observe also that confirmation regards patching all nodes takes place before patching any of them.

[oracle@hkexdb01 22652097]$ $ORACLE_HOME/OPatch/opatch apply -all_nodes
Oracle Interim Patch Installer version 12.2.0.1.9
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/12.1.0.2/dbhome_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/12.1.0.2/dbhome_2/oraInst.loc
OPatch version    : 12.2.0.1.9
OUI version       : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-40-05PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   22652097  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

This node is part of an Oracle Real Application Cluster.
Remote nodes: 'hkexdb02' 
Local node: 'hkexdb01'
Please shut down Oracle instances running out of this ORACLE_HOME on all the nodes.
(Oracle Home = '/u01/app/oracle/product/12.1.0.2/dbhome_2')


Are all the nodes ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '22652097' to OH '/u01/app/oracle/product/12.1.0.2/dbhome_2'

Patching component oracle.rdbms, 12.1.0.2.0...

Patching component oracle.rdbms.dbscripts, 12.1.0.2.0...

Patching in all-node mode.

Updating nodes 'hkexdb02' 
   Apply-related files are:
     FP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_files.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_files.txt" with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_dirs.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/copy_dirs.txt" with actual path.
Propagating directories to remote nodes...
Instantiating the file "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/make_cmds.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/12.1.0.2/dbhome_2/.patch_storage/NApply/2017-05-25_15-40-05PM/rac/make_cmds.txt" with actual path.
Running command on remote node 'hkexdb02': 
cd /u01/app/oracle/product/12.1.0.2/dbhome_2/rdbms/lib; /usr/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_2 || echo REMOTE_MAKE_FAILED::>&2 


All nodes have been patched.  You may start Oracle instances on the local system and nodes 'hkexdb02' 

Patch 22652097 successfully applied.
Log file location: /u01/app/oracle/product/12.1.0.2/dbhome_2/cfgtoollogs/opatch/opatch2017-05-25_15-40-05PM_1.log

OPatch succeeded.
[oracle@hkexdb01 22652097]$ 

I think the change to prompt for confirmation before patching any nodes is an improvement.  Regarding the change of defaults for patching the local node vs patching all nodes, this doesn’t affect us (as we’ve unlinked our homes) but the change might cause some issues for some.

Update 2017-05-26, this change in behaviour is documented in the following MOS note:

OPatch: Behavior Changes starting in OPatch 12.2.0.1.9 and 11.2.0.3.16 releases (Doc ID 2232156.1)

Changed READ ANY TABLE audit behavior in 12.2

We have noticed a change in the behavior of READ ANY TABLE auditing in 12.2.

The following initial setup is performed as SYS:

SQL> AUDIT
  2      SELECT ANY TABLE;

Audit succeeded.

SQL>
SQL> AUDIT
  2      READ ANY TABLE;

Audit succeeded.

SQL>
SQL> CREATE USER tst_table_owner IDENTIFIED BY tst_table_owner
  2      ACCOUNT LOCK;

User TST_TABLE_OWNER created.

SQL>
SQL> CREATE USER tst_view_owner IDENTIFIED BY tst_view_owner
  2      ACCOUNT LOCK;

User TST_VIEW_OWNER created.

SQL>
SQL> GRANT
  2      READ ANY TABLE
  3  TO tst_view_owner WITH ADMIN OPTION;

Grant succeeded.

SQL>
SQL> GRANT
  2      SELECT ANY TABLE
  3  TO tst_view_owner WITH ADMIN OPTION;

Grant succeeded.

SQL>
SQL> CREATE USER tst_login IDENTIFIED BY tst_login;

User TST_LOGIN created.

SQL>
SQL> GRANT
  2      CREATE SESSION
  3  TO tst_login;

Grant succeeded.

SQL>
SQL> CREATE TABLE tst_table_owner.tst_table
  2      AS
  3          SELECT
  4              *
  5          FROM
  6              dual
  7          WHERE
  8              1 = 0;

Table TST_TABLE_OWNER.TST_TABLE created.

SQL>
SQL> CREATE VIEW tst_view_owner.tst_view AS
  2      SELECT
  3          *
  4      FROM
  5          tst_table_owner.tst_table;

View TST_VIEW_OWNER.TST_VIEW created.

SQL>
SQL> GRANT SELECT ON tst_view_owner.tst_view TO tst_login;

Grant succeeded.

SQL>
SQL> GRANT SELECT ON dba_capture TO tst_login;

Grant succeeded.

The following select statements are then run as newly created tst_login user:

SELECT
    COUNT(*)
FROM
    tst_view_owner.tst_view;

SELECT
    COUNT(*)
FROM
    dba_capture;

Querying the audit trail in 12.1 shows no entries, however in 12.2:

SQL> SELECT
  2      db_user,
  3      object_schema,
  4      object_name,
  5      sql_text,
  6      priv_used
  7  FROM
  8      dba_common_audit_trail
  9  WHERE
 10          extended_timestamp > SYSDATE - 2 / 24 / 60
 11      AND
 12          statement_type = 'SELECT';

DB_USER    OBJECT_SCHEMA    OBJECT_NAME               SQL_TEXT  PRIV_USED

TST_LOGIN  TST_TABLE_OWNER  TST_TABLE                           READ ANY TABLE
TST_LOGIN  TST_VIEW_OWNER   TST_VIEW
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              STREAMS$_APPLY_PROCESS
TST_LOGIN  SYS              STREAMS$_APPLY_MILESTONE
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              STREAMS$_APPLY_PROCESS
TST_LOGIN  SYS              STREAMS$_APPLY_MILESTONE
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              XSTREAM$_SERVER
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              STREAMS$_CAPTURE_PROCESS
TST_LOGIN  SYS              _DBA_CAPTURE
TST_LOGIN  SYSTEM           LOGMNR_SESSION$                     READ ANY TABLE
TST_LOGIN  SYS              DBA_LOGMNR_SESSION
TST_LOGIN  SYS              USER$
TST_LOGIN  SYS              _SXGG_DBA_CAPTURE
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              PROPS$
TST_LOGIN  SYS              DUAL
TST_LOGIN  SYS              STREAMS$_CAPTURE_PROCESS
TST_LOGIN  SYS              _DBA_CAPTURE
TST_LOGIN  SYSTEM           LOGMNR_SESSION$                     READ ANY TABLE
TST_LOGIN  SYS              DBA_LOGMNR_SESSION
TST_LOGIN  SYS              USER$
TST_LOGIN  SYS              _SXGG_DBA_CAPTURE
TST_LOGIN  SYS              DBA_CAPTURE

34 rows selected.

It appears that READ ANY TABLE is audited when querying a view, if the privilege is required by  the view owner to access the base table.

Oracle support have now confirmed that this is not the expected behavior, and have created Bug 26035911 : AUDIT RECORDS GENERATED EVEN WHEN THE SYSTEM PRIVILEGE IS NOT EXERCISED IN 12.2

A Consolidated List of 12cR2 Issues

I thought it would be useful to have  a consolidated list of issues we have run into, some of which I have already blogged about, some of which I haven’t. I will try to keep this page up-to-date moving forwards. Note we are not really using 12cR2 extensively, we have just upgraded OEM Repository and AWR Warehouse instances so far.

Recovery Manager

Upgrade Catalog Fails with RMAN-06444: error creating init_grsp_pdb_key

Oracle have now published a document about this issue:
UPGRADE CATALOG command from 12.1 to 12.2 Fails With RMAN-6004 and ORA-1422 (Doc ID 2252894.1)

Image Copies never marked as obsolete after datafile is removed

Bug 26115103 – REPORT OBSOLETE NOT SHOWING OLD DATAFILE COPY IN 12.2

Control File autobackups are never marked as obsolete

Bug 25943271: RMAN REPORT OBSOLETE DOES NOT REPORT CONTROLFILE BACKUP AS OBSOLETE

After upgrading catalog to 12.2,  resync failing with “ORA-20110: set stamp set count conflict”

Doc ID 20585616.8 has some information in this area.  This seems to be a problem that is specific to 12.1 multi-tenant instance on 12.2 catalog. It appears that rman is attempting to catalog the same backupset once for each PDB

Alain Fuhrer has run into this and has more details

Trace on the target system reveals problem seems to be caused by SQL like the following, which seem to ‘forget’ to provide the join conditions between the two tables queried:

SELECT BS.RECID , BS.STAMP , BS.SET_STAMP , BS.SET_COUNT , BS.BACKUP_TYPE , BS.INCREMENTAL_LEVEL , BS.PIECES , START_TIME , COMPLETION_TIME , CONTROLFILE_INCLUDED , BS.INPUT_FILE_SCAN_ONLY , KEEP_UNTIL , DECODE(BS.KEEP_OPTIONS , ‘LOGS’ , :b1 , ‘NOLOGS’ , :b2 , ‘BACKUP_LOGS’ , :b3 , 0 ) KEEP_OPTIONS , BS.BLOCK_SIZE , BS.MULTI_SECTION , BS.GUID , DECODE(BS.CON_ID , :b4 , 1 , 0 ) DROPPED_PDB FROM V$BACKUP_SET BS , V$CONTAINERS PDB WHERE BS.RECID BETWEEN :b5 AND :b6 AND ( BS.STAMP >= :b7 OR BS.RECID = :b6 ) AND BS.STAMP >= :b8 AND BS.FOR_XTTS != ‘YES’ ORDER BY BS.RECID

 

AWR and AWR Warehouse

AWR Warehouse load process failing with ORA-600 after upgrading repository

The error message is [kewrspbr_2: wrong last partition]. Doc ID 2020227.1 describes the fix for a similar issue. I found partitioning was wrong on some on some new 12.2 AWR tables so I recreated them, added a partition for each database contributing to AWR Warehouse. I have been unable to reproduce this issue on a test system.

WRH$_SGASTAT_U becoming unusable

Only affects database upgraded from 12.1.0.2, workaround is to re-create the index.
BUG 25954054 – WRH$_SGASTAT_U BECOMING UNUSABLE STATE IN UPGRADED DB
A patch is now available for this issue.

AWR Transfer Task Failing with ORA-28040

Set SQLNET.ALLOWED_LOGON_VERSION_SERVER to 11

Enterprise Manager

TNS-12599: TNS:cryptographic checksum mismatch in alert log of 12.2 database from connections from OEM

No functional impact. Bug number 25915038 created. Setting the following in sqlnet.ora of target database suppresses the message:

  • SQLNET.CRYPTO_CHECKSUM_SERVER=rejected
  • SQLNET.ENCRYPTION_SERVER=rejected

No Such Metadata in Enterprise Manager after upgrading database

This is a bug on calculation of database version from db plugin.  Simplest solution is to upgrade db plugin to 13.2.2 or later on database host.

Miscellaneous

ORA-20001: Statistics Advisor: Invalid task name for the current user

Seems to be on fresh created database. Solution per Doc ID 2127675.1 is to run  dbms_stats.init_package.

READ ANY TABLE audit records

Bug 26035911 : AUDIT RECORDS GENERATED EVEN WHEN THE SYSTEM PRIVILEGE IS NOT EXERCISED IN 12.2

12cR2 Incrementally Updated Backups and dropped datafiles

We have just noticed a difference in behavior in 12cR2 with regards to image copies being marked as obsolete after the backup is updated past the drop of a datafile.
I won’t describe the feature itself, if necessary you can read up at oracle-base or the Oracle documentation.

First review output from the testcase on a 12.1 instance, note that after dropping the datafile, and updating the backup past this point, the datafile copy is marked as obsolete:

RMAN> show all;

RMAN configuration parameters for database with db_unique_name ADAPTEST are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE RMAN OUTPUT TO KEEP FOR 7 DAYS; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/snapcf_ADAPTEST.f'; # default

RMAN> create tablespace test datafile size 1m autoextend off;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 2017-04-19 12:47:09
using channel ORA_DISK_1
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 6 found
no parent backup or copy of datafile 5 found
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=+DATAC1/ADAPTEST/DATAFILE/system.401.941128155
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSTEM_FNO-1_0hs2302d tag=TEST_OBSOLETE RECID=19 STAMP=941719632
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=+DATAC1/ADAPTEST/DATAFILE/sysaux.397.941128123
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSAUX_FNO-3_0is2302k tag=TEST_OBSOLETE RECID=20 STAMP=941719639
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+DATAC1/ADAPTEST/DATAFILE/undotbs1.403.941128201
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-UNDOTBS1_FNO-4_0js2302r tag=TEST_OBSOLETE RECID=21 STAMP=941719644
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile file number=00006 name=+DATAC1/ADAPTEST/DATAFILE/users.399.941128199
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-USERS_FNO-6_0ks2302s tag=TEST_OBSOLETE RECID=22 STAMP=941719644
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile file number=00005 name=+DATAC1/ADAPTEST/DATAFILE/test.1051.941719625
output file name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-TEST_FNO-5_0ls2302t tag=TEST_OBSOLETE RECID=23 STAMP=941719645
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 2017-04-19 12:47:26

Starting Control File and SPFILE Autobackup at 2017-04-19 12:47:26
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-00 comment=NONE
Finished Control File and SPFILE Autobackup at 2017-04-19 12:47:27

RMAN> drop tablespace test including contents and datafiles;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 2017-04-19 12:47:41
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATAC1/ADAPTEST/DATAFILE/system.401.941128155
input datafile file number=00003 name=+DATAC1/ADAPTEST/DATAFILE/sysaux.397.941128123
input datafile file number=00004 name=+DATAC1/ADAPTEST/DATAFILE/undotbs1.403.941128201
input datafile file number=00006 name=+DATAC1/ADAPTEST/DATAFILE/users.399.941128199
channel ORA_DISK_1: starting piece 1 at 2017-04-19 12:47:41
channel ORA_DISK_1: finished piece 1 at 2017-04-19 12:47:42
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1 tag=TEST_OBSOLETE comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2017-04-19 12:47:42

Starting Control File and SPFILE Autobackup at 2017-04-19 12:47:42
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-01 comment=NONE
Finished Control File and SPFILE Autobackup at 2017-04-19 12:47:43

RMAN> recover copy of database with tag 'test_obsolete';

Starting recover at 2017-04-19 12:47:55
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile copies to recover
recovering datafile copy file number=00001 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSTEM_FNO-1_0hs2302d
recovering datafile copy file number=00003 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-SYSAUX_FNO-3_0is2302k
recovering datafile copy file number=00004 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-UNDOTBS1_FNO-4_0js2302r
recovering datafile copy file number=00006 name=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-USERS_FNO-6_0ks2302s
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1
channel ORA_DISK_1: piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1 tag=TEST_OBSOLETE
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished recover at 2017-04-19 12:47:56

Starting Control File and SPFILE Autobackup at 2017-04-19 12:47:56
piece handle=/u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-02 comment=NONE
Finished Control File and SPFILE Autobackup at 2017-04-19 12:47:57

RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of obsolete backups and copies
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Datafile Copy        23     2017-04-19 12:47:25 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/data_D-ADAPTEST_I-757536437_TS-TEST_FNO-5_0ls2302t
Backup Set           7      2017-04-19 12:47:26
  Backup Piece       7      2017-04-19 12:47:26 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-00
Backup Set           8      2017-04-19 12:47:42
  Backup Piece       8      2017-04-19 12:47:42 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/0ns2303d_1_1
Backup Set           9      2017-04-19 12:47:42
  Backup Piece       9      2017-04-19 12:47:42 /u01/app/oracle/product/12.1.0.2/dbhome_2/dbs/c-757536437-20170419-01

Compare this with output from the same testcase on 12.2 instance, the datafile copy is not marked as obsolete:

RMAN> show all;

RMAN configuration parameters for database with db_unique_name PVJTEST are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE RMAN OUTPUT TO KEEP FOR 7 DAYS; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/12.2.0.1/dbhome/dbs/snapcf_PVJTEST.f'; # default

RMAN> create tablespace test datafile size 1m autoextend off;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 19-APR-17
using channel ORA_DISK_1
no parent backup or copy of datafile 3 found
no parent backup or copy of datafile 1 found
no parent backup or copy of datafile 4 found
no parent backup or copy of datafile 7 found
no parent backup or copy of datafile 5 found
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/u02/oradata/PVJTEST/sysaux01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSAUX_FNO-3_0vs2304k tag=TEST_OBSOLETE RECID=36 STAMP=941719708
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/u02/oradata/PVJTEST/system01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSTEM_FNO-1_10s23053 tag=TEST_OBSOLETE RECID=37 STAMP=941719720
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/u02/oradata/PVJTEST/undotbs01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-UNDOTBS1_FNO-4_11s2305a tag=TEST_OBSOLETE RECID=38 STAMP=941719724
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting datafile copy
input datafile file number=00007 name=/u02/oradata/PVJTEST/users01.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-USERS_FNO-7_12s2305d tag=TEST_OBSOLETE RECID=39 STAMP=941719726
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
input datafile file number=00005 name=/u02/oradata/PVJTEST/datafile/o1_mf_test_dhfv0gk3_.dbf
output file name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-TEST_FNO-5_13s2305f tag=TEST_OBSOLETE RECID=40 STAMP=941719727
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 19-APR-17

Starting Control File and SPFILE Autobackup at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/c-2122366327-20170419-0e comment=NONE
Finished Control File and SPFILE Autobackup at 19-APR-17

RMAN> drop tablespace test including contents and datafiles;

Statement processed

RMAN> backup incremental level 1 for recover of copy with tag 'test_obsolete' database;

Starting backup at 19-APR-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=/u02/oradata/PVJTEST/sysaux01.dbf
input datafile file number=00001 name=/u02/oradata/PVJTEST/system01.dbf
input datafile file number=00004 name=/u02/oradata/PVJTEST/undotbs01.dbf
input datafile file number=00007 name=/u02/oradata/PVJTEST/users01.dbf
channel ORA_DISK_1: starting piece 1 at 19-APR-17
channel ORA_DISK_1: finished piece 1 at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1 tag=TEST_OBSOLETE comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:15
Finished backup at 19-APR-17

Starting Control File and SPFILE Autobackup at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/c-2122366327-20170419-0f comment=NONE
Finished Control File and SPFILE Autobackup at 19-APR-17

RMAN> recover copy of database with tag 'test_obsolete';

Starting recover at 19-APR-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile copies to recover
recovering datafile copy file number=00001 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSTEM_FNO-1_10s23053
recovering datafile copy file number=00003 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-SYSAUX_FNO-3_0vs2304k
recovering datafile copy file number=00004 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-UNDOTBS1_FNO-4_11s2305a
recovering datafile copy file number=00007 name=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/data_D-PVJTEST_I-2122366327_TS-USERS_FNO-7_12s2305d
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1
channel ORA_DISK_1: piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1 tag=TEST_OBSOLETE
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
Finished recover at 19-APR-17

Starting Control File and SPFILE Autobackup at 19-APR-17
piece handle=/u01/app/oracle/product/12.2.0.1/dbhome/dbs/c-2122366327-20170419-10 comment=NONE
Finished Control File and SPFILE Autobackup at 19-APR-17

RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of obsolete backups and copies
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set           22     19-APR-17         
  Backup Piece       22     19-APR-17          /u01/app/oracle/product/12.2.0.1/dbhome/dbs/15s2305t_1_1

Migrating AWR Warehouse Repository to a new OEM Target

AWR Warehouse is tightly integrated with Enterprise Manager, both the database which acts as the repository plus any databases that will contribute to it need to be discovered as targets within OEM before they can be used. If you are already using OEM to manage your databases (and you should be), this means that configuration is as simple as selecting the appropriate instance from a drop-down list of discovered targets; however it does have some disadvantages.

One of these disadvantages is that if the Enterprise Manger target of the repository database changes, then AWR Warehouse configuration is lost with it. The target could be changed due agent problems on the host forcing you to re-discover it. Even though it is the same database it is treated it as a new target.

If you attempt to re-configure by pointing it at the new target, a new AWRW instance is created (note the EM_ID field on many of the internal tables refers to this instance id). It is possible to re-add all source databases to this new instance, however it involves a lot of manual steps manipulating the internal tables to get everything working.  While I was researching this with a view to providing a step by step guide I found an easier way.

I will reproduce this situation, and walk through my solution. My AWRW repository is called OMR, and I have a database DGA contributing to it.

I remove instance OMR from OEM, and then re-add it.
When I go to the AWR Warehouse page, I can still see my source database is still listed, but note the error message:
Screen Shot 2017-04-17 at 11.55.06 AM

To fix this, first identify the OEM GUID of the newly configured repository database:

 
SQL> SELECT
  2      target_guid
  3  FROM
  4      sysman.mgmt$target
  5  WHERE
  6          target_name = 'OMR'
  7      AND
  8          target_type = 'oracle_database';

TARGET_GUID
--------------------------------
4CAF735767113C73E0530201A8C0FFB8

Then update the AWRW repository table in OEM with this new GUID:

SQL> UPDATE SYSMAN.DB_CAW_REPOS_E SET TARGET_GUID = '4CAF735767113C73E0530201A8C0FFB8';
1 row updated.

SQL> COMMIT;

Commit complete.

Doing this seems to bypass the re-configuration process. I have tested that everything is working by performing “Upload Snapshots Now”, and all seems fine, however I haven’t performed extensive testing, and I would suggest exercising the usual caution before updating internal OEM tables.

AWR Warehouse and Data Guard – Conclusions

Unfortunately AWR Warehouse and Data Guard do not work together seamlessly, however by understanding some of the implementation details, and following some simple steps you can help to mitigate most problems that may be encountered.

It is not possible to add an instance to AWRW when it is in standby mode, worse still if you do configure the primary and then switch over to an instance not configured in AWRW, the extract jobs will continue to dump files to local storage, but those files will never be transferred or loaded.

For this reason, I would suggest if setting up a pair of instances in data guard configuration from scratch to consider adding both to AWRW as part of initial configuration process. If this is an existing system where you cannot switchover to add the standby to AWRW, you should consider adding it the next time you do switchover.

Whenever you add a standby instance to AWRW make sure to backup the extract metadata table beforehand, and restore it before performing the first extract (remember to fix the sequence on this table too), to make sure you do not extract the same snapshots from multiple instances.

While it is possible to update the mapping table to force AWRW to upload from both instances to the same DBID, I think it is not worth the risk you run of breaking things by performing such changes. The other downside is that removal of either instance from AWRW triggers the removal of AWR history from both instances from the repository.

The downside of having the different instances of the same database mapped to different DBIDs is extra complexity in queries you run against the AWRW for analysis, especially if you switchover for significant periods of time. In addition, AWR reports or ASH analytics will not be work on consolidation of both instances, you will have to perform those tasks against each instance separately.

In addition as long as you are on DB plugins 13.2.2 or later (and if you are not you should think about upgrading), you may also consider updating the properties file such that data is extracted to a location accessible from both instances. That way files extracted from one instance will still get uploaded from the other after switchover. If you do not, you may need to perform some manual steps to ensure snapshots are not missed during switchover.

AWR Warehouse and Data Guard (Part Five)

Continuing the investigation, I wanted to check the effect of deleting one of the two data guard instances from AWRW; remember I have updated the mapping table such that both DGA and DGA both map to the same DBID.

We can check from load metadata and AWR itself which snapshots have been loaded

SQL> SELECT
    dump_id,
    dbid,
    target_name,
    begin_snap_id,
    end_snap_id
FROM
    dbsnmp.caw_load_metadata
ORDER BY dump_id;  

   DUMP_ID     DBID TARGET_NAM BEGIN_SNAP_ID END_SNAP_ID
---------- ---------- ---------- ------------- -----------
    14 1852644603 DGA            12        38
    15 1852644603 DGB            39        41
    16 1852644603 DGB            42        44

SQL> SELECT
    MIN(snap_id),
    MAX(snap_id),
    COUNT(*)
FROM
    dba_hist_snapshot
WHERE
    dbid = 1852644603
ORDER BY snap_id;  

MIN(SNAP_ID) MAX(SNAP_ID)   COUNT(*)
------------ ------------ ----------
      12           44      33

So we have 33 snapshots, loaded from both DGA and DGB.
I then delete DGB from AWRW via the GUI. After waiting a few minutes for the scheduler task that actually does the deletion to kick in, I recheck the AWR data.

SQL> SELECT
    MIN(snap_id),
    MAX(snap_id),
    COUNT(*)
FROM
    dba_hist_snapshot
WHERE
    dbid = 1852644603
ORDER BY snap_id;
  
MIN(SNAP_ID) MAX(SNAP_ID)   COUNT(*)
------------ ------------ ----------
                   0

Because databases are mapped to the same DBID, all AWR history for both databases has been removed from AWR Warehouse.

One final thing I wanted to check was that with the updated mapping record, AWRW data was really accessible via OEM.

I re-added DGB back to AWRW, mapped to its real DBID using the steps before.
From the AWRW page I highlight DGA and select “Compare Periods->Compare Periods Report”. (Note this is a powerful feature of AWRW probably deserving of it’s own entry):

compare_periods

I make sure AWR Data Source is “Historical – AWR Warehouse”:

select_snapshot

I then check I can pick a pair of uploaded snapshots, in this case 49 and 50.
For the second period I can select a different database. Note in particular how this screen shows the AWRW mappings  I am then able to pick the exact same snapshots, 49 and 50, for DGB.
I am able to run the AWR comparison report for the same snapshots, but associated with the different databases.

Index WRH$_SGASTAT_U has status UNUSABLE after upgrade to 12.2

Update 2016-04-26: this is new bug 25954054 – WRH$_SGASTAT_U BECOMING UNUSABLE STATE IN UPGRADED DB.

After upgrading our OMR and AWRW instances from 12.1 to 12.2, our monitoring revealed that index SYS.WRH$_SGASTAT_U was becoming unusable at regular intervals.
The first couple of times, we just rebuilt it in the hope that the problem would just go away(!); of course it didn’t, so I had to do a little bit of investigation.

Reviewing the alert log identified the exact time that the problem had occurred:

2017-04-12T08:11:43.674485+08:00
Some indexes or index [sub]partitions of table SYS.WRH$_SGASTAT have been marked unusable

Grepping for “SGASTAT” in the trace directory, also revealed that the index being unusable was causing inserts into AWR had been failing:

*** 2017-04-12T12:00:18.338061+08:00
*** SESSION ID:(289.8864) 2017-04-12T12:00:18.338092+08:00
*** CLIENT ID:() 2017-04-12T12:00:18.338103+08:00
*** SERVICE NAME:(SYS$BACKGROUND) 2017-04-12T12:00:18.338108+08:00
*** MODULE NAME:(MMON_SLAVE) 2017-04-12T12:00:18.338119+08:00
*** ACTION NAME:(Auto-Flush Slave Action) 2017-04-12T12:00:18.338125+08:00
*** CLIENT DRIVER:() 2017-04-12T12:00:18.338132+08:00

*** KEUUXS - encountered error: (ORA-01502: index 'SYS.WRH$_SGASTAT_U' or partition of such index is in unusable state
)
  *** SQLSTR: total-len=464, dump-len=464,
      STR={insert into  wrh$_sgastat  (dbid, per_pdb, con_dbid, snap_id, instance_number, pool, name, bytes)  select distinct  :dbid, 0 per_pdb,  :srcdbid  con_dbid,  :snap_id, :instance_number, pool, name, bytes FROM     (select con_id, pool, name, bytes,             100*(bytes) / (sum(bytes) over (partition by pool)) part_pct        from  v$sgastat      ) sgastat  WHERE (1 = 1)   and part_pct >= 1      or pool is null      or name = 'free memory' or name = 'row cache' }
*** KEWRAFM1: Error=13509 encountered by kewrfteh

As we still had the redo logs for the period of time identified by the alert log, I fired up logminer, and right after a bunch of inserts into WRH$_SGASTAT, I found the following statement:

alter table WRH$_SGASTAT drop partition WRH$_SGASTA_1521710113_17057;

I am guessing this is part of the standard mechanism for purging AWR data based on configured retention settings.
A quick check reveals that of the partitioned AWR base tables, 3 have indexes that are not partitioned:

SQL> SELECT                                                                                                                                                                                                          
  2     partitioned,                                                                                                                                                                                                 
  3     COUNT(*)                                                                                                                                                                                                     
  4  FROM                                                                                                                                                                                                            
  5     dba_indexes                                                                                                                                                                                                  
  6  WHERE                                                                                                                                                                                                           
  7     index_type = 'NORMAL'                                                                                                                                                                                        
  8  AND                                                                                                                                                                                                             
  9     table_name IN (                                                                                                                                                                                              
 10        SELECT                                                                                                                                                                                                    
 11           table_name                                                                                                                                                                                             
 12        FROM                                                                                                                                                                                                      
 13           dba_tables                                                                                                                                                                                             
 14        WHERE                                                                                                                                                                                                     
 15           table_name LIKE 'WRH$\_%' ESCAPE '\'                                                                                                                                                                   
 16        AND                                                                                                                                                                                                       
 17           partitioned = 'YES'                                                                                                                                                                                    
 18     )                                                                                                                                                                                                            
 19  GROUP BY                                                                                                                                                                                                        
 20     partitioned;                                                                                                                                                                                                 
                                                                                                                                                                                                                     
PAR   COUNT(*)                                                                                                                                                                                                       
--- ----------                                                                                                                                                                                                       
NO           3                                                                                                                                                                                                       
YES         35                                                                                                                                                                                                       

SQL> SELECT                                                                                                                                                                                                          
  2     table_owner,                                                                                                                                                                                                 
  3     table_name,                                                                                                                                                                                                  
  4     index_name                                                                                                                                                                                                   
  5  FROM                                                                                                                                                                                                            
  6     dba_indexes                                                                                                                                                                                                  
  7  WHERE                                                                                                                                                                                                           
  8     index_type = 'NORMAL'                                                                                                                                                                                        
  9  AND                                                                                                                                                                                                             
 10     table_name IN (                                                                                                                                                                                              
 11        SELECT                                                                                                                                                                                                    
 12           table_name                                                                                                                                                                                             
 13        FROM                                                                                                                                                                                                      
 14           dba_tables                                                                                                                                                                                             
 15        WHERE                                                                                                                                                                                                     
 16           table_name LIKE 'WRH$\_%' ESCAPE '\'                                                                                                                                                                   
 17        AND                                                                                                                                                                                                       
 18           partitioned = 'YES'                                                                                                                                                                                    
 19     )                                                                                                                                                                                                            
 20  AND partitioned = 'NO';                                                                                                                                                                                         
                                                                                                                                                                                                                     
TABLE_OWNER                    TABLE_NAME                     INDEX_NAME                                                                                                                                             
------------------------------ ------------------------------ ------------------------------                                                                                                                         
SYS                            WRH$_SGASTAT                   WRH$_SGASTAT_U                                                                                                                                         
SYS                            WRH$_CON_SYSMETRIC_HISTORY     WRH$_CON_SYSMET_HISTORY_INDEX                                                                                                                          
SYS                            WRH$_SYSMETRIC_HISTORY         WRH$_SYSMETRIC_HISTORY_INDEX                                                                                                                           

Running the same query on both 11.2 and 12.1 instances reveals no such mis-configured indexes We are following up with Oracle Support, but I would be interested to know if anyone else can check my findings on their 12.2 instances.

Working with Oracle Support we have worked around the issue using the following script:

ALTER TABLE wrh$_sgastat DROP CONSTRAINT wrh$_sgastat_u;

ALTER TABLE wrh$_sgastat ADD CONSTRAINT wrh$_sgastat_u UNIQUE ( dbid,snap_id,instance_number,name,pool,con_dbid )
   USING INDEX
      LOCAL
   TABLESPACE sysaux;

In addition, I believe I have found the bug in file $ORACLE_HOME/rdbms/admin/c1201000.sql at line 3036:

alter table wrh$_sgastat drop constraint wrh$_sgastat_u;

alter table wrh$_sgastat add constraint wrh$_sgastat_u
        unique (dbid, snap_id, instance_number, name, pool, con_dbid, stattype);

AWR Warehouse and Data Guard (Part Four)

After a brief interlude I wanted to follow up on something I noticed on John Hallas’s blog.

He mentioned that specifying a shared location for the extract metadata did not work, and was waiting on a fix for bug number 21826657. I checked that this bug should be fixed in db plugins version 13.2.2 and that my test hosts dga and dgb are both running this version.

To test whether the fix was effective, I first updated the extract properties table to specify a location  shared between both hosts, /home/oracle/awrw, generated a couple of snapshots (current primary is DGA) and then ran an extract process.

SQL> update dbsnmp.caw_extract_properties
  2     set property_value = '/home/oracle/awrw'
  3     where property_name = 'dump_dir_1';

1 row updated.

SQL> commit;

Commit complete.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> exec dbsnmp.mgmt_caw_extract.run_job_now;

PL/SQL procedure successfully completed.

I double-check the files are extracted to /home/oracle/awrw as expected. You can see from the filename that this corresponds to snapshots with ids between 42 and 44.

[oracle@dgb awrw]$ ls -alrt /home/oracle/awrw
total 12796
drwx------. 20 oracle oinstall     4096 Apr  8 12:21 ..
drwxr-xr-x.  1 oracle oinstall      136 Apr  9  2017 .
-rw-r-----.  1 oracle oinstall 13066240 Apr  9  2017 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp
-rw-r--r--.  1 oracle oinstall    31656 Apr  9  2017 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.log
[oracle@dgb awrw]$

I then switch-over to DGB and then performed “Upload Snapshots Now”.
Reviewing the steps for the Transfer Task looks promising, below are the outputs of a couple of the transfer subtask; first of all createArchive:

conn_url: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dgb.localdomain)(PORT=1521)))(CONNECT_DATA=(SID=DG))), user: system, role: normal
Connected to database
Number of files ready for transfer: 1
Maximum total size of dump files: 5368709120
file: 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp size(bytes): 13066240
Number of files picked up for transfer: 1
Updated status of dump files, ready to transfer
Creating archive file using [/app/oracle/agent/agent_13.2.0.0.0/bin/zip -v 4CA58C2B03FF5F03E0530201A8C0052E.zip 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp]
  adding: 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp .	(in=13066240) (out=1116898) (deflated 91%)
total bytes=13066240, compressed=1116898 -> 91% savings
Created archive 4CA58C2B03FF5F03E0530201A8C0052E.zip
Disconnected from database

Next prepareDestinationFiles:

Extracting dump files from /app/oracle/agent/agent_inst/awr_t/4CA58C2B03FF5F03E0530201A8C0052E.zip
Extracted 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp
extract complete
Deleted archive 4CA58C2B03FF5F03E0530201A8C0052E.zip
conn_url: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.2)(PORT=1521)))(CONNECT_DATA=(SID=OMR))), user: sys, role: sysdba
Connected to database
dumpfile: 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp, emid: 1, dbid: 1852644603, bid: 42, eid: 44, size: 13066240
source target: DGB, type: oracle_database
Added dump files to the import queue
Disconnected from database

The output of the loadAWR task reveals that 3 snapshots have been successfully loaded to AWRW:

conn_url: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.2)(PORT=1521)))(CONNECT_DATA=(SID=OMR))), user: sys, role: sysdba
Connected to database
Loaded 3 snapshots in the AWR Warehouse.
Disconnected from database

Reviewing load metadata you can see that the snapshots that were extracted from DGA have been successfully loaded from DGB.

SQL> select dump_id, target_name, begin_snap_id, end_snap_id, status from dbsnmp.caw_load_metadata order by dump_id;

 SQL> select begin_snap_id, end_snap_id, target_name, status from dbsnmp.caw_load_metadata order by dump_id;

	   12	       38 DGA		      3
	   39	       41 DGB		      3
	   42	       44 DGB		      3
SQL>

So it seems that it is now OK to update the location that the dumpfiles are extracted to. If this is a shared location accessible by both hosts in dataguard configuration, then as we have proved, any data extracted from one host, can subsequently be transferred and loaded from the another host without requiring manual intervention.
This may be of use with AWRW in a RAC configuration (something I haven’t yet investigated).