AWR Warehouse and Data Guard – Conclusions

Unfortunately AWR Warehouse and Data Guard do not work together seamlessly, however by understanding some of the implementation details, and following some simple steps you can help to mitigate most problems that may be encountered.

It is not possible to add an instance to AWRW when it is in standby mode, worse still if you do configure the primary and then switch over to an instance not configured in AWRW, the extract jobs will continue to dump files to local storage, but those files will never be transferred or loaded.

For this reason, I would suggest if setting up a pair of instances in data guard configuration from scratch to consider adding both to AWRW as part of initial configuration process. If this is an existing system where you cannot switchover to add the standby to AWRW, you should consider adding it the next time you do switchover.

Whenever you add a standby instance to AWRW make sure to backup the extract metadata table beforehand, and restore it before performing the first extract (remember to fix the sequence on this table too), to make sure you do not extract the same snapshots from multiple instances.

While it is possible to update the mapping table to force AWRW to upload from both instances to the same DBID, I think it is not worth the risk you run of breaking things by performing such changes. The other downside is that removal of either instance from AWRW triggers the removal of AWR history from both instances from the repository.

The downside of having the different instances of the same database mapped to different DBIDs is extra complexity in queries you run against the AWRW for analysis, especially if you switchover for significant periods of time. In addition, AWR reports or ASH analytics will not be work on consolidation of both instances, you will have to perform those tasks against each instance separately.

In addition as long as you are on DB plugins 13.2.2 or later (and if you are not you should think about upgrading), you may also consider updating the properties file such that data is extracted to a location accessible from both instances. That way files extracted from one instance will still get uploaded from the other after switchover. If you do not, you may need to perform some manual steps to ensure snapshots are not missed during switchover.

AWR Warehouse and Data Guard (Part Four)

After a brief interlude I wanted to follow up on something I noticed on John Hallas’s blog.

He mentioned that specifying a shared location for the extract metadata did not work, and was waiting on a fix for bug number 21826657. I checked that this bug should be fixed in db plugins version 13.2.2 and that my test hosts dga and dgb are both running this version.

To test whether the fix was effective, I first updated the extract properties table to specify a location  shared between both hosts, /home/oracle/awrw, generated a couple of snapshots (current primary is DGA) and then ran an extract process.

SQL> update dbsnmp.caw_extract_properties
  2     set property_value = '/home/oracle/awrw'
  3     where property_name = 'dump_dir_1';

1 row updated.

SQL> commit;

Commit complete.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> exec dbsnmp.mgmt_caw_extract.run_job_now;

PL/SQL procedure successfully completed.

I double-check the files are extracted to /home/oracle/awrw as expected. You can see from the filename that this corresponds to snapshots with ids between 42 and 44.

[oracle@dgb awrw]$ ls -alrt /home/oracle/awrw
total 12796
drwx------. 20 oracle oinstall     4096 Apr  8 12:21 ..
drwxr-xr-x.  1 oracle oinstall      136 Apr  9  2017 .
-rw-r-----.  1 oracle oinstall 13066240 Apr  9  2017 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp
-rw-r--r--.  1 oracle oinstall    31656 Apr  9  2017 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.log
[oracle@dgb awrw]$

I then switch-over to DGB and then performed “Upload Snapshots Now”.
Reviewing the steps for the Transfer Task looks promising, below are the outputs of a couple of the transfer subtask; first of all createArchive:

conn_url: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dgb.localdomain)(PORT=1521)))(CONNECT_DATA=(SID=DG))), user: system, role: normal
Connected to database
Number of files ready for transfer: 1
Maximum total size of dump files: 5368709120
file: 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp size(bytes): 13066240
Number of files picked up for transfer: 1
Updated status of dump files, ready to transfer
Creating archive file using [/app/oracle/agent/agent_13.2.0.0.0/bin/zip -v 4CA58C2B03FF5F03E0530201A8C0052E.zip 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp]
  adding: 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp .	(in=13066240) (out=1116898) (deflated 91%)
total bytes=13066240, compressed=1116898 -> 91% savings
Created archive 4CA58C2B03FF5F03E0530201A8C0052E.zip
Disconnected from database

Next prepareDestinationFiles:

Extracting dump files from /app/oracle/agent/agent_inst/awr_t/4CA58C2B03FF5F03E0530201A8C0052E.zip
Extracted 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp
extract complete
Deleted archive 4CA58C2B03FF5F03E0530201A8C0052E.zip
conn_url: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.2)(PORT=1521)))(CONNECT_DATA=(SID=OMR))), user: sys, role: sysdba
Connected to database
dumpfile: 1_4CB89422459F0DF3E0530401A8C065D5_1852644603_42_44.dmp, emid: 1, dbid: 1852644603, bid: 42, eid: 44, size: 13066240
source target: DGB, type: oracle_database
Added dump files to the import queue
Disconnected from database

The output of the loadAWR task reveals that 3 snapshots have been successfully loaded to AWRW:

conn_url: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.2)(PORT=1521)))(CONNECT_DATA=(SID=OMR))), user: sys, role: sysdba
Connected to database
Loaded 3 snapshots in the AWR Warehouse.
Disconnected from database

Reviewing load metadata you can see that the snapshots that were extracted from DGA have been successfully loaded from DGB.

SQL> select dump_id, target_name, begin_snap_id, end_snap_id, status from dbsnmp.caw_load_metadata order by dump_id;

 SQL> select begin_snap_id, end_snap_id, target_name, status from dbsnmp.caw_load_metadata order by dump_id;

	   12	       38 DGA		      3
	   39	       41 DGB		      3
	   42	       44 DGB		      3
SQL>

So it seems that it is now OK to update the location that the dumpfiles are extracted to. If this is a shared location accessible by both hosts in dataguard configuration, then as we have proved, any data extracted from one host, can subsequently be transferred and loaded from the another host without requiring manual intervention.
This may be of use with AWRW in a RAC configuration (something I haven’t yet investigated).

AWR Warehouse and Data Guard (Interlude)

Note, in preparing next entry, I reviewed the results from previous entries and found that although extracts were being transferred successfully they had all failed to load; I should have noticed they had status 4 (IMPORT_FAILED), in the load metadata table.

Reviewing dbms_scheduler_job_run_details of the load task revealed the following error:

"move error: 
ORA-20105: Unable to move AWR data to SYS
1852644603 1_4CB614B101EE2E5EE0530401A8C066F0_1852644603_12_37.dmp 1 DGA oracle_database 1852644603"

I removed both databases from ARWR, and playing around with the internal functions used to register and un-register the databases revealed that the partitions had got in a mess.

 
SQL> exec dbms_swrf_internal.unregister_database(1852644603);
SQL> exec dbms_swrf_internal.register_database(1852644603);
BEGIN dbms_swrf_internal.register_database(1852644603); END;

*
ERROR at line 1:
ORA-13509: error encountered during updates to a AWR table
ORA-14012: resulting partition name conflicts with that of an existing partition
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 105
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 184
ORA-06512: at line 1

To resolve this, I unregistered the database again using the internal function, then identified the ‘orphan’ partitions causing problems using the following SQL (1852644603 is the DBID of the problem database):

 
SELECT * FROM DBA_TAB_PARTITIONS WHERE PARTITION_NAME LIKE '%1852644603%'

Partitions for the unregistered database still existed for the following tables:

  • WRH$_CON_SYSMETRIC_HISTORY
  • WRH$_CON_SYSSTAT
  • WRH$_CON_SYSTEM_EVENT
  • WRH$_CON_SYS_TIME_MODEL
  • WRM$_PDB_IN_SNAP

I then dropped these problem partitions (this is just a play database, that is why I am able to be so reckless modifying internal Oracle tables, do you own research and testing before making such changes, and make sure you understand the risks.)

After I did this, I was able re-add DGA to ARWR and upload some snapshots successfully; this time I actually checked the records existed in AWRW using the following query:

SQL> SELECT DISTINCT DBID FROM DBA_HIST_SNAPSHOT;

1585153544
1852644603
SQL> 

I performed the same steps as before (backup extract metadata table, add DGB to AWRW, switchover to DGB, change the mapping record for DGA, restore the extract record, Upload Snapshots Now, then restore the mapping record for DGA). Snapshots from DGB also uploaded successfully with the original DBID>

It distinctly possible that I messed up the partitioning of the AWR tables by my hacks however there are a few things that make me think this is not the case. The original upload from DGA failed (this was before I had made any non-standard changes). Also repeating my steps after manually fixing the partitions, everything worked fine. It is also worth noting that the tables causing the problem are all new in 12.2 (DGA and DGB are fresh installed 12.2 instances, the AWR Warehouse repository is 12.2 upgraded from 12.1). We also had some similar issues on our production AWR Warehouse after we upgraded it from 12.1 to 12.2 (a different error, but also related to AWRW partitions). So I suspect some bug where AWRW is not properly handling partitions on new 12.2 AWR tables.

Messing around with the register/un-register functions also reminds me of another thing to be aware of regards my ‘hack’ on the CAW_DBID_MAPPING table. If you subsequently remove the either database from AWRW (via the GUI) it will also remove all records from AWR history from both databases. This may or may not be what you want, you should be aware of it.