Tuesday, October 19, 2010

-- Questions about the Oracle GG Roadmap --

By Oracle Corp.

1.- Will Oracle fix the recyclebin off issue when capturing DDL? If so, what OGG release and what RDBMS versions are expected to support a functioning recyclebin in the source?

We now support the recyclebin on/true for Oracle 11g along with our DDL replication support, Oracle 10g still requires the recyclebin to be off/false.

   2.- Does OGG support active-active in a 3+ node (multi master) configuration? Documentation shows 3 nodes in an active-active configuration, is 3+ supported?  If so, any restrictions for DML or DDL for a 3+ active-active configuration?
  
      Our active-active capabilities apply to any number of nodes in the configuration, in other words, we are not limited to only two. Obviously you need to make special considerations for other factors such as the replication of tables that use sequences, replication of truncate or other DDL operations, for example. For best practices, I recommend for multi-node active-active environments, that user privileges/grants are set so that DDL operations can only originate from one node and not the others. Configure OGG to replicat those DDL operations to all downstream nodes in one direction only. Of course that means that other considerations need to plan in event that the node that allows DDL operations goes down or has a planned downtime for maintenance activities.
·          
      3.- When will OGG support source tables with Advanced Compression?  10.4.x docs says compressed tables are not supported for extract.
     
       It is on the road-map to support advanced compression. In fact, a release defining feature is to integrate our transaction log reading functionality with the XStreamOut API. This integration will give us support for the variety of database compression features. This is scheduled to be part of the next major release. 
     
      4.- Will the SUPPRESSTRIGGERS param available for RDBMS 10.2.0.5 and 11.2.+ for OGG 11.1.x be supported in other RDBMS versions, mainly 10.2.0.4 and all 11G R1 RDBMS versions.
                   
      SUPPRESSTRIGGERS parameter  support is the ultimate goal, but I am not clear which future release this is targeted to be included. Each database has different functionality so it might be a phased approach depending if it is possible and the development effort required to support it.

-- CRS will not start on one node (Fixed on 11.2.0.2 and Workaround below) --


: Hdr: 9469133 11.2.0.1 PCW 11.2.0.1 CSS PRODID-5 PORTID-197
Abstract: CORE DUMP OF OCSSD.BIN WHEN VOTING DISK IS NOT ACTIVATED.

Some explanation from development:
==================================
 In 11.2, voting files are discovered, not hard-wired, so we look through a
 list of files that are specified in the 'discovery string', e.g.
 /dev/vdisk/*, and use all files that appear to be legitimate voting files,
i.e. they have a TOC (Table Of Contents), volume info block, etc.  Since the
VG with the voting files is not online, the discovery does not see them at
 all, so does not consider them as voting files and fails as a result of an
inability to find enough voting files.

=====================================

The workaround is to wait the disks are completely ONLINE after machine boot before cluster is started
(note 459169.1 CRS Does Not Startup Automatically After Node Reboot,
                   Manual Start is OK - but this node is up to 11.1 version)

or apply  patch, backport is available for Solaris x86-64(64 bit) - please confirm the Operating System

please  upload the patch inventory - I may need to raise backport for you

We're getting these errors on the other node when rebooting:

I did see in the boot messages on both hosts these items:
Johnston, Nathaniel [10:39 AM]:
Sep 24 16:05:15 mhddb-nb-2p.philadelphia.pa.bo.comcast.net root: Oracle HA daemon is enabled for autostart.
Johnston, Nathaniel [10:39 AM]:
Sep 24 16:05:18 mhddb-nb-2p.philadelphia.pa.bo.comcast.net unix: vn_rdwr failed with error 0x15
Sep 24 16:05:18 mhddb-nb-2p.philadelphia.pa.bo.comcast.net unix: kobj_load_module: read header failed
Johnston, Nathaniel [10:39 AM]:
Sep 24 16:05:19 mhddb-nb-2p.philadelphia.pa.bo.comcast.net root: exec /u01/app/grid/perl/bin/perl -I/u01/app/grid/perl/lib /u01/app/grid/bin/crswrapexece.pl /u01/app/grid/crs/install/s_crsconfig_mhddb-nb-2p_env.txt /u01/app/grid/bin/ohasd.bin "reboot"
Sep 24 16:05:19 mhddb-nb-2p.philadelphia.pa.bo.comcast.net root: exec /u01/app/grid/perl/bin/perl -I/u01/app/grid/perl/lib /u01/app/grid/bin/crswrap
Johnston, Nathaniel [10:40 AM]:
Sep 24 16:05:36 mhddb-nb-2p.philadelphia.pa.bo.comcast.net mDNSResponder (Engineering Build) (Nov 2 2009 05:02:07) [5272]: starting
Sep 24 16:05:37 mhddb-nb-2p.philadelphia.pa.bo.comcast.net mDNSResponder: Oracle mDNSResponder starting

======================================
                               WORKAROUND
======================================

11gR2 CRS doesn't startup after node reboot [ID 1050164.1]


Modified 31-JAN-2010     Type PROBLEM     Status PUBLISHED

In this Document
  Symptoms
  Changes
  Cause
  Solution




Applies to:

Oracle Server - Enterprise Edition - Version: 11.2.0.1.0 to 11.2.0.1.0 - Release: 11.2 to 11.2
Generic Linux

Symptoms

  • Installation of the 11gR2 Grid Infrastructure on a Linux cluster completed successfully
  • OCR & Voting files located in ASM diskgroup
  • using ASMLIB driver
  • ASM disks are located on multipath devices (/dev/mapper/)
  • following a node reboot CRS does not startup
  • CSS daemon log shows the following message:
    2010-01-13 09:04:15.075: [ CSSD][1150449984]clssnmvDDiscThread: using discovery string for initial discovery
    2010-01-13 09:04:15.075: [ SKGFD][1150449984]Discovery with str::
    2010-01-13 09:04:15.075: [ SKGFD][1150449984]UFS discovery with ::
    2010-01-13 09:04:15.075: [ SKGFD][1150449984]OSS discovery with ::
    2010-01-13 09:04:15.076: [ SKGFD][1150449984]Discovery with asmlib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: str ::
    2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA1:
    2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA2:
    2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA3:
    2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA4:
    2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
    2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
    2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
    2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
    2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmvDiskVerify: Successful discovery of 0 disks
    2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
    2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmvFindInitialConfigs: No voting files found
    2010-01-13 09:04:15.077: [ CSSD][1150449984]###################################
    2010-01-13 09:04:15.077: [ CSSD][1150449984]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread
    2010-01-13 09:04:15.077: [ CSSD][1150449984]###################################
    2010-01-13 09:04:15.077: [ CSSD][1139960128]clssgmClientShutdown: total iocapables 0
    2010-01-13 09:04:15.077: [ CSSD][1139960128]clssgmClientShutdown: graceful shutdown completed.
    2010-01-13 09:04:15.077: [ CSSD][1150449984]
  • running the cluster verification utility returns the following messages:
    /cluvfy stage -post crsinst -n racnode1

    Performing post-checks for cluster services setup
    Checking node reachability...
    Node reachability check passed from node "racnode1"
    Checking user equivalence...
    User equivalence check passed for user "grid"
    Checking time zone consistency...
    Time zone consistency check passed.
    ERROR:
    Cluster manager integrity check failed
    PRVF-5434 : Cannot identify the current CRS software version
    UDev attributes check for OCR locations started...
    UDev attributes check passed for OCR locations
    UDev attributes check for Voting Disk locations started...
    ERROR:
    PRVF-5197 : Failed to retrieve voting disk locations
    UDev attributes check failed for Voting Disk locations
    Default user file creation mask check passed
    Checking cluster integrity...
    Cluster integrity check failed This check did not run on the following node(s):
    racnode1
    Checking OCR integrity...
    Checking the absence of a non-clustered configuration...
    All nodes free of non-clustered, local-only configurations
    ERROR:
    PRVF-5300 : Failed to retrieve active version for CRS on this node
    OCR integrity check failed
    Checking CRS integrity...
    ERROR:
    PRVF-5300 : Failed to retrieve active version for CRS on this node
    CRS integrity check failed
    OCR detected on ASM. Running ACFS Integrity checks...
    Starting check to see if ASM is running on all cluster nodes...
    PRVF-5137 : Failure while checking ASM status on node "racnode1"
    Starting Disk Groups check to see if at least one Disk Group configured...
    PRVF-5112 : An Exception occurred while checking for Disk Groups
    PRVF-5114 : Disk Group check failed. No Disk Groups configured
    Task ACFS Integrity check failed
    Checking Oracle Cluster Voting Disk configuration...
    ERROR:
    PRVF-5434 : Cannot identify the current CRS software version
    PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
    User "grid" is not part of "root" group. Check passed
    Post-check for cluster services setup was unsuccessful on all the nodes.


Changes

Node was rebooted after install.

Cause


The CSS daemon crashes because it cannot locate any Voting files in any of the discovered ASM disks, which is indicated by the following message in the CSS daemon log (/log//cssd/ocssd.log):

2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmvFindInitialConfigs: No voting files found


This error is preceded by the following ASMLIB error:
2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
suggesting that ASMLIB has problem accessing the ASM disk.


Solution

1. either edit the file /etc/sysconfig/oracleasm-_dev_oracleasm    and change the lines:
ORACLEASM_SCANORDER=""
ORACLEASM_SCANEXCLUDE=""
to
ORACLEASM_SCANORDER="dm"
ORACLEASM_SCANEXCLUDE="sd"
or alternatively run the following command (as user root)
/usr/sbin/oracleasm configure -i -e -u user -g group -o "dm" -x "sd"

2. stop & restart ASMLIB as user root using:
/usr/sbin/oracleasm exit
/usr/sbin/oracleasm init

3. restart CRS or reboot node

The above steps need to be executed on all nodes

-- ASM unresponsive on 1 cluster node of 4 (Fixed on 11.2.0.2) --

By Rafael Orta


=======================================================
Bug 9276348  BugDB see Bug 9276348.-P                Base Bug 9254569
Related Bugs : BugMatrix 9276348    DDR: BugDesc 9276348
Customer: INTERNAL - RACSYSTST                             Created: 11-JAN-10
Component: PCW          Comp Ver: 11.2.0.2     Rel St: D   Updated: 03-FEB-10
Sub Comp:              RDBMS Ver: 11.2.0.2                      By: HFLAKE
Status: 96,Closed, Duplicate Bug
Severity: 2,Severe Loss of Service                          Fixed In Ver:
O/S: 226 Linux x86-64
PL Group:   Gen/Port: G  Error #: -  Pub: N
Hdr: 9276348 11.2.0.2 PCW 11.2.0.2 PRODID-5 PORTID-226 9254569
Abstract: LX64: CRS STACK FAILS TO COME UP ON FIRST NODE, CSS STARTS OK
=======================================================

3) Bug 9276348  is closed as duplicated of Bug 9254569  :
=======================================================
Bug 9254569  BugDB see Bug 9254569.-P                No Base Bug
Related Bugs : BugMatrix 9254569    DDR: BugDesc 9254569
Customer: INTERNAL-RACSYSTST                               Created: 04-JAN-10
Component: PCW          Comp Ver: 11.2         Rel St: D   Updated: 13-SEP-10
Sub Comp: CRS          RDBMS Ver: 11.2                          By: ARU
Status: 80,Development to Q/A
Severity: 2,Severe Loss of Service                          Fixed In Ver: 11.2.
0.2
O/S: 233 Microsoft Windows Server 2003 (64-bit
PL Group:   Gen/Port: G  Error #: -  Pub: N

Hdr: 9254569 11.2 PCW 11.2 CRS PRODID-5 PORTID-233
Abstract: MISSING DEPENDENCY IN OHASD: GIPC HA REQUIRES GIPCD
=======================================================

4) Bug 9254569  is fixed on release 11.2.0.2.


ACTION PLAN
===========

5) In order to fix the bug 9254569 please apply the patchset 11.2.0.2 on the Grid Infrastructure Oracle Home. Patchset 11.2.0.2 also contains another very important fixes for the Grid Infrastructure CRS.

6) For additional information please check the next note:

=)> Important Changes to Oracle Database Patch Sets Starting With 11.2.0.2 (Doc ID 1189783.1)

-- Backup Issue encountered after DB failover --


By Raja Ankathi


We encountered the backup issue on  db. As per the metalink note 566635.1, there is an bug in 10.2. and it is fixed in 11.2


==================
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.2.0 This problem can occur on any platform.
Symptoms

Resync of the catalog for the Primary database fails with:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of resync command on default channel at 05/06/2008
10:29:09
ORA-00001: unique constraint (RMAN.TF_P) violated Cause You are hitting bug 6653570

The root cause scenario in the bug is:
1) a primary and standby is created
2) on primary customer added a tempfile. This doesn't get created on standby.
    It gets resynced to recovery catalog
3) Then when switchover happens, they add the tempfile on new primary and
    resync it to recovery catalog. This resync marks the tempfile on primary (at
    step 2) as dropped.
4) Then when the switchover happens again, the temp file that was dropped is
    again resynced which is causing the problem.
Solution

Bug 6653570 is fixed in Oracle Database version 11.2

Please check metalink if the patch is available for your platform and version. If not and a one off is required, please contact Oracle support

The workaround for this issue is to recreate the temporary tablespace on the primary database.

-- 11.2.0.2 upgrade error --


By Gary Oprendek
 
When upgrading grid 11.2.0.1.0 to 11.2.0.2.0, you must make sure that patch 9413827 is installed on the 11.2.0.1.0 grid and database binaries PRIOR to starting the upgrade. The upgrade will continue without it until you get to the end when you should run root.sh and it will bomb out stating that that patch is not installed. In order to install it you will have to shut down the grid and asm home and apply the patch. This will of course mean that you have to kill the upgrade process. This causes a problem because after you apply the patch and restart the upgrade process, the installer states that there is no grid or asm installation on the machine. This is because it has wiped out a parameter in the oracle inventory for the grid installation. It removes the “  CRS=”true”  “ parameter from the end of the line listing the grid home in the ~oraInventory/ContentsXML/inventory.xml file. The entry should look similar to this:

CRS="true">

If you run into this problem then manually edit the file and add the parameter back in and the installer will find the cluster and continue normally.

Wednesday, October 13, 2010

- Changing Private IP Addresses 11G R2 OCR / VD on ASM --

By Bill Toranto

I recently encountered this issue and thought everyone should be aware of it.


If any re IP-ing is to take place in your 11G R2 RAC clusters (mainly the Private IP), the Grid / CRS OCR needs to be updated prior to anything being done on the host. The ASM instance needs to be available in order for the OCR to be accessible.

IF the host is re IP-ed first, the Grid will need to be re installed using the new private IP address, since ASM will not mount & the OCR inaccessible.

Oracle does expect to have a fix for this very soon since they are getting a large number of reported issues related to this, and ASM OCR’s a relatively new implementation.

This may seem fundamental, but putting the OCR on ASM is a fairly new practice for most of us and previously could be modified after the host ip’s were modified since we used RAW devices.

Monday, October 11, 2010

-- Oracle Debuts Unbreakable Enterprise Kernel for Oracle Linux --


I don’t know what is in Larry’s mind, but seen this been announced all around makes me think Oracle is reaffirming their commitment to this Linux based platform, does it makes sense for them to keep the expensive Solaris platform alive or will there be a merge path down the road?, this is just something to think about to position ourselves properly in the future.

"Oracle Debuts Unbreakable Enterprise Kernel for Oracle Linux




Oracle has announced the availability of the Unbreakable Enterprise Kernel for Oracle Linux, a fast, modern, reliable kernel that is optimized for Oracle software and hardware. "Today's hardware innovations are fast and frequent, making it very important that the Linux distributions evolve quickly to leverage the latest hardware," said Wim Coekaerts, senior vice president of Linux and Virtualization Engineering at Oracle.
The new Unbreakable Enterprise Kernel is the result of the combined efforts of Oracle's Linux, database, middleware, and hardware engineering teams, and is:
  • Fast—More than 75 percent performance gain demonstrated in OLTP performance tests over a Red Hat Compatible Kernel; 200 percent speedup of Infiniband messaging; 137 percent faster solid state disk access
  • Modern—Provides optimizations for large NUMA servers; improved power management and energy efficiency; fine-grained CPU and memory resource control
  • Reliable—Supports the Data Integrity Extensions and T10 Protection Information Model, to stop corrupt data from being written to storage; hardware fault management improves application uptime; low overhead performance counters for tracing
  • Optimized for Oracle—Built and tested to run Oracle hardware, databases, and middleware with the best Linux performance and reliability available
"With the combination of Oracle Linux and the Unbreakable Enterprise Kernel, customers gain fast access to optimizations, new enhancements, and bug fixes," says Coekaerts. "This new offering is a result of Oracle Linux kernel development efforts, on top of the current mainline kernel, and further distinguishes Oracle Linux as the best choice for enterprise customers.""

Wednesday, October 6, 2010

-- Real Application Testing for Earlier Releases --

Metalink Note: 560977.1: Real Application Testing for Earlier Releases

What is being announced?

Oracle Database 11g introduced Database Replay and SQL Performance Analyzer as part of the Real Application Testing option to enable businesses identify issues with system changes before production deployment. This note provides information on the Real Application Testing functionality available
for pre-11g database releases. The main goal of making Real Application Testing functionality available in earlier releases is to enable customers to upgrade to higher database releases.

SQL Performance Analyzer allows users to identify SQL performance regressions caused by system changes such as adding an index, gathering optimizer statistics, implementing partitioning, RDBMS upgrades, etc. SQLPerformance Analyzer assesses the impact of change on SQL response times by executing each SQL serially with production context (bind variables, optimizer environment, etc.) before and after a change and then provides a report highlighting any performance divergences.

Database Replay allows users to perform comprehensive testing of database and infrastructure changes using real application workloads. It allows capture of production workload including concurrency, think time, and transactionsdependencies, and allows users to replay the workload on a test system with the exact same production characteristics so that all problems can be identified and remediate in test before deploying the change to production.

Please check the “Licensing Information” manual on OTN for details regarding licensing of Real Application Testing Option.

What do you need to do?

What you need to do to use Real Application Testing functionality in earlier releases?

Real Application Testing functionality for pre-11g database releases is installed using the “opatch” utility and following instructions for the generic and platform specific “README” for the patches. If you want to only use Database Replay or SQL Performance Analyzer, then only the patch for that particular functionality
needs to be applied. If you want to use both Database Replay and SQL Performance Analyzer, the patches for both the functionalities need to be applied.

Database Replay

To help customers upgrade to Oracle Database 11g, the “capture” functionality of Database Replay has been made available in previous releases as shown in thetable below. The replay of the captured workload can only be done on Oracle Database 11g and higher.

Please refer to Oracle documentation on OTN for information on the capture part of Database Replay available for the earlier releases.


Enabling Workload Capture


This step is required for 10.2.0.4.0 only. By default, the workload capture is disabled on the pre-11g database releases. Database workload capture functionality is enabled on the system by specifying
the PRE_11G_ENABLE_CAPTURE initialization parameter. To enable workload capture, run the wrrenbl.sql script at the SQL prompt as sys or system: This step is NOT required for version lower then 10.2.0.4.0.

@$ORACLE_HOME/rdbms/admin/wrrenbl.sql

The wrrenbl.sql script calls the ALTER SYSTEM SQL statement to set the PRE_11G_ENABLE_CAPTURE initialization parameter to TRUE. Please check the OTN documentation mentioned above for more details.

SQL Performance Analyzer

SQL Performance Analyzer functionality has been enhanced and made available in earlier releases to help customers upgrade their databases from versions Oracle 9.x, 10.1.x, and 10.2.x to higher release. Applying the one-off patches as shown below actually installs the SQL Performance Analyzer functionality for that
release. After the functionality is installed, customers can then leverage Oracle Database 11g SQL Performance Analyzer functionality to upgrade from any database releases 9.x, 10.1.x, 10.2.x to 10.2 or higher release. The one-off patches that need to be applied depend on the version of the source and the
destination databases as given in the table below.

Further information on this topic is available in the technical white paper “Oracle Real Application Testing: Testing the SQL Performance Impact of Oracle 9i/10g Release 1 to Oracle Database 10g Release 2 upgrade with SQL Performance Analyzer”. It is recommended to review this document before using SQL Performance Analyzer functionality for earlier releases.

Patch 6865809 should be applied on the Oracle Database Release 11.1.0.6 test database that is used to orchestrate SPA on Oracle Database 10.2. The patch enables Oracle 11g test database to consume SQL trace files from previous database releases and generate a SQL Tuning Set (STS). The STS can then be
used as input for the SQL Performance Analyzer task.

Patches applied on Oracle Database 10.2 source or destination databases enable “Remote SQL Test Execution” functionality on that release for more accurate analysis of performance data.

Table 3: SQL Performance Analyzer Availability Information. For Windows
platform, see additional notes below:





Note for Windows Platforms:


* For Windows 32-bit Platform patch number on top of 11.1.0.6.0 database is 7044721.
* For Windows 64-bit AMD64 patch number on top of 11.1.0.6.0 database is 7044728.

Note for 11.1.0.7.0 Database

If you're using 11.1.0.7.0 database for SPA testing, you do not need to apply any of the above-mentioned patches on the 11g SPA system that is used for analysis. All the fixes are already in there. On 10.X database side, you do need the patch as mentioned above.

Who to contact for more information?

Please create a Service Request with Oracle Support

References

Note 466181.1 - 10g Upgrade Companion
Note 562899.1 - TESTING SQL PERFORMANCE IMPACT OF AN ORACLE 9i
TO ORACLE DATABASE 10g RELEASE 2 UPGRADE WITH SQL
PERFORMANCE ANALYZER

Note 601807.1 - Oracle 11g Upgrade Companion

Keywords

SQL~PERFORMANCE~ANALYZER ; UPGRADE ; SQL~TUNING ; UPGRADE~
TO~10.2.0.3.0 ; MIGRATE~DATABASE ; PATCH~UPGRADE~TO10.sp;SQL~T
UNING~ADVISOR ; TUNING~PACK ; UPGRADE~FROM~9.2.0 ; UPGRADE~T
O~9.2.0 ;

-- Failed login attempts updates the account status column as LOCKED in 11.2.0.1 --

There is a known bug under 11.2.0.1 that displays the incorrect account status when a user has accidently locked an account via too many invalid passwords attempts. In a nutshell, too many password attempts should result in a “LOCKED(TIMED)” status, but instead result in a “LOCKED” status. The “LOCKED” status is what an account is set to when a DBA intentionally locks it (i.e. when somebody leaves the company or team).


There is a work around which would require updating all profiles to ensure password_lock_time is set to a number, rather than unlimited. Alternatively we can apply the 11.2.0.2 patchset.