Tuesday, August 26, 2008

Oracle Streams – The ABC’s of a Heartbeat table

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

NOTE: The following posting was written in collaboration with Frank Ortega. Thanks, Frank, for reviewing my work.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------


What is a heartbeat table and what is it used for?

This table is especially useful for databases that experience periods of low replication activity. A heartbeat table ensures the replication environment is exercised even when there are only a few replicated changes. It is also valuable as a quick check for the DBA that the Streams environment is functioning properly (simply query a date column in the heartbeat table on the destination).

By default, an Oracle Streams capture process requests a Streams checkpoint after every 10MB of generated redo. The Streams metadata is only maintained at that checkpoint if the transactions contain data relevant to Streams capture. Implementing a heartbeat table ensures that open, active 'Streams transactions' occur regularly in the source database, ensuring the Streams metadata is updated at each checkpoint.


The end result is that CAPTURED_SCN in the Streams metadata (DBA_CAPTURE) will be maintained and stay current with the redo being generated, even during typical periods of low replication activity. This ensures that the capture process will not fail on database startup following a quiet period (missing archive logs for an old SCN required by capture).
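A quick way to confirm the heartbeat is doing its job is to compare the capture metadata against the current SCN. The sketch below assumes the capture process name used in this post (CAPTURE_STREAM); run it on the source as the Streams administrator.

```sql
-- Sketch: verify that the capture metadata keeps moving forward.
SELECT capture_name,
       captured_scn,      -- most recent SCN captured
       applied_scn        -- oldest SCN still required from the redo/archive logs
  FROM dba_capture
 WHERE capture_name = 'CAPTURE_STREAM';

-- Compare against the current SCN of the database; with a working
-- heartbeat the gap should stay small even during quiet periods.
SELECT current_scn FROM v$database;
```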


What is involved in doing this?

A.- Create a table in your source database that has a date/timestamp column. Enable supplemental logging for the table and prepare the table for instantiation at the source. Export the table from the source and import it at the destinations with the instantiation option.

B.- Add a rule to capture changes for the heartbeat table at the source, and propagate the changes to the destination.


C.- Make sure that the target destination will apply changes to this table as well by adding a new apply rule.

D.- Set up an automated job to update this table periodically at the source site.



Let's do it

Assumptions: A Streams environment already exists. A capture process (CAPTURE_STREAM) on the source and an apply process (APPLY_STREAM) on the destinations exist. Queues (STREAMS_QUEUE) already exist on both source and destinations. Table-level replication is occurring. STREAMADM is the Streams administrator account on both source and destination. The capture and apply processes are stopped before adding rules and restarted when you are done.


(A)

On the source:

SQL> create table heartbeat (a number primary key, b date);

SQL> alter table heartbeat enable supplemental log data (all) columns;

As the Streams administrator:

SQL> exec dbms_capture_adm.prepare_table_instantiation(table_name => '<owner>.HEARTBEAT');

exp USERID=<user>/<password> FILE=strmuser.dmp TABLES=<owner>.HEARTBEAT OBJECT_CONSISTENT=y ROWS=y

On the destinations:

imp USERID=<user>/<password> FILE=strmuser.dmp TABLES=<owner>.HEARTBEAT COMMIT=y LOG=import.log STREAMS_INSTANTIATION=y GRANTS=n

(B)

On the source:

BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name      => '<owner>.heartbeat',
    streams_type    => 'CAPTURE',
    streams_name    => 'CAPTURE_STREAM',
    queue_name      => 'STREAMADM.STREAMS_QUEUE',
    include_dml     => true,
    include_ddl     => false,
    source_database => 'SOURCEDB');
END;
/

BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_PROPAGATION_RULES(
    table_name             => '<owner>.heartbeat',
    streams_name           => 'CAPTURE_STREAM',
    source_queue_name      => 'STREAMADM.STREAMS_QUEUE',
    destination_queue_name => 'STREAMADM.STREAMS_QUEUE@DESTDB',
    include_dml            => true,
    include_ddl            => false,
    source_database        => 'SOURCEDB');
END;
/

(C)

On the destinations:

BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name      => '<owner>.heartbeat',
    streams_type    => 'APPLY',
    streams_name    => 'APPLY_STREAM',
    queue_name      => 'STREAMADM.STREAMS_QUEUE',
    include_dml     => true,
    include_ddl     => false,
    source_database => 'SOURCEDB');
END;
/

(D)

On the source:

SQL> create sequence temp_seq start with 1;

SQL> variable jobno number;
SQL> begin
       dbms_job.submit(:jobno,
         'insert into heartbeat values (temp_seq.nextval, sysdate);',
         sysdate,
         'sysdate + 60/(60*60*24)');   -- run every 60 seconds
       commit;
     end;
     /
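Once the job is running, the day-to-day check (the whole point of the heartbeat) is a sketch like the following, run at the destination: if the date column stops advancing, some piece of the Streams path is broken.

```sql
-- On the destination: the latest heartbeat timestamp should be recent.
SELECT MAX(b) AS last_heartbeat,
       (SYSDATE - MAX(b)) * 24 * 60 AS minutes_behind
  FROM heartbeat;
```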


Monday, May 5, 2008

The new Oracle scheduler and how it works

Traditionally (before Oracle 10g) you used dbms_job.submit to create a job, passing as parameters the job number, what you want to be run, the start date, and the interval. Simple, but if you ask me, not much flexibility.

Typical example: exec dbms_job.submit(:jobno, 'begin procedure_name; end;', SYSDATE, 'SYSDATE + 36/86400'); (every 36 seconds)

In 10g and above Oracle provides greater flexibility and, in my opinion, clarity; it used to be a pain trying to come up with the right interval formulas for odd intervals (for example, 3/1440 means execute every 3 minutes, but at first glance that is not at all clear).

Now in 10g Oracle splits the process into three steps to provide flexibility: first you create a schedule, next you create a program, and then you create a job that ties the schedule and program together. This allows you to reuse the schedule or the program with other jobs.

NOTE: The examples below are for illustration purposes only; in most cases you will need to complete and adapt the syntax to what you want to do.

Example of creating a schedule:

begin
  sys.dbms_scheduler.create_schedule(
    repeat_interval => 'FREQ=WEEKLY;BYDAY=WED;BYHOUR=8;BYMINUTE=0;BYSECOND=0',
    start_date      => to_timestamp_tz('2008-05-06 US/Eastern','YYYY-MM-DD TZR'),
    comments        => 'Wednesday AM Schedule',
    schedule_name   => '"RORTA"."WED_AM"');
end;
/

The schedule above is created in the RORTA schema with the name WED_AM, to be executed every Wednesday at 8 AM.

Creating a program:

First you create the program with dbms_scheduler.create_program:

begin
  dbms_scheduler.create_program (
    program_name        => 'RORTA.WEEKLY_CHECK'
   ,program_type        => 'STORED_PROCEDURE'
   ,program_action      => 'package.procedure'
   ,number_of_arguments => 1
   ,enabled             => FALSE
   ,comments            => 'comments');
end;
/

You can pass arguments to your program; see below:

begin
  dbms_scheduler.define_program_argument (
    program_name      => 'RORTA.WEEKLY_CHECK'
   ,argument_position => 1
   ,argument_name     => 'kol1'
   ,argument_type     => 'VARCHAR2'
   ,default_value     => 'default');
end;
/

Now you enable the program

exec dbms_scheduler.enable(name => 'RORTA.WEEKLY_CHECK');

Creating the job:

A job is a combination of a schedule and a program. Look at the example below:

begin
  sys.dbms_scheduler.create_job(
    job_name      => 'RORTA.WEEKLY_CHECK',
    program_name  => 'RORTA.WEEKLY_CHECK',
    schedule_name => 'RORTA.WED_AM',
    comments      => 'Check the database health',
    auto_drop     => FALSE,
    enabled       => TRUE);
end;
/

In addition, your job can execute an external executable, as follows:

begin
  dbms_scheduler.create_job (
    job_name      => 'RUN_SHELL',
    schedule_name => 'SHELL_SCHEDULE',
    job_type      => 'EXECUTABLE',
    job_action    => '/home/oracle/script.sh',
    enabled       => true,
    comments      => 'Shell-script' );
end;
/

Use the following views to monitor your jobs:

- dba_scheduler_jobs
- dba_scheduler_running_jobs
- dba_scheduler_job_log
- dba_scheduler_job_run_details
- dba_scheduler_schedules
- dba_scheduler_programs
- dba_scheduler_program_args
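As an example of using these views, here is a sketch of a query against dba_scheduler_job_run_details to review a job's recent runs (the job name is the one created above):

```sql
-- Recent runs of a job, newest first: outcome, start time, and duration.
SELECT job_name,
       status,
       actual_start_date,
       run_duration,
       error#
  FROM dba_scheduler_job_run_details
 WHERE job_name = 'WEEKLY_CHECK'
 ORDER BY actual_start_date DESC;
```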

Wednesday, April 30, 2008

How to send html emails from Linux / Unix using mutt

Hello Everybody

We recently created a new database health check for one of our customers as part of the DMDRS (Database Monitor Diagnostic & Resolution Services) service. It is a SQL*Plus HTML report that is executed automatically through a cron job and then mailed out. The challenge was to display the email in the client as an HTML page. As usual I did my research on the internet and found tons of answers, most of them involving a fair amount of work and new software installs. In general, IT people don't like HTML email because it poses a security risk and consumes several times the size of a regular email. In my case it was a marketing decision: the president of this company wants to see this email and I wanted to impress him, so HTML does the trick. Yes, I could send the HTML report as an attachment, but I wanted to make it easier for my customer (one click instead of two).

I was going crazy trying different options; there is plenty of information on the internet, but none of the answers worked for me. I sent an email out to my network asking if they had done this before and, to my surprise, got no answer. However, one of my friends, Michael Monostra, got it right after playing with it, so kudos to Mike for his help and support.

Before getting to it, I consider it important to explain a concept: what is MIME all about? We have all heard of it, we have seen the word in some of our emails, but how many of you really know what it is?

MIME, or Multipurpose Internet Mail Extensions, is in a nutshell a specification for the format of non-text email. Its main characteristic is the presence of the MIME header, which tells your MIME-compatible browser or email client that the message is not plain text, so it interprets the message in a different way.

So here is the way to make it work. We are going to use the "mutt" command line client to send the email; I found that mutt is available in most Unix and Linux versions and is installed by default.

1.- In your $HOME/.muttrc (the initialization file for mutt) add the following lines (this file likely does not exist yet; just create it):

set hostname = your_hostname
set realname = "The sender's name"

2.- Check that you have a MIME type like "text/html" associated with the extensions .html and .htm in either of these files: /etc/mime.types or $HOME/.mime.types.

3.- This is the actual command

mutt -e 'my_hdr Content-Type: text/html' -s "subject for the email" recipient@example.com < filename.html
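To automate the whole thing the way the post describes (report generated by cron, then mailed), a sketch of a crontab entry follows; the script name, paths, and recipient address are hypothetical placeholders:

```shell
# Hypothetical crontab entry: run the SQL*Plus HTML health check at 6:00 AM,
# then mail it inline with mutt. Adjust paths and addresses to your setup.
0 6 * * * /home/oracle/scripts/health_check.sh && mutt -e 'my_hdr Content-Type: text/html' -s "Daily database health check" dba-team@example.com < /home/oracle/reports/health_check.html
```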


NOTE: During my tests I found that some providers won't let you send email from a dynamic IP address by any method other than their SMTP server, so I was not able to send the email from home using Comcast and a non-business account. Some email servers will not deliver these emails and will bounce them back, because they do a reverse lookup based on the hostname of the sender, and if that is not set up properly the message will be blocked. During my tests I found that Yahoo and Gmail don't care and will accept the emails.

I hope this is helpful to you.

Rafael Orta

Saturday, March 22, 2008

To my incredible smart nephew Alex Sugar

Alex, it never stops amazing me how smart you are; you have a brilliant career in front of you. Go for it, champ! Here are the answers to some of your questions in case you forget. You can always write to me at Rafael.Orta@oraprofessionals.com with any other questions you may have.

To my readers: Alex is a fifth grader, an eleven-year-old kid halfway to being a Visual Basic programmer; hopefully I can motivate him to get into the Java / Oracle duo. Alex has his mind set on being a game programmer; no doubt he will reach his goal!


Integer: The integers are the set of numbers consisting of the natural numbers and their negatives. They are numbers that can be written without a fractional or decimal component, and fall within the set {... −2, −1, 0, 1, 2, ...}. For example, 65, 7, and −756 are integers; 1.6 and 1½ are not integers. In other terms, integers are the numbers you can count with items such as apples or your fingers, and their negatives, including 0.

Short Integer: a short integer is a data type that can represent a positive or negative whole number whose range is less than or equal to that of a standard integer on the same machine. Although there is no global standard, it is common for a short integer to either be exactly half the size, or the same size as a standard integer (in the same context). In this latter case, use of the word 'short' is technically redundant, but may be used to indicate that it is not a long integer.

A variable defined as a short integer in one programming language may be different in size to a similarly defined variable in another. In some languages this size is fixed across platforms, while in others it is machine dependent. In some languages this data type does not exist at all.


Long Integer: a data type that can represent a positive or negative whole number whose range is greater than or equal to that of a standard integer on the same machine.

In practice it is usual for a long integer to require double the storage capacity of a standard integer, although this is not always the case.

A variable defined as a long integer in one programming language may be different in size to a similarly defined variable in another. In some languages this size is fixed across platforms, in others it is machine dependent. In some languages this data type does not exist at all.

Real: This is a data type used by computer programs to represent an approximation of a real number. Because real numbers cannot be represented exactly using a finite amount of information, the computer most often uses a reasonable approximation.

Blobs: A binary large object, also known as a blob, is a collection of binary data stored as a single entity in a database management system. Blobs are typically images, audio or other multimedia objects, though sometimes binary executable code is stored as a blob. Database support for blobs is not universal.

The data type was introduced to describe data not originally defined in traditional computer database systems, and became practical when disk space became cheap.

Fixed-point arithmetic: This is also a real (numbers with decimals) data type, for a number that has a fixed number of digits after the decimal point (and sometimes also before it). Fixed-point values are useful for representing fractional values, usually in base 2 (binary) or base 10.


Floating point: To make it easy for you, let's say that floating point is the way the computer represents numbers that contain decimals. In other words, floating point describes a numerical representation system in which a string of digits represents a real number.

The advantage of floating-point representation over fixed point (or integer) representation is that it can support a much wider range of values. For example, a fixed-point representation that has eight decimal digits, with the decimal point assumed to be positioned after the sixth digit, can represent the numbers 123456.78, 8765.43, 123.00, and so on, whereas a floating-point representation with eight decimal digits could also represent 1.2345678, 1234567.8, 0.000012345678, 12345678000000000, and so on.


Single Precision Floating Point: Single-precision values can contain decimal points and have a range of +/- 8.43*10^-37 to 3.40*10^38.

While Single-precision numbers can represent both enormous and microscopic values, they are limited to six digits of precision. In other words, Single-precision does a good job with figures like $451.21 and $6,411.92, but $671,421.22 cannot be represented exactly because it contains too many digits. Neither can 234.56789 or 0.00123456789. A Single-precision representation will come as close as it can in six digits: $671,421, or 234.568, or 0.00123457. Depending on your application, this rounding off can be a trivial or crippling deficiency.
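To see this rounding in action, here is a small sketch (in Python rather than Visual Basic, simply because it is easy to run anywhere) that squeezes a number through a 32-bit single-precision float using the standard struct module:

```python
import struct

def to_single(x):
    """Round-trip a number through a 32-bit IEEE 754 single-precision float."""
    return struct.unpack('f', struct.pack('f', x))[0]

value = 234.56789                  # needs 8 significant digits
single = to_single(value)          # only six or seven digits survive

print(single)                      # close to 234.56789, but not exactly equal
print(single == value)             # False: the extra digits were rounded away
print(to_single(234.5) == 234.5)   # True: a short value fits exactly in a single
```

A double (Python's native float) keeps all the digits of 234.56789, which is exactly the single-versus-double trade-off described above.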

Double Precision Floating Point: Double-precision floating-point numbers are to single-precision numbers what long integers are to integers. They take twice as much space in memory (8 bytes versus 4), but have a greater range (+/- 4.19*10^-307 to 1.79*10^308) and greater accuracy (15 to 16 digits of precision versus 6 for single precision). A double-precision, 5,000-element array requires 40,000 bytes; an integer array with the same number of elements occupies only 10,000 bytes.

Short floating-point: the representation with the smallest fixed precision provided by an implementation.

Long floating-point: the representation with the largest fixed precision provided by an implementation.

Intermediate: between the short and long formats are two others, arbitrarily called single and double.

The precise definition of these categories is implementation-dependent. However, the rough intent is that short floating-point numbers be precise to at least four decimal places (but also have a space-efficient representation); single floating-point numbers, to at least seven decimal places; and double floating-point numbers, to at least fourteen decimal places.

Source: different websites over the internet but mostly Wikipedia.

Thursday, March 6, 2008

EZ Guide to Oracle Data Pump

Oracle Data Pump (DP) was introduced in 10g to replace the legendary export/import utilities (still available). Although Data Pump takes more time to start, it is more flexible, more manageable, and provides greater performance.

Data Pump consists of the following components:

Data Pump API: DBMS_DATAPUMP (creates and monitors DP Jobs).
Metadata API : DBMS_METADATA (Provides objects definitions to DP).
Client Tool: expdp and impdp (Makes calls to DBMS_DATAPUMP).
Data movement API: Uses direct path API (DPAPI) to move data when possible.

The first thing you need to do is set up a location (an Oracle directory) where the DP files will be stored.

SQL> create directory dumplocation as '/oradata/dumpfiles';
SQL> grant read, write on directory dumplocation to scott;

In addition, a default directory can be created for Data Pump operations in the database as follows (once it is created, privileged users with the EXP_FULL_DATABASE or IMP_FULL_DATABASE privilege do not need to specify a directory):

SQL> create directory DATA_PUMP_DIR as '/oradata/dumpfiles';

Note: The name of the directory must be DATA_PUMP_DIR.

There are 5 different types of DP exports and imports:

  1. Database (The default performed by specifying the FULL=Y parameter)
  2. Tablespaces (Performed by specifying the TABLESPACES parameter)
  3. Schema (Performed by specifying the SCHEMAS parameter)
  4. Table (Performed by specifying the TABLES parameter)
  5. Transportable Tablespace (Performed by specifying the TRANSPORT_TABLESPACES parameter)
EXPDP:

$ expdp directory=dumplocation
..
..
Username: scott/tiger
..
..
Job "SCOTT"."SYS_EXPORT_SCHEMA_01" successfully completed at 14:49
$

Since we did not specify any other parameters, expdp used default values for the file names (expdat.dmp, export.log), performed a schema-level export (of the login schema), calculated the job estimate using the blocks method, used the default job name (SYS_EXPORT_SCHEMA_01), and exported both data and metadata.

List of the expdp parameters


Keyword Description (Default)
------------------------------------------------------------------------------
ATTACH Attach to existing job, e.g. ATTACH [=job name].
COMPRESSION Reduce size of dumpfile contents where valid
keyword values are: (METADATA_ONLY) and NONE.
CONTENT Specifies data to unload where the valid keywords are:
(ALL), DATA_ONLY, and METADATA_ONLY.
DIRECTORY Directory object to be used for dumpfiles and logfiles.
DUMPFILE List of destination dump files (expdat.dmp),
e.g. DUMPFILE=scott1.dmp, scott2.dmp, dmpdir:scott3.dmp.
ENCRYPTION_PASSWORD Password key for creating encrypted column data.
ESTIMATE Calculate job estimates where the valid keywords are:
(BLOCKS) and STATISTICS.
ESTIMATE_ONLY Calculate job estimates without performing the export.
EXCLUDE Exclude specific object types, e.g. EXCLUDE=TABLE:EMP.
FILESIZE Specify the size of each dumpfile in units of bytes.
FLASHBACK_SCN SCN used to set session snapshot back to.
FLASHBACK_TIME Time used to get the SCN closest to the specified time.
FULL Export entire database (N).
HELP Display Help messages (N).
INCLUDE Include specific object types, e.g. INCLUDE=TABLE_DATA.
JOB_NAME Name of export job to create.
LOGFILE Log file name (export.log).
NETWORK_LINK Name of remote database link to the source system.
NOLOGFILE Do not write logfile (N).
PARALLEL Change the number of active workers for current job.
PARFILE Specify parameter file.
QUERY Predicate clause used to export a subset of a table.
SAMPLE Percentage of data to be exported;
SCHEMAS List of schemas to export (login schema).
STATUS Frequency (secs) job status is to be monitored where
the default (0) will show new status when available.
TABLES Identifies a list of tables to export - one schema only.
TABLESPACES Identifies a list of tablespaces to export.
TRANSPORT_FULL_CHECK Verify storage segments of all tables (N).
TRANSPORT_TABLESPACES List of tablespaces from which metadata will be unloaded.
VERSION Version of objects to export where valid keywords are:
(COMPATIBLE), LATEST, or any valid database version.

The following commands are valid while in interactive mode.
Note: abbreviations are allowed

Command Description
------------------------------------------------------------------------------
ADD_FILE Add dumpfile to dumpfile set.
CONTINUE_CLIENT Return to logging mode. Job will be re-started if idle.
EXIT_CLIENT Quit client session and leave job running.
FILESIZE Default filesize (bytes) for subsequent ADD_FILE commands.
HELP Summarize interactive commands.
KILL_JOB Detach and delete job.
PARALLEL Change the number of active workers for current job.
PARALLEL=<number of workers>.
START_JOB Start/resume current job.
STATUS Frequency (secs) job status is to be monitored where
the default (0) will show new status when available.
STATUS[=interval]
STOP_JOB Orderly shutdown of job execution and exits the client.
STOP_JOB=IMMEDIATE performs an immediate shutdown of the
Data Pump job.

IMPDP

$ impdp dumpfile=expdat.dmp
..
username: scott/tiger
..
Job "SCOTT"."SYS_IMPORT_FULL_01" successfully completed at 07:09

List of impdp Parameters

Keyword Description (Default)
------------------------------------------------------------------------------
ATTACH Attach to existing job, e.g. ATTACH [=job name].
CONTENT Specifies data to load where the valid keywords are:
(ALL), DATA_ONLY, and METADATA_ONLY.
DIRECTORY Directory object to be used for dump, log, and sql files.
DUMPFILE List of dumpfiles to import from (expdat.dmp),
e.g. DUMPFILE=scott1.dmp, scott2.dmp, dmpdir:scott3.dmp.
ENCRYPTION_PASSWORD Password key for accessing encrypted column data.
This parameter is not valid for network import jobs.
ESTIMATE Calculate job estimates where the valid keywords are:
(BLOCKS) and STATISTICS.
EXCLUDE Exclude specific object types, e.g. EXCLUDE=TABLE:EMP.
FLASHBACK_SCN SCN used to set session snapshot back to.
FLASHBACK_TIME Time used to get the SCN closest to the specified time.
FULL Import everything from source (Y).
HELP Display help messages (N).
INCLUDE Include specific object types, e.g. INCLUDE=TABLE_DATA.
JOB_NAME Name of import job to create.
LOGFILE Log file name (import.log).
NETWORK_LINK Name of remote database link to the source system.
NOLOGFILE Do not write logfile.
PARALLEL Change the number of active workers for current job.
PARFILE Specify parameter file.
QUERY Predicate clause used to import a subset of a table.
REMAP_DATAFILE Redefine datafile references in all DDL statements.
REMAP_SCHEMA Objects from one schema are loaded into another schema.
REMAP_TABLESPACE Tablespace object are remapped to another tablespace.
REUSE_DATAFILES Tablespace will be initialized if it already exists (N).
SCHEMAS List of schemas to import.
SKIP_UNUSABLE_INDEXES Skip indexes that were set to the Index Unusable state.
SQLFILE Write all the SQL DDL to a specified file.
STATUS Frequency (secs) job status is to be monitored where
the default (0) will show new status when available.
STREAMS_CONFIGURATION Enable the loading of Streams metadata
TABLE_EXISTS_ACTION Action to take if imported object already exists.
Valid keywords: (SKIP), APPEND, REPLACE and TRUNCATE.
TABLES Identifies a list of tables to import.
TABLESPACES Identifies a list of tablespaces to import.
TRANSFORM Metadata transform to apply to applicable objects.
Valid transform keywords: SEGMENT_ATTRIBUTES, STORAGE
OID, and PCTSPACE.
TRANSPORT_DATAFILES List of datafiles to be imported by transportable mode.
TRANSPORT_FULL_CHECK Verify storage segments of all tables (N).
TRANSPORT_TABLESPACES List of tablespaces from which metadata will be loaded.
Only valid in NETWORK_LINK mode import operations.
VERSION Version of objects to export where valid keywords are:
(COMPATIBLE), LATEST, or any valid database version.
Only valid for NETWORK_LINK and SQLFILE.

Table Exports/Imports

The TABLES parameter is used to specify the tables that are to be exported. The following is an example of the table export and import syntax:
expdp scott/tiger@db10g tables=EMP,DEPT directory=TEST_DIR dumpfile=EMP_DEPT.dmp logfile=expdpEMP_DEPT.log

impdp scott/tiger@db10g tables=EMP,DEPT directory=TEST_DIR dumpfile=EMP_DEPT.dmp logfile=impdpEMP_DEPT.log

The TABLE_EXISTS_ACTION=APPEND parameter allows data to be imported into existing tables.

Schema Exports/Imports

The OWNER parameter of exp has been replaced by the SCHEMAS parameter which is used to specify the schemas to be exported. The following is an example of the schema export and import syntax:
expdp scott/tiger@db10g schemas=SCOTT directory=TEST_DIR dumpfile=SCOTT.dmp logfile=expdpSCOTT.log

impdp scott/tiger@db10g schemas=SCOTT directory=TEST_DIR dumpfile=SCOTT.dmp logfile=impdpSCOTT.log

Database Exports/Imports

The FULL parameter indicates that a complete database export is required. The following is an example of the full database export and import syntax:
expdp system/password@db10g full=Y directory=TEST_DIR dumpfile=DB10G.dmp logfile=expdpDB10G.log

impdp system/password@db10g full=Y directory=TEST_DIR dumpfile=DB10G.dmp logfile=impdpDB10G.log

I was debating whether or not to add this to the document, because this is supposed to be an EZ guide, but since these are my two favorite Data Pump features I decided to include them.

A few years ago you needed to wait for a long import/export to finish before going home, logging out of your account, or turning off your desktop. Or you simply got a power failure while the import was taking place and had to clean up and start all over (I have been there and done that; it is very frustrating).

Attach and Detach from a session

The first thing you need to know is that the core of a DP import or export operation takes place in a database job, so it does not really depend on your client connection. Cool, isn't it? In other words, your job will continue to run even if the client disconnects.

DP works in two modes: logging mode, similar to the old import/export, and interactive mode. To enter interactive mode, just press Control-C while the import or export is taking place.

In the example below I pressed Control-C as soon as the export started.

oracle@mtl01dbatlora03:/opt/oracle/product/10.2.0.1/rdbms/log> expdp system/password

Export: Release 10.2.0.3.0 - Production on Friday, 07 March, 2008 11:31:24

Copyright (c) 2003, 2005, Oracle. All rights reserved.

Connected to: Oracle Database 10g Release 10.2.0.3.0 - Production
Starting "SYSTEM"."SYS_EXPORT_SCHEMA_01": system/********
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA

Export> status

Job: SYS_EXPORT_SCHEMA_01
Operation: EXPORT
Mode: SCHEMA
State: EXECUTING
Bytes Processed: 0
Current Parallelism: 1
Job Error Count: 0
Dump File: /opt/oracle/product/10.2.0.1/rdbms/log/expdat.dmp
bytes written: 4,096

Worker 1 Status:
State: EXECUTING
Object Schema: SYSTEM
Object Name: TEMPLATE$_TARGETS_S
Object Type: SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Completed Objects: 18
Total Objects: 18
Worker Parallelism: 1

Export> exit_client

Then to re-attach, all I have to do is:

expdp system/password attach=SYS_EXPORT_SCHEMA_01

export>

and I will be on my way. If I decide to continue in logging mode, all I need to do is:

export> continue_client

The other cool feature of Data Pump is that in most cases you can stop the job, do anything you forgot in the database (like creating a specific tablespace), and then restart it.

Below is a table of some of the commands you can execute while in interactive mode.

Command Description
------------------------------------------------------------------------------
ADD_FILE Add dumpfile to dumpfile set.
CONTINUE_CLIENT Return to logging mode.
EXIT_CLIENT Quit client session and leave job running.
KILL_JOB Detach and delete job.
PARALLEL Change the number of active workers for current job.
START_JOB Start/resume current job.
STATUS[=interval] Show job status at the given interval.
STOP_JOB[=IMMEDIATE] Orderly (or immediate) shutdown of job execution.
Wednesday, January 30, 2008

PHOUG Presentation by Tim Gorman

I had the opportunity tonight to attend the PHOUG meeting, which included a presentation by Tim Gorman about "Scaling to Infinity: Partitioning Data Warehousing in Oracle".

I have to be honest, this is the first time I had heard of Tim, but he really surprised me. What a great presentation and presenter! I can see Tim has a lot of data warehousing experience and enjoys sharing his knowledge. I will try to summarize below the points that Tim covered.

Tim sees three major errors in DW design:

A) Ignore the basic requirements of a DW and design what is familiar.
B) Fail to portray data changes over time.
C) Fail to Utilize partitioning from the beginning.

- Reporting and analysis applications are responsible for presenting data in the format that works best for end users and their query/analysis tools (very often what end users want is a simple spreadsheet, hundreds of columns wide).

- Reporting and analysis applications do not enforce business rules.

- Do not build a data model to enforce referential integrity and/or business rules. Normalization is intended for use in process-oriented operational systems, not in a DW.

- Provide very simple data representation, with one degree of normalization for flexibility.

Time-variant data, who cares?

There are two major types of queries from business intelligence applications against data warehouse databases: point-in-time ("what is the present situation?") and trend analysis ("how do things look now versus 3 months ago?"). Dimension tables are usually designed to be point-in-time, while fact tables lean toward trend analysis.

Every Data Warehouse has at least one slowly-changing dimension usually involving people.

The Virtuous Cycle

Using EXCHANGE PARTITION for loads enables:
- Elimination of ETL load windows and 24x7 availability for queries.
- Direct path loads
- Bitmap indexes and bitmap-join indexes.

Exchange partition is the basic technique of bulk-loading new data into a temporary "load table", which is then indexed, analyzed, and then "published" all at once to the end users with the EXCHANGE PARTITION operation.
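A minimal SQL sketch of that load technique follows; the table, partition, and index names are made up for illustration:

```sql
-- 1. Bulk-load the new data into a standalone "load table" that matches
--    the partitioned fact table's structure (direct-path load).
INSERT /*+ APPEND */ INTO sales_load
  SELECT * FROM staging_feed;
COMMIT;

-- 2. Index and analyze the load table while end users keep querying the
--    fact table undisturbed.
CREATE BITMAP INDEX sales_load_cust_bx ON sales_load (customer_id);
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'SALES_LOAD');

-- 3. "Publish" everything at once: swap the load table in as the new
--    partition. This is a data-dictionary operation, not a data copy.
ALTER TABLE sales_fact
  EXCHANGE PARTITION p20080322 WITH TABLE sales_load
  INCLUDING INDEXES WITHOUT VALIDATION;
```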

Final Recommendations

1.- Use dimensional data models for the "presentation" to end users.
2.- Base the database design on time-variant data structures.
3.- Use partitioning

Friday, January 18, 2008

Notes from Oracle RAC Customer Summit

Hello Everybody

I had the opportunity this week to attend a Real Application Clusters customer summit at the new Oracle office in Bridgewater, NJ. In the meeting were one of the product managers for Real Application Clusters, members of the RACPACK, and representatives from Oracle Consulting.

It was a very interesting meeting, where Oracle was basically asking its non-RAC customers "what can we do for you to start using it?" and its current customers "what can we do for you to be more comfortable using it?"

They emphasized virtualization and hardware becoming cheaper, and the different ways you can virtualize: server splitting (VMware) or server aggregation (grid). They went through the advantages and disadvantages of each one. In summary, server splitting does not offer business continuity, scalability across servers, good performance (high overhead), or a reduction in the management burden, and with hardware getting cheaper and more powerful it is almost a no-brainer to use grid.

However, server-split virtualization has its niche. It is good for systems with smaller workloads, test and development environments, and non-critical applications where you cannot justify the price tag involved with using RAC, while grid works best for larger workloads and business-critical applications.

Grid aims to provide dynamic resource provisioning and automation. For Oracle, grid is evolving: the first version (Grid 1.0) is the typical two-node cluster database. Once customers feel comfortable with it and see its potential, they want to leverage the full power of grid and evolve into what Oracle calls Grid 2.0. Oracle is looking into incorporating policy-based service-level automation into it.

In the meeting Oracle also spoke about ASM and Clusterware giving us some tips:

- They said that SAN-based redundancy is commonly used, but when it is not available ASM redundancy is a good option.
- They clarified that, contrary to what people think, ASM striping in conjunction with SAN striping does not hurt performance.
- They said there is a benefit in using ASM redundancy with extended clusters (clusters separated geographically).
- When using ASM, two different LUNs should not map back to the same spindle.
- LUN provisioning should be done in a manner complementary to ASM.
- Redundancy for the private network is highly recommended; most customers use a pair of dedicated switches for it.
- Jumbo frames can be used to reduce private-network traffic and CPU overhead (raising the MTU from 1,500 bytes to about 9,000 bytes).
- They recommend using workload management by service and partitioning similar workloads by service.
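As one way to sketch the last tip, a dedicated service can be created for a particular workload (the service name here is hypothetical; on RAC, srvctl is the usual tool since it also records preferred and available instances):

```sql
-- Hypothetical example: a separate service for batch workloads.
BEGIN
  DBMS_SERVICE.CREATE_SERVICE(service_name => 'batch_svc',
                              network_name => 'batch_svc');
  DBMS_SERVICE.START_SERVICE('batch_svc');
END;
/
-- Batch jobs then connect through BATCH_SVC, keeping that workload
-- identifiable and manageable separately from OLTP sessions.
```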




Tuesday, January 15, 2008

Some Oracle 10g Tuning tips


Some Oracle 10g Tuning tips

AWR

In Oracle 10g the Automatic Workload Repository (AWR) is the primary component of the Common Manageability Infrastructure (CMI). It is the successor to Statspack. While Statspack is still available in 10g, only a few improvements are planned for it; therefore AWR is the preferred mechanism for diagnosing database problems.

AWR provides data used by other CMI components, such as system-generated alerts and the advisors. AWR consists of two main components:

- In-memory statistics

- Repository snapshots.

In-memory statistics are gathered once a second on active sessions. They are not written to the database and are aged out of memory as new statistics are generated. In-memory statistics are accessed through the view V$ACTIVE_SESSION_HISTORY, which queries the ASH (Active Session History) circular buffer area of the SGA (System Global Area).

AWR relies on the MMON background process, which wakes up every hour and writes the collected statistics into a repository snapshot (you can modify the collection interval). AWR snapshots provide a persistent view of database statistics.

A script is provided ($ORACLE_HOME/rdbms/admin/awrrpt.sql) to generate a report from repository snapshots. There is also an awrrpti.sql report, which has essentially the same output but lets you specify the database and instance to report on.

ADDM

The Automatic Database Diagnostic Monitor (ADDM) uses AWR snapshot information to automatically identify performance problems and make recommendations to correct them. ADDM is the successor to the Oracle Expert tool. ADDM functionality is embedded in the kernel, making its performance impact nearly negligible. An ADDM analysis is performed every time an AWR snapshot is taken. In addition, ADDM can automatically fix certain problems.

To allow use of ADDM, a PL/SQL interface called DBMS_ADVISOR has been provided. This interface may be called directly through the supplied script $ORACLE_HOME/rdbms/admin/addmrpt.sql. Besides this PL/SQL package a number of views allow you to retrieve the results of any actions performed with the DBMS_ADVISOR API.

When the parameter STATISTICS_LEVEL is set to TYPICAL or ALL, the database automatically takes an AWR snapshot every 60 minutes. To create snapshots outside this interval, use the DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT procedure.

To be useful in diagnosing a particular problem, snapshots need to be created before and after the situation you wish to examine.
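In practice that looks something like the following (a minimal sketch; the comments mark where your workload runs):

```sql
-- Take a snapshot immediately before and after the workload of interest,
-- so the reporting interval brackets exactly that activity.
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();
-- ... run the problem workload here ...
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();

-- Find the snapshot IDs to feed to awrrpt.sql or addmrpt.sql:
SELECT snap_id, begin_interval_time
FROM   dba_hist_snapshot
ORDER  BY snap_id;
```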

The following views are available for querying ADDM information.

- DBA_ADVISOR_ACTIONS: Actions associated with all recommendations
- DBA_ADVISOR_COMMANDS: Commands used by the advisors for specifying recommended actions
- DBA_ADVISOR_DEF_PARAMETERS: Parameters and default values for all tasks in the database
- DBA_ADVISOR_DEFINITIONS: Properties of all the advisors in the database
- DBA_ADVISOR_DIRECTIVES: Not documented
- DBA_ADVISOR_FINDINGS: Findings discovered by all advisors
- DBA_ADVISOR_JOURNAL: Journal entries for all tasks
- DBA_ADVISOR_LOG: Current state of all tasks, along with execution-specific data such as progress monitoring and completion status
- DBA_ADVISOR_OBJECT_TYPES: Object types used by all advisors
- DBA_ADVISOR_OBJECTS: Objects currently referenced by all advisors
- DBA_ADVISOR_PARAMETERS: Parameters and their current values for all tasks
- DBA_ADVISOR_PARAMETERS_PROJ: Not documented
- DBA_ADVISOR_RATIONALE: Rationales for all recommendations
- DBA_ADVISOR_RECOMMENDATIONS: Results of completed diagnostic tasks, with action recommendations for the problems identified in each run
- DBA_ADVISOR_SQLA_REC_SUM: Recommendation rollup information for all workload objects after analysis
- DBA_ADVISOR_SQLA_WK_MAP: Workload references for all tasks
- DBA_ADVISOR_SQLA_WK_STMTS: Workload objects after analysis
- DBA_ADVISOR_SQLW_COLVOL: Not documented
- DBA_ADVISOR_SQLW_JOURNAL: Journal entries for all workloads
- DBA_ADVISOR_SQLW_PARAMETERS: Workload parameters and their current values
- DBA_ADVISOR_SQLW_STMTS: Rows that correspond to all statements in the workload
- DBA_ADVISOR_SQLW_SUM: Aggregate picture of all SQL workloads
- DBA_ADVISOR_SQLW_TABLES: Cross-references between workload statements and the tables referenced in them
- DBA_ADVISOR_SQLW_TABVOL: Not documented
- DBA_ADVISOR_SQLW_TEMPLATES: Aggregate picture of all SQL workload template objects
- DBA_ADVISOR_TASKS: Information about all tasks in the database
- DBA_ADVISOR_TEMPLATES: Information about all templates in the database
- DBA_ADVISOR_USAGE: Usage information for each type of advisor


SQL Tuning Advisor

This feature automates the entire SQL tuning process: it takes SQL statements as input and performs a complete analysis of them.

The advisor can be administered with the procedures in the DBMS_SQLTUNE package. To use the API, the user must be granted the DBA role, the SELECT_CATALOG_ROLE role, and the ADVISOR privilege.

Running the SQL Tuning Advisor through the DBMS_SQLTUNE package is a two-step process: first you create a tuning task, then you execute it. The CREATE_TUNING_TASK function returns the task name you provided or generates a unique one. You can use that task name to refer to the task when calling other APIs.
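The two steps can be sketched as follows (the statement, task name, and table are hypothetical examples):

```sql
-- Hypothetical example: tune one statement supplied as text.
DECLARE
  l_task_name VARCHAR2(30);
BEGIN
  -- Step 1: create the tuning task; the function returns the task name.
  l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
                   sql_text  => 'SELECT * FROM customers WHERE cust_id = 1234',
                   task_name => 'cust_tune_task');
  -- Step 2: execute the task.
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task_name);
END;
/

-- Review the findings and recommendations.
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('cust_tune_task') FROM dual;
```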

Check for chaining rows

To see if you have chaining problems, run the utlchain.sql script that Oracle provides in the $ORACLE_HOME/rdbms/admin directory (it creates the CHAINED_ROWS table). You should check for chaining on a weekly basis and fix any problems immediately.

ANALYZE TABLE customers LIST CHAINED ROWS;

Then run the following query against the CHAINED_ROWS table to check for chaining in the CUSTOMERS table:

SELECT head_rowid FROM chained_rows WHERE table_name = 'CUSTOMERS';

If no rows are returned, you don't have a chaining problem. If there is a chaining problem, the query will return the head_rowid of every chained row. You can also run a count(*) against the CHAINED_ROWS table to find the number of chained rows. In V$SYSSTAT, the "table fetch continued row" statistic is an indicator of chained rows as well.
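For a quick system-wide check without analyzing any table, the statistic can be read directly (each fetch counted here had to follow a chained or migrated row into another block):

```sql
SELECT name, value
FROM   v$sysstat
WHERE  name = 'table fetch continued row';
```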

To avoid row chaining, set PCTFREE correctly; the default is 10.

Increasing the log file size and LOG_CHECKPOINT_INTERVAL for speed

If you want to speed up large numbers of INSERTs, UPDATEs, and DELETEs, increase the sizes of your log files and make sure they are on the fastest disks. Previously, you could also increase LOG_CHECKPOINT_INTERVAL if it was set low enough to force a checkpoint before each log switch, but this parameter is being deprecated and now defaults to zero (which means checkpointing is driven by a redo log filling up).

Determining if redo log file size is a problem

There could be 2 problems with the size of the redo log files:

A) Batch jobs that do not have enough total redo space to complete, or that generate redo so fast that the online redo logs wrap (cycle through all the groups and start overwriting the first one) before they have been archived to the offline redo logs.

B) Very long-running jobs that spend a large amount of time switching online redo logs. Long-running jobs are often much faster when the entire job fits into a single online redo log. For an online transaction processing (OLTP) environment, smaller online redo logs are usually better. An optimal online redo log switch happens about every half hour (not counting the long-running batch jobs that shorten this interval).

Use the query below to determine whether there is a problem; it shows the time between log switches.

SELECT b.recid,
       TO_CHAR(b.first_time, 'DD-MON-YY HH:MI:SS') start_time,
       a.recid,
       TO_CHAR(a.first_time, 'DD-MON-YY HH:MI:SS') end_time,
       ROUND(((a.first_time - b.first_time) * 24) * 60, 2) minutes
FROM   v$log_history a, v$log_history b
WHERE  a.recid = b.recid + 1
ORDER  BY a.first_time;


Determining the size of your log files and checkpoint interval

SELECT a.member, b.*
FROM   v$logfile a, v$log b
WHERE  a.group# = b.group#;