Tuesday, November 13, 2007

2nd Day at Oracle OpenWorld

I am still trying to adjust to the jet lag; my internal clock is still in NJ, or in the Bahamas, where I was on vacation before getting here. Never mind.

My first session today was "Oracle Database 11g Business Intelligence and Data Warehousing" by Ray Roccaforte. Ray spoke about the exorbitant growth in data volume; he mentioned Yahoo has a DW database of 250TB. He said that due to the increase in the amount of data you need more intelligent tools to interpret that data, and that is where business intelligence tools come in to help. The Oracle BI solution differentiates itself from other solutions in that it performs the analytics inside the database, and is therefore faster.

Oracle provides:

Scalable Data Management

ASM - Partitioning - Parallel Operations - Aggregation Management - RAC

Analytic Platform

Multidimensional Calculations - Time Series - Forecasting - Statistics - Data Mining

ETL & Data Quality

Data Integration - Data Modeling - Metadata Management - Data Profiling - SOA


Oracle is launching a new initiative in partnership with some of the main hardware providers, like Dell, IBM and Sun: they are shipping the DW solution along with the hardware, perfectly tuned and ready to use out of the box.

In 11g there are performance enhancements for Oracle Partitioning, such as the interval partitioning that I mentioned yesterday. There are new types of composite partitioning (range-range, list-range, list-hash), you can also partition on virtual columns (on expression values), and you have reference partitioning (equi-partition a foreign key table with its primary key table). The advantage of interval partitioning is that it creates the partition on the fly when the first row is inserted, for example:

create table sh.sales (time_id date, amount_sold number)
partition by range (time_id) interval (numtoyminterval(1,'MONTH')) (partition p0 values less than (date '2007-01-01'));
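
Reference partitioning, also mentioned above, looks roughly like this (a sketch with made-up order tables):

create table orders (order_id number primary key, order_date date)
  partition by range (order_date) (partition p2007 values less than (date '2008-01-01'));
create table order_items
( order_id number not null, item_no number,
  constraint fk_oi_orders foreign key (order_id) references orders )
partition by reference (fk_oi_orders);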

Also in 11g there is a new feature called "SQL query result cache". I think I mentioned this in my previous post from yesterday, but basically you cache the results of a query and they can be reused by subsequent queries that can make use of them. For example, you may have an expensive GROUP BY as part of a query; the result of that GROUP BY can be reused by subsequent queries that may need it.
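
As I understand it, you ask for the caching with a hint, something along these lines (a quick sketch against the sample schema):

-- tell the optimizer to cache this result set; later queries
-- that need the same aggregation can read it from the cache
select /*+ result_cache */ prod_id, sum(amount_sold)
from sh.sales
group by prod_id;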

Also in 11g a single cube provides the equivalent of thousands of materialized views, and the 11g SQL query optimizer treats OLAP cubes as materialized views. What is not clear to me is whether I need to buy the OLAP option, or whether they are allowing me to create cubes as part of base EE.

My second session was "Oracle 11g Real Application Testing: Avoid performance regressions with SQL Performance Analyzer" by Benoit Dageville. Benoit mentioned that the new Real Application Testing (RAT) option has 2 components and is a package of Grid Control that you can access for an extra fee.

RAT:
- SQL Performance Analyzer
- Database Replay

You use SQL Performance Analyzer for unit testing; it helps you predict problems before they happen.

You use the Database Replay feature for comprehensive testing, replaying a real workload.

SQL Performance Analyzer offers automatic data capture with almost no overhead; it gives you the production SQL context and the complete SQL workload, as well as automatic analysis in minutes. You can later use automatic regression tuning to improve your SQL.

The cycle of RAT is as follows:

You capture your SQL in production -> transfer the SQL to test -> execute the SQL pre-change -> make the changes -> execute the replay -> compare performance -> apply the changes in production.
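
As a rough sketch of what the pre-change and post-change runs could look like with the DBMS_SQLPA package (the tuning set name prod_sts is just a placeholder, I have not tried this myself yet):

-- create an analysis task from a SQL tuning set captured in production
var tname varchar2(64)
exec :tname := dbms_sqlpa.create_analysis_task(sqlset_name => 'prod_sts');
-- run the workload before the change
exec dbms_sqlpa.execute_analysis_task(:tname, execution_type => 'TEST EXECUTE', execution_name => 'before_change');
-- make the change, run it again, then compare the two executions
exec dbms_sqlpa.execute_analysis_task(:tname, execution_type => 'TEST EXECUTE', execution_name => 'after_change');
exec dbms_sqlpa.execute_analysis_task(:tname, execution_type => 'COMPARE PERFORMANCE');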

Creating SQL plan baselines helps to revert to previously known stable plans. The SQL Tuning Advisor helps to explore better execution plans. You can leverage SQL Performance Analyzer (SPA) in 11g to determine how changes will affect you when you move to 11g, even if your database is a 10g database and you don't have an 11g database yet (optimizer upgrade simulation).
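
For reference, this is roughly how plans get captured into a baseline (a sketch; the tuning set name is again a placeholder):

-- capture plans automatically for repeating statements
alter system set optimizer_capture_sql_plan_baselines = true;
-- or load plans explicitly from a SQL tuning set into the SQL plan baseline
declare
  n pls_integer;
begin
  n := dbms_spm.load_plans_from_sqlset(sqlset_name => 'prod_sts');
end;
/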

My third session was "Oracle 11g Data Pump: Data compression, encryption and more" by Carol Palmer. Carol started by giving an overview of Data Pump, and then focused on the following new features in 11g:

- Data compression
- Dump file encryption
- Partition-based transportable tablespaces

She mentioned that the original import (imp) will still be around in future versions, but the export as we know it is no longer there in 11g; Oracle is pushing everybody to move to Data Pump. Data Pump import (impdp) is 15-40 times faster than the regular import, and Data Pump export (expdp) is twice as fast as the regular export.

The new data compression feature compresses data, metadata or both, depending on what you want. It is an inline operation, meaning that the data gets compressed and decompressed while you export or import, contrary to the traditional two-step extraction and compression. The speed of compression is as fast as traditional utilities like gzip, and the compression ratio is similar as well. Compression adds about a 10% overhead versus a non-compressed export and import.
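
On the command line it looks something like this (the directory and file names are placeholders):

expdp sh/password directory=dpump_dir dumpfile=sales_compressed.dmp tables=sales compression=all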

Data Pump encryption offers password-based encryption, you can use the Oracle wallet, or you can use both. It is also an inline operation and works in conjunction with Oracle Transparent Data Encryption; the default encryption algorithm is AES128.
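
A sketch of a password-based encrypted export (again, the names and the password are placeholders):

expdp sh/password directory=dpump_dir dumpfile=sales_enc.dmp tables=sales encryption=all encryption_mode=password encryption_password=MySecret123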

Carol briefly spoke about partition-based transportable tablespaces, without much detail. She also spoke about data transforms (remap_data): you basically create a function to obscure sensitive data, and the translation is done when you import into your test system. That way you can protect columns such as social security numbers, salaries, etc. from the developers or anyone with access to development. You can remap several columns over several tables at a time; the disadvantage I see is that you need to create a function for every column you want to remap.
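
For example, assuming a hypothetical scrub_pkg package with a mask_ssn function (the schema, table and column are made up as well):

impdp hr/password directory=dpump_dir dumpfile=hr.dmp tables=employees remap_data=hr.employees.ssn:hr.scrub_pkg.mask_ssn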

Carol also spoke about SecureFiles; they are basically the new LOBs, but with better performance. You can compress them, you can encrypt them, and they are supported by Data Pump.
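
A minimal sketch of the syntax as I understood it (the table and column names are made up):

create table documents (id number, doc clob)
  lob (doc) store as securefile (compress high);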

My last session, and my favorite of the day, was "The Top 10 New Features of Oracle 11g" by Thomas Kyte. Tom gave a preamble saying that it is very easy to see the data inside database files, and that you can find on sourceforge.com plenty of data unloaders that can read Oracle database files; that is why you need to encrypt the data.

1. - Encrypted tablespaces: Column encryption has been there since 10g R2; however, you could not use range scans, and you had primary/foreign key issues when one or the other was not encrypted. Also, sometimes you do not know which data to encrypt. Tablespace encryption solves all those limitations.
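
A sketch of the tablespace encryption syntax, assuming a wallet is already configured and open (the datafile path is a placeholder):

create tablespace secure_ts
  datafile '/u01/oradata/orcl/secure_ts01.dbf' size 100m
  encryption using 'AES256'
  default storage (encrypt);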

2. - Oracle 11g does more caching: you have client-side caching, the server result cache (a kind of just-in-time materialized view) and the PL/SQL function result cache. There is a result_cache hint that you can use to tell Oracle whether or not to cache the results. You can also ask for the result of a function to be cached by declaring the function with the result_cache clause.
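
A minimal sketch of a result-cached function (the table and names are made up):

create or replace function get_country_name (p_id number)
  return varchar2
  result_cache relies_on (countries)
is
  l_name varchar2(100);
begin
  -- the result for a given p_id is computed once and served from the cache afterwards
  select country_name into l_name from countries where country_id = p_id;
  return l_name;
end;
/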

3. - Standby just got better: logical standby has always been limited in data type support but always open for business, while the physical standby is easier but considered not helpful for day-to-day operations. Now the logical standby has XMLType support, and the physical standby can be used for destructive testing; in other words, you can open it, do the operations you need, then bring it back to the original state using Flashback and place it again in standby mode. Another improvement is that redo is continuously being applied even while you are querying the database. This is great because you can offload your reports to the standby database without impacting the synchronization with your primary.
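
My understanding is that 11g wraps that open-test-flashback cycle into a pair of commands on the physical standby (a sketch, I have not tried it):

-- open the physical standby read-write for destructive testing
alter database convert to snapshot standby;
-- ... run the tests ...
-- discard the test changes and resume applying redo from the primary
alter database convert to physical standby;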

4. - Real Application Testing: I mentioned this before; Tom did not say much about it other than that it has 2 components: capture workload and replay DB workload.

5. - Data Pump compression and encryption: Tom also mentioned the new partition options (none - departition - merge) and that you can reuse dump files (reuse_dumpfiles).

6. - Virtual columns: they are columns defined by expressions involving other columns in the table or constants.
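
A quick sketch (the table and columns are made up):

create table order_lines
( quantity   number,
  unit_price number,
  line_total as (quantity * unit_price) virtual
);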

7. - Partitioning just got better: Oracle now covers the full set of composite partitioning schemes, you can partition on virtual columns, you can partition by reference, and you can use interval partitioning, which was mentioned before.

8. - Long-awaited pivot columns: Oracle now implements a new clause called PIVOT to display pivoted columns.
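
A small sketch against the sample schema (the channel ids are just examples):

select *
from   (select channel_id, amount_sold from sh.sales)
pivot  (sum(amount_sold) for channel_id in (2 as partners, 3 as direct, 4 as internet));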

9. - Flashback Data Archive: For years customers have wanted to go back in time more than just a few hours or a day; because of the architecture of Flashback that was not possible (you can't keep years of undo online). Now with the new Flashback Data Archive you can do total data recall, or track it table by table.
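
A sketch of how a table gets pointed at an archive (the archive, tablespace and table names are placeholders):

create flashback archive orders_fba tablespace fba_ts retention 5 year;
alter table orders flashback archive orders_fba;
-- historical queries can then go back past the normal undo retention
select * from orders as of timestamp (systimestamp - interval '2' year);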

10. - Finer-grained dependency tracking: in 11g you have fewer object invalidations, even when you add to a procedure's specification, add/modify/drop a column, or change what a synonym points to.

11. - OLTP table compression: Oracle Advanced Compression now gets about a 3x compression ratio.
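
If I understood correctly, the OLTP-friendly form of the syntax in 11g looks something like this (the table is made up):

create table sales_history
( sale_id number, sale_date date, amount number )
compress for all operations;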

Personal comment: it seems that in 11g, rather than going crazy adding features that people may not use, Oracle took a step back and matured and improved what they already have, which I think is great.

More tomorrow...
