Thursday, March 5, 2015

-- Oracle Compression Offerings --

I recently was approached by a consultant that was confused about the different compression offerings from Oracle, so I decided to write a high-level short summary to provide some clarity.

Oracle offers 3 types of Compression:

A) Basic Table and Index Compression (Good for Data Warehouses): This feature has many years in the market (at least 10 (Oracle 9i)) and it is part of Oracle Enterprise Edition (no additional license cost for you), when Oracle designed this feature the goal as the name of the feature says was to compress data to save space, however, what they found is that by compressing the data, and as a natural consequence you reduced the I/O, and if you add to it the fact that their compression / decompression algorithm is very light and simple (just consume negligent additional CPU amounts) the query performance often increases.

A.1) What are the disadvantages of it:  With this type of compression Oracle only compresses the data when you are inserting it into the table, and you need to insert it using “direct path insert”. The other disadvantage is that during updates Oracle needs to decompress the records and after updating them it does not compress it again, therefore you will end with a mix of compressed and uncompressed records. This type of compression is not good for OLTP databases, but ideal for Data Warehouses where you mostly do read-only transactions and you populate the data through an ETL tool doing the direct path insert.

A.2) What is direct path insert and what does this means: Direct path insert is a special mode if the insert where Oracle while inserting data will ignore the available free space on the table, in other words, it will not reuse the available free space on the table, it will do the insert after the existing data in the table, therefore wasting disk space, but increasing performance because it bypasses the buffer cache and writes directly to the data files.

A.3) What queries will see a performance advantage: Those queries that requires fetching millions of records and do aggregations where I/O is the bottleneck over tables that have a high compression ratio,  tables with small numbers of rows or small size will see adverse performance impact due to the CPU overhead.

A.4) What compression ratio can be achieved: It depends on the nature of the data, tables with duplicated data will achieve higher compression while tables with almost no data duplication will achieve less to none.

B) Oracle Advanced Compression (Available 11g and above) (a.k.a OLTP Compression): This is a licensed feature that can be implemented on top of Oracle Enterprise Edition only, it works similar to Basic table compression but it allows data to be compressed during all types of DML operations such as Insert or Updates, the compression algorithm is enhanced therefore reducing the overhead of write operations. The other advantage is that Oracle is able to read the rows without uncompressing them, therefore, there is almost no performance degradation in accessing compressed data, in fact in many cases you may see performance improvement due to the reduced I/O.


C) Hybrid Columnar compression (HCC): This is a feature available on Oracle Exadata, and recent rumors say that is available on the ODA X5-2. HCC uses an orchestration of Hardware and Software capabilities, it achieves higher compression rates than the other 2 methods, it has  4 different types of setting (Query high / Query Low/ Archive high / Archive low) that tells Oracle how you want  to balance the compression vs. performance impact; in other words the more compression the higher the CPU cost involved in performing operations, the lower compression the less CPU cost . Different from the other 2 methods this one stores the data in columnar format and that is the secret sauce for achieving higher compression, the disadvantage is that you could have adverse performance effects if you query more than one column at a time, although this is often offset by the use of the Exadata Smart scan feature. You still need to do bulk load of the information to maximize the compression ratio. Oracle includes an advisor that tells you what kind of compression you could expect to achieve.