Optimizing Oracle SQL Queries: A Step-by-Step Guide

Understanding the Challenge

The Stack Overflow post discussed here asks how to fix and optimize an Oracle SQL query. The query builds a display string from a user's last name, name, and a date (datum_p), drawing on two sources: orders with header information (ord_odb_l) and distribution details (distrb_l). Rows from the first source should be used when they exist, with the second source as a fallback.

The Original Query

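The original query tries to use NVL to fall back from the first subquery to the second. The post attributes the resulting error to missing parentheses, but the underlying problem is that NVL is a scalar function taking two expressions, so it cannot wrap two multi-column, multi-row subqueries inside a FROM clause. The second subquery also refers to the alias cl, which is defined only in the first subquery, and ct_l appears in both FROM lists without any join condition.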

(select last_name||' '||name||'   '||to_char(datum_p,'DD.MM.YYYY') 
from 
  (
   NVL(
    (select co.kod_id, bu.last_name, bu.name, cl.datum_p, ctl.rid, md.short_name
     from ord_odb_l cl, ord_odb_o co, b_users bu, ct_l ctl, m_delivery md 
     where co.rid_o = cl.rid and bu.id = cl.operator and cl.delivery_place = md.rid 
     and cl.operator not IN (161,245,46,120,43,184) order by cl.datum_p desc),

     (select vo.kod_id, bu.last_name, bu.name, vl.datum_p, ctl.rid, md.short_name 
     from distrb_l vl, distrb_o vo, b_users bu, ct_l ctl, m_delivery md
     where vo.rid_o = vl.rid and bu.id = cl.operator and cl.delivery_place = md.rid 
     and cl.operator not IN (161,245,46,120,43,184) order by vl.datum_p desc)
      )
   ) aa 
where aa.kod_id = orders.SK_ID and aa.rid = orders.rid_ct_a 
and aa.short_name = orders.md and aa.datum_p > orders.datum_ok and rownum = 1
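
For contrast, here is a minimal sketch of what NVL is designed for: substituting one scalar value when another is NULL. It runs against DUAL and uses made-up literal values, not the tables from the post.

```sql
-- NVL(expr1, expr2) returns expr2 when expr1 is NULL, otherwise expr1.
select nvl(null, 'fallback')      as when_null,      -- 'fallback'
       nvl('primary', 'fallback') as when_not_null   -- 'primary'
  from dual;
```

Because both arguments are single values, NVL has no way to express "use this whole result set, or else that one", which is what the original query attempts.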

The Proposed Solution

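The proposed solution replaces NVL with UNION ALL. Each branch adds a constant src column (1 for ord_odb_l, 2 for distrb_l), and the combined rows are ordered by src and then by datum_p descending. Taking the first row therefore prefers a matching row from ord_odb_l and falls back to distrb_l only when no such row exists. UNION ALL alone does not set this priority; the src column and the ORDER BY do.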

(select last_name||' '||name||'   '||to_char(datum_p,'DD.MM.YYYY') 
  from (
    -- Branch 1: preferred source (src = 1)
    select 1 src, co.kod_id, bu.last_name, bu.name, cl.datum_p, ctl.rid, md.short_name
      from ord_odb_l  cl
      join ord_odb_o  co on co.rid_o = cl.rid 
      join b_users    bu on bu.id = cl.operator 
      cross join ct_l ctl
      join m_delivery md on cl.delivery_place = md.rid
      where cl.operator not IN (161,245,46,120,43,184) 
    union all 
    -- Branch 2: fallback source (src = 2). The alias cl is not in scope here,
    -- so vl is used instead; this assumes distrb_l has the same operator and
    -- delivery_place columns as ord_odb_l.
    select 2 src, vo.kod_id, bu.last_name, bu.name, vl.datum_p, ctl.rid, md.short_name 
      from distrb_l   vl
      join distrb_o   vo on vo.rid_o = vl.rid
      join b_users    bu on bu.id = vl.operator
      cross join ct_l ctl
      join m_delivery md on vl.delivery_place = md.rid
      where vl.operator not IN (161,245,46,120,43,184) 
      -- Applies to the whole UNION ALL: src 1 first, newest date first
      order by src, datum_p desc) aa 
  where aa.kod_id = orders.SK_ID and aa.rid = orders.rid_ct_a 
    and aa.short_name = orders.md and aa.datum_p > orders.datum_ok and rownum = 1)
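
The priority mechanism can be seen in isolation with a small self-contained sketch. The two CTEs below are stand-ins built from DUAL, with invented names (preferred_src, fallback_src) and invented values; they only mimic the shape of the two branches.

```sql
-- Toy data: kod_id 10 exists in both sources, kod_id 20 only in the fallback.
with preferred_src as (
  select 10 as kod_id, date '2021-01-05' as datum_p from dual
),
fallback_src as (
  select 10 as kod_id, date '2021-03-01' as datum_p from dual
  union all
  select 20 as kod_id, date '2021-02-01' as datum_p from dual
)
select kod_id, src, datum_p
  from (
    select 1 as src, kod_id, datum_p from preferred_src
    union all
    select 2 as src, kod_id, datum_p from fallback_src
    -- src decides the priority; datum_p only breaks ties within a source
    order by src, datum_p desc
  )
 where kod_id = 10      -- change to 20 to see the fallback branch win
   and rownum = 1;
```

For kod_id 10 the row from the preferred source is chosen even though the fallback row has a later date; for kod_id 20 only the fallback row exists, so that one is returned.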

Optimizing the Query

The proposed solution is a good starting point. A few further points are worth checking to improve performance.

Joining Tables

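As noted above, the src column gives rows from ord_odb_l (orders with header information) priority over rows from distrb_l (distribution details). Whether that priority matches the intended business rule should be confirmed against the schema. Note also that ct_l is cross-joined in both branches with no join condition, so every ct_l row is paired with every joined row until the outer filter aa.rid = orders.rid_ct_a is applied. If ct_l is related to the other tables through a key, an explicit join condition would avoid building that Cartesian product.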

Using FETCH FIRST ROW ONLY

Since Oracle 12c, the row-limiting clause FETCH FIRST 1 ROW ONLY can replace the rownum = 1 filter. It is applied after the ORDER BY of the same query block, so the ordering and the limit are stated together instead of relying on an ordered inline view. Whether it is faster than the rownum form depends on the execution plan, so the gain is mainly in readability.

select last_name||' '||name||'   '||to_char(datum_p,'DD.MM.YYYY') 
  from (
    select 1 src, co.kod_id, bu.last_name, bu.name, cl.datum_p, ctl.rid, md.short_name
      from ord_odb_l  cl
      join ord_odb_o  co on co.rid_o = cl.rid 
      join b_users    bu on bu.id = cl.operator 
      cross join ct_l ctl
      join m_delivery md on cl.delivery_place = md.rid
      where cl.operator not IN (161,245,46,120,43,184) 
    union all 
    -- vl replaces the out-of-scope alias cl; assumes distrb_l has the same
    -- operator and delivery_place columns as ord_odb_l
    select 2 src, vo.kod_id, bu.last_name, bu.name, vl.datum_p, ctl.rid, md.short_name 
      from distrb_l   vl
      join distrb_o   vo on vo.rid_o = vl.rid
      join b_users    bu on bu.id = vl.operator
      cross join ct_l ctl
      join m_delivery md on vl.delivery_place = md.rid
      where vl.operator not IN (161,245,46,120,43,184)) aa 
  where aa.kod_id = orders.SK_ID and aa.rid = orders.rid_ct_a 
    and aa.short_name = orders.md and aa.datum_p > orders.datum_ok
  -- Order and limit in the same query block: src 1 first, newest date first
  order by aa.src, aa.datum_p desc
  fetch first 1 row only

Indexing

Indexing matters as much as the query text. Indexes on the columns used in join conditions and filters let Oracle avoid full scans of the larger tables. In the query above, rid_o belongs to ord_odb_o and distrb_o (co.rid_o = cl.rid and vo.rid_o = vl.rid), so those columns are natural candidates, assuming they are not indexed already.

-- Join columns of ord_odb_o and distrb_o, as used in the query above
CREATE INDEX idx_ord_odb_o_rid_o ON ord_odb_o (rid_o);
CREATE INDEX idx_distrb_o_rid_o ON distrb_o (rid_o);
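
Whether an index is actually used is best checked in the execution plan rather than assumed. The sketch below uses EXPLAIN PLAN and DBMS_XPLAN.DISPLAY on a cut-down version of the first branch; the shortened select list is only for illustration.

```sql
-- Store the optimizer's plan for the statement in the plan table
explain plan for
select co.kod_id, cl.datum_p
  from ord_odb_l cl
  join ord_odb_o co on co.rid_o = cl.rid
 where cl.operator not in (161,245,46,120,43,184);

-- Show the plan that was just stored
select * from table(dbms_xplan.display);
```

An index access step on the rid_o index in the output indicates the index is being picked up; a full table scan of ord_odb_o suggests the optimizer did not consider it worthwhile for this statement.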

Partitioning

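If the tables are large, partitioning can also help. Range partitioning on datum_p lets Oracle skip partitions that cannot satisfy a date filter (partition pruning), which is relevant here because the query filters on aa.datum_p > orders.datum_ok. Partitioning requires the Oracle Partitioning option. A CREATE TABLE statement would need the full column list, so for a table that already exists, Oracle 12.2 and later can convert it in place:

ALTER TABLE ord_odb_l MODIFY
  PARTITION BY RANGE (datum_p) (
    PARTITION P20210101 VALUES LESS THAN (TO_DATE('2021-02-01', 'YYYY-MM-DD')),
    PARTITION P20210201 VALUES LESS THAN (TO_DATE('2021-03-01', 'YYYY-MM-DD')),
    -- ... other partitions ...
    -- Catch-all partition for later dates
    PARTITION P_MAX VALUES LESS THAN (MAXVALUE)
  ) ONLINE;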
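
After partitioning, the resulting layout can be checked in the data dictionary. This is a small sketch that assumes the table lives in the current schema, which is why it reads USER_TAB_PARTITIONS rather than the ALL_ or DBA_ variants.

```sql
-- List the partitions of ord_odb_l with their upper bounds
select partition_name, high_value, partition_position
  from user_tab_partitions
 where table_name = 'ORD_ODB_L'
 order by partition_position;
```

The table name is written in upper case because Oracle stores unquoted identifiers that way in the dictionary.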

By considering these additional factors, it’s possible to further optimize the query and improve performance.


Last modified on 2023-11-30