Oracle semi-join with multiple tables in SQL subquery

May 15, 2012

This question is about how to work around an apparent Oracle limitation on semi-joins with multiple tables in a subquery. I have the following two update statements.

Update 1:

    UPDATE
      (SELECT a.flag update_column
       FROM a, b
       WHERE a.id = b.id AND
             EXISTS (SELECT NULL
                     FROM c
                     WHERE c.id2 = b.id2 AND
                           c.time BETWEEN start_in AND end_in) AND
             EXISTS (SELECT NULL
                     FROM TABLE(update_in) d
                     WHERE b.time BETWEEN d.start_time AND d.end_time))
    SET update_column = 'f'

The execution plan indicates that this correctly performs two semi-joins, and the update executes in seconds. These need to be semi-joins because c.id2 is not a unique foreign key on b.id2, unlike b.id and a.id. And update_in doesn't have any constraints at all, since it's an array.

Update 2:

    UPDATE
      (SELECT a.flag update_column
       FROM a, b
       WHERE a.id = b.id AND
             EXISTS (SELECT NULL
                     FROM c, TABLE(update_in) d
                     WHERE c.id2 = b.id2 AND
                           c.time > d.time AND
                           b.time BETWEEN d.start_time AND d.end_time))
    SET update_column = 'f'

This does not do a semi-join; I believe, based on the Oracle documentation, that's because the EXISTS subquery has two tables in it. Due to the sizes of the tables, and partitioning, this update takes hours. However, there is no way to relate d.time to the associated d.start_time and d.end_time other than their being on the same row. And the reason we pass in the update_in array and join it here is that running this query in a loop for each time/start_time/end_time combination also proved to give poor performance.

Is there a reason other than the two tables that the semi-join could be not working? If not, is there a way around this limitation? Is there some simple solution I am missing that could make these criteria work without putting two tables in the subquery?

As Bob suggests, you can use a Global Temporary Table (GTT) with the same structure as your update_in array. The key difference is that you can create indexes on the GTT, and if you populate the GTT with representative sample data, you can also collect statistics on the table so that the SQL query analyzer is better able to predict an optimal query plan.
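To make the GTT suggestion concrete, here is a minimal sketch. The table name update_in_gtt, the index name, the DATE column types, and the sample row are all assumptions made for illustration; the real structure should mirror whatever object type update_in is built from. ON COMMIT PRESERVE ROWS is chosen here so that the commit issued by statistics gathering does not empty the table.

```sql
-- Hypothetical GTT with the same (assumed) columns as the update_in array.
CREATE GLOBAL TEMPORARY TABLE update_in_gtt (
  time        DATE,
  start_time  DATE,
  end_time    DATE
) ON COMMIT PRESERVE ROWS;

-- Unlike the array, the GTT can be indexed (column choice is an assumption).
CREATE INDEX update_in_gtt_ix ON update_in_gtt (start_time, end_time);

-- Representative sample row (made-up values) so that statistics are meaningful.
INSERT INTO update_in_gtt (time, start_time, end_time)
VALUES (DATE '2012-05-01', DATE '2012-05-01', DATE '2012-05-15');

-- Collect statistics so the optimizer can estimate cardinalities for the GTT.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'UPDATE_IN_GTT');
END;
/

-- Update 2 with the GTT standing in for TABLE(update_in).
UPDATE
  (SELECT a.flag update_column
   FROM a, b
   WHERE a.id = b.id AND
         EXISTS (SELECT NULL
                 FROM c, update_in_gtt d
                 WHERE c.id2 = b.id2 AND
                       c.time > d.time AND
                       b.time BETWEEN d.start_time AND d.end_time))
SET update_column = 'f';
```

In real use the GTT would be loaded from the array inside the PL/SQL block where update_in is in scope, rather than from literal values. Whether the optimizer then picks a semi-join still needs to be confirmed from the execution plan.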

That said, there are some other notable differences in your two queries:

In the first EXISTS clause of your first query you refer to two columns, start_in and end_in, that don't have table references. My guess is that they are either columns in table a or b, or variables within the current scope of the SQL statement. It's not clear which.
In the second query you refer to the column d.time; however, you don't use this column in the first query.

Does updating your second query to the following improve its performance?

    UPDATE
      (SELECT a.flag update_column
       FROM a, b
       WHERE a.id = b.id AND
             EXISTS (SELECT NULL
                     FROM c, TABLE(update_in) d
                     WHERE c.id2 = b.id2 AND
                           c.time BETWEEN start_in AND end_in AND
                           c.time > d.time AND
                           b.time BETWEEN d.start_time AND d.end_time))
    SET update_column = 'f'
