Oracle semi-join with multiple tables in SQL subquery
This question is about how to work around an apparent Oracle limitation on semi-joins with multiple tables in the subquery. I have the following 2 update statements.
Update 1:
    update (select a.flag update_column
            from a, b
            where a.id = b.id
            and exists (select null
                        from c
                        where c.id2 = b.id2
                        and c.time between start_in and end_in)
            and exists (select null
                        from table(update_in) d
                        where b.time between d.start_time and d.end_time))
    set update_column = 'f'
The execution plan indicates that this correctly performs 2 semi-joins, and the update executes in seconds. These need to be semi-joins because c.id2 is not a unique foreign key on b.id2, unlike b.id and a.id. And update_in doesn't have any constraints at all, since it's an array.
Update 2:
    update (select a.flag update_column
            from a, b
            where a.id = b.id
            and exists (select null
                        from c, table(update_in) d
                        where c.id2 = b.id2
                        and c.time > d.time
                        and b.time between d.start_time and d.end_time))
    set update_column = 'f'
This does not do a semi-join; I believe, based on the Oracle documentation, that's because the EXISTS subquery has 2 tables in it. Due to the sizes of the tables, and partitioning, this update takes hours. However, there is no way to relate d.time to the associated d.start_time and d.end_time other than their being on the same row. And the reason we pass in the update_in array and join to it here is that running this query in a loop for each time/start_time/end_time combination also proved to give poor performance.
Is there a reason other than the 2 tables that the semi-join could not be working? If not, is there a way around this limitation? Is there some simple solution I am missing that could make these criteria work without putting 2 tables in the subquery?
As Bob suggests, you can use a global temporary table (GTT) with the same structure as your update_in array. The key difference is that you can create indexes on the GTT, and if you populate the GTT with representative sample data, you can also collect statistics on the table so that the SQL optimizer is better able to predict an optimal query plan.
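A minimal sketch of the GTT approach follows. The table name update_in_gtt, the index name, and the DATE data types are assumptions made for illustration; only the column names time, start_time and end_time come from the question. The insert is assumed to run in the same PL/SQL scope where update_in is visible.

```sql
-- Hypothetical GTT mirroring the structure of the update_in array.
-- Data types are assumed; adjust them to match the real collection type.
create global temporary table update_in_gtt (
  time       date,
  start_time date,
  end_time   date
) on commit preserve rows;

-- Unlike the array, the GTT can be indexed (index name is illustrative).
create index update_in_gtt_ix on update_in_gtt (start_time, end_time);

-- Load the array contents; update_in must be in scope where this runs.
insert into update_in_gtt (time, start_time, end_time)
select d.time, d.start_time, d.end_time
from table(update_in) d;

-- Collect statistics so the optimizer sees realistic row counts.
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'UPDATE_IN_GTT');
end;
/
```

The update statements can then reference update_in_gtt d wherever they currently use table(update_in) d.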
That said, there are some other notable differences in your 2 queries:
- In the first EXISTS clause of the first query you refer to 2 columns, start_in and end_in, that don't have table references. My guess is that they are either columns in table a or b, or variables within the current scope of the SQL statement. It's not clear which.
- In the second query you refer to the column d.time; however, you don't use this column in the first query.
Does updating your second query to the following improve its performance?
    update (select a.flag update_column
            from a, b
            where a.id = b.id
            and exists (select null
                        from c, table(update_in) d
                        where c.id2 = b.id2
                        and c.time between start_in and end_in
                        and c.time > d.time
                        and b.time between d.start_time and d.end_time))
    set update_column = 'f'
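To see whether a given form of the statement is actually executed as a semi-join, you can inspect its plan. The sketch below is illustrative only: it assumes a hypothetical table update_in_gtt holding the same rows as update_in (so the statement can be explained outside the PL/SQL scope of the array), and it leaves out the start_in/end_in predicate because it is not clear where those values come from.

```sql
-- update_in_gtt is a hypothetical stand-in for table(update_in).
explain plan for
update (select a.flag update_column
        from a, b
        where a.id = b.id
        and exists (select null
                    from c, update_in_gtt d
                    where c.id2 = b.id2
                    and c.time > d.time
                    and b.time between d.start_time and d.end_time))
set update_column = 'f';

-- Show the plan that was just explained.
select * from table(dbms_xplan.display);
```

A semi-join shows up in the plan output as an operation such as HASH JOIN SEMI or NESTED LOOPS SEMI; if the EXISTS subquery appears under a FILTER operation instead, it is being evaluated per row rather than as a semi-join.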