performance - Scalable Contains method for LINQ against a SQL backend -


i'm looking elegant way execute contains() statement in scalable way. please allow me give background before come actual question.

the in statement

in entity framework , linq sql contains statement translated sql in statement. instance, statement:

var ids = enumerable.range(1,10); var courses = courses.where(c => ids.contains(c.courseid)).tolist(); 

entity framework generate

select      [extent1].[courseid] [courseid],      [extent1].[title] [title],      [extent1].[credits] [credits],      [extent1].[departmentid] [departmentid]     [dbo].[course] [extent1]     [extent1].[courseid] in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) 

unfortunately, in statement not scalable. per msdn:

including extremely large number of values (many thousands) in in clause can consume resources , return errors 8623 or 8632

which has running out of resources or exceeding expression limits.

but before these errors occur, in statement becomes increasingly slow growing numbers of items. can't find documentation growth rate, performs few thousands of items, beyond gets dramatically slow. (based on sql server experiences).

scalable

we can't avoid statement. join source data in stead perform better, that's possible when source data in same context. here i'm dealing data coming client in disconnected scenario. have been looking scalable solution. satisfactory approach turned out cutting operation chunks:

var courses = ids.tochunks(1000)                  .select(chunk => courses.where(c => chunk.contains(c.courseid)))                  .selectmany(x => x).tolist(); 

(where tochunks this little extension method).

this executes query in chunks of 1000 perform enough. e.g. 5000 items, 5 queries run faster 1 query 5000 items.

but not dry

but of course don't want scatter construct on code. looking extension method iqueryable<t> can transformed chunky executing statement. ideally this:

var courses = courses.where(c => ids.contains(c.courseid))               .aschunky(1000)               .tolist(); 

but maybe this

var courses = courses.chunkycontains(c => c.courseid, ids, 1000)               .tolist(); 

i've given latter solution first shot:

public static ienumerable<tentity> chunkycontains<tentity, tcontains>(     iqueryable<tentity> query,      expression<func<tentity,tcontains>> match,      ienumerable<tcontains> containlist,      int chunksize = 500) {     return containlist.tochunks(chunksize)                .select (chunk => query.where(x => chunk.contains(match)))                .selectmany(x => x); } 

obviously, part x => chunk.contains(match) doesn't compile. don't know how manipulate match expression contains expression.

maybe can me make solution work. , of course i'm open other approaches make statement scalable.

i’ve solved problem little different approach view month ago. maybe it’s solution too.

i didn’t want solution change query itself. ids.chunkcontains(p.id) or special wherecontains method unfeasible. should solution able combine contains filter using same collection multiple times.

db.testentities.where(p => (ids.contains(p.id) || ids.contains(p.parentid)) && p.name.startswith("test")) 

so tried encapsulate logic in special tolist method rewrite expression specified collection queried in chunks.

var ids = enumerable.range(1, 11); var result = db.testentities.where(p => ids.contains(p.id) && p.name.startswith ("test"))                                 .tochunkedlist(ids,4); 

to rewrite expression tree discovered contains method calls local collections in query view helping classes.

private class containsexpression {     public containsexpression(methodcallexpression methodcall)     {         this.methodcall = methodcall;     }      public methodcallexpression methodcall { get; private set; }      public object getvalue()     {         var parent = methodcall.object ?? methodcall.arguments.firstordefault();         return expression.lambda<func<object>>(parent).compile()();     }      public bool islocallist()     {         expression parent = methodcall.object ?? methodcall.arguments.firstordefault();         while (parent != null) {             if (parent constantexpression)                 return true;             var member = parent memberexpression;             if (member != null) {                 parent = member.expression;             } else {                 parent = null;             }         }         return false;     } }  private class findexpressionvisitor<t> : expressionvisitor t : expression {     public list<t> founditems { get; private set; }      public findexpressionvisitor()     {         this.founditems = new list<t>();     }      public override expression visit(expression node)     {         var found = node t;         if (found != null) {             this.founditems.add(found);         }         return base.visit(node);     } }  public static list<t> tochunkedlist<t, tvalue>(this iqueryable<t> query, ienumerable<tvalue> list, int chunksize) {     var finder = new findexpressionvisitor<methodcallexpression>();     finder.visit(query.expression);     var methodcalls = finder.founditems.where(p => p.method.name == "contains").select(p => new containsexpression(p)).where(p => p.islocallist()).tolist();     var locallists = methodcalls.where(p => p.getvalue() == list).tolist(); 

if local collection passed in tochunkedlist method found in query expression, replace contains call original list new call temporary list containing ids 1 batch.

if (locallists.any()) {     var result = new list<t>();     var valuelist = new list<tvalue>();      var containsmethod = typeof(enumerable).getmethods(bindingflags.static | bindingflags.public)                         .single(p => p.name == "contains" && p.getparameters().count() == 2)                         .makegenericmethod(typeof(tvalue));      var queryexpression = query.expression;      foreach (var item in locallists) {         var parameter = new list<expression>();         parameter.add(expression.constant(valuelist));         if (item.methodcall.object == null) {             parameter.addrange(item.methodcall.arguments.skip(1));         } else {             parameter.addrange(item.methodcall.arguments);         }          var call = expression.call(containsmethod, parameter.toarray());          var replacer = new expressionreplacer(item.methodcall,call);          queryexpression = replacer.visit(queryexpression);     }      var chunkquery = query.provider.createquery<t>(queryexpression);       (int = 0; < math.ceiling((decimal)list.count() / chunksize); i++) {         valuelist.clear();         valuelist.addrange(list.skip(i * chunksize).take(chunksize));          result.addrange(chunkquery.tolist());     }     return result; } // if collection not found return query.tolist() return query.tolist(); 

expression replacer:

private class expressionreplacer : expressionvisitor {      private expression find, replace;      public expressionreplacer(expression find, expression replace)     {         this.find = find;         this.replace = replace;     }      public override expression visit(expression node)     {         if (node == this.find)             return this.replace;          return base.visit(node);     } } 

Comments

Popular posts from this blog

Magento/PHP - Get phones on all members in a customer group -

php - Bypass Geo Redirect for specific directories -

php - .htaccess mod_rewrite for dynamic url which has domain names -