Why does this Rascal pattern matching code use so much memory and time? -


i'm trying write think of extremely simple piece of code in rascal: testing if list contains list b.

starting out basic code create list of strings

public list[str] makestringlist(int start, int end) {     return [ "some string number <i>" | <- [start..end]]; }  public list[str] totest = makestringlist(0, 200000);  

my first try 'inspired' sorting example in tutor:

public void findclone(list[str] in,  str s1, str s2, str s3, str s4, str s5, str s6) {     switch(in)     {         case [*str head, str i1, str i2, str i3, str i4, str i5, str i6, *str tail]:            {             if(s1 == i1 && s2 == i2 && s3 == i3 && s4 == i4 && s5 == i5 && s6 == i6)             {                 println("found duplicate\n\t<i1>\n\t<i2>\n\t<i3>\n\t<i4>\n\t<i5>\n\t<i6>");             }             fail;          }             default:             return;     } } 

not pretty, expected work. unfortunately, code runs 30 seconds before crashing "out of memory" error.

i tried better looking alternative:

public void findclone2(list[str] in, list[str] whatwesearchfor) {     ([*str head, *str mid, *str end] := in)     if (mid == whatwesearchfor)         println("gotcha"); }  

with approximately same result (seems run little longer before running out of memory)

finally, tried 'good old' c-style approach for-loop

public void findclone3(list[str] in, list[str] whatwesearchfor) {     clonelength = size(whatwesearchfor);     inputlength = size(in);      if(inputlength < clonelength) return [];      looplength = inputlength - clonelength + 1;      for(int <- [0..looplength])     {         isaclone = true;         for(int j <- [0..clonelength])         {             if(in[i+j] != whatwesearchfor[j])                 isaclone = false;         }          if(isaclone) println("found clone <whatwesearchfor> on lines <i> through <i+clonelength-1>");        } } 

to surprise, 1 works charm. no out of memory, , results in seconds.

i first 2 attempts create lot of temporary string objects have garbage collected, can't believe solution worked best solution.

any pointers appreciated.

my relevant eclipse.ini settings are

-xx:maxpermsize=512m -xms512m -xss64m -xmx1g 

we'll need see why happening. note that, if want use pattern matching, maybe better way write it:

public void findclone(list[str] in,  str s1, str s2, str s3, str s4, str s5, str s6) {     switch(in) {         case [*str head, s1, s2, s3, s4, s5, s6, *str tail]: {             println("found duplicate\n\t<s1>\n\t<s2>\n\t<s3>\n\t<s4>\n\t<s5>\n\t<s6>");          }          default:              return;      }  } 

if this, taking advantage of rascal's matcher find matching strings directly, versus first example in string match needed use number of separate comparisons see if match represented combination looking for. if run on 110145 through 110150 takes while works , doesn't seem grow beyond heap space allocated it.

also, there reason using fail? continue searching?


Comments

Popular posts from this blog

matlab - error with cyclic autocorrelation function -

django - (fields.E300) Field defines a relation with model 'AbstractEmailUser' which is either not installed, or is abstract -

c# - What is a good .Net RefEdit control to use with ExcelDna? -