Package with multidimensional AND sparse matrices?

Andreas Biegert
 Markus Mottl

Jon Harrop

Andreas Biegert

Jon Harrop
 Andreas Biegert
 Martin Jambon

Jon Harrop

Andreas Biegert
[
Home
]
[ Index:
by date

by threads
]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
Date:  20060704 (16:55) 
From:  Andreas Biegert <andreas.biegert@g...> 
Subject:  Re: [Camllist] Package with multidimensional AND sparse matrices? 
Since the algorithm computes alignemnts in most casses the pattern of access will be adjacent tuples, for example (i,j,k,l) (i,j,k1,l1) etc. But it does not have to be that way. The only contstraint that holds for sure is i<j<k<l. 2006/7/4, Jon Harrop <jon@ffconsultancy.com>: > On Tuesday 04 July 2006 08:46, Andreas Biegert wrote: > > Well, in short: the algorithm computes the optimal multiple alignment > > of up to n protein sequences with maximum length L. The > > straightforward and theoretically optimal approach to solve this > > problem is by dynamic programming, therefore the time and space > > complexity is O(L^n). I managed to come up with a pretty good > > heuristic that lets me boil down the number of cells that need to be > > computed giving me a complexitiy of O(L^2) instead of O(L^n), in fact > > I can even manually adjust the number of nonzero entries in the > > matrix. In the 2000*2000*2000*2000 matrix only at most 2000 cells will > > be nonzero. Furthermore for all nonzero cells M(i,j,k,l) holds > > i<j<k<l. The bad news is that performance is important since the > > program has to run under 5min. An estimation of the time requirements > > given the number of sequences n and masimum lengths L is: > > > > 24 * L^2 * T_x * 2^(n+1)  1 > > with T_x being about 100ns > > Ok. What patterns of access will there be? > > > For n=4 and L=2000 I get just about 5min. Would it be a bad idea to > > use Hashtables all the way instead of matrices? > > No, I don't think so. If you need more performance then you can always change > things later. I'd concentrate on getting a working algorithm first. > > > Accessing elements in > > a hashtable has O(1), right? > > Roughly, yes. :) > >  > Dr Jon D Harrop, Flying Frog Consultancy Ltd. > Objective CAML for Scientists > http://www.ffconsultancy.com/products/ocaml_for_scientists > > _______________________________________________ > Camllist mailing list. Subscription management: > http://yquem.inria.fr/cgibin/mailman/listinfo/camllist > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/camlbugs >