Package with multidimensional AND sparse matrices?
[
Home
]
[ Index:
by date

by threads
]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
Date:  20060704 (14:42) 
From:  Jon Harrop <jon@f...> 
Subject:  Re: [Camllist] Package with multidimensional AND sparse matrices? 
On Tuesday 04 July 2006 08:46, Andreas Biegert wrote: > Well, in short: the algorithm computes the optimal multiple alignment > of up to n protein sequences with maximum length L. The > straightforward and theoretically optimal approach to solve this > problem is by dynamic programming, therefore the time and space > complexity is O(L^n). I managed to come up with a pretty good > heuristic that lets me boil down the number of cells that need to be > computed giving me a complexitiy of O(L^2) instead of O(L^n), in fact > I can even manually adjust the number of nonzero entries in the > matrix. In the 2000*2000*2000*2000 matrix only at most 2000 cells will > be nonzero. Furthermore for all nonzero cells M(i,j,k,l) holds > i<j<k<l. The bad news is that performance is important since the > program has to run under 5min. An estimation of the time requirements > given the number of sequences n and masimum lengths L is: > > 24 * L^2 * T_x * 2^(n+1)  1 > with T_x being about 100ns Ok. What patterns of access will there be? > For n=4 and L=2000 I get just about 5min. Would it be a bad idea to > use Hashtables all the way instead of matrices? No, I don't think so. If you need more performance then you can always change things later. I'd concentrate on getting a working algorithm first. > Accessing elements in > a hashtable has O(1), right? Roughly, yes. :)  Dr Jon D Harrop, Flying Frog Consultancy Ltd. Objective CAML for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists