Browse thread
Package with multidimensional AND sparse matrices?
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: | 2006-07-04 (14:42) |
From: | Jon Harrop <jon@f...> |
Subject: | Re: [Caml-list] Package with multidimensional AND sparse matrices? |
On Tuesday 04 July 2006 08:46, Andreas Biegert wrote: > Well, in short: the algorithm computes the optimal multiple alignment > of up to n protein sequences with maximum length L. The > straightforward and theoretically optimal approach to solve this > problem is by dynamic programming, therefore the time and space > complexity is O(L^n). I managed to come up with a pretty good > heuristic that lets me boil down the number of cells that need to be > computed giving me a complexitiy of O(L^2) instead of O(L^n), in fact > I can even manually adjust the number of nonzero entries in the > matrix. In the 2000*2000*2000*2000 matrix only at most 2000 cells will > be non-zero. Furthermore for all non-zero cells M(i,j,k,l) holds > i<j<k<l. The bad news is that performance is important since the > program has to run under 5min. An estimation of the time requirements > given the number of sequences n and masimum lengths L is: > > 24 * L^2 * T_x * 2^(n+1) - 1 > with T_x being about 100ns Ok. What patterns of access will there be? > For n=4 and L=2000 I get just about 5min. Would it be a bad idea to > use Hashtables all the way instead of matrices? No, I don't think so. If you need more performance then you can always change things later. I'd concentrate on getting a working algorithm first. > Accessing elements in > a hashtable has O(1), right? Roughly, yes. :-) -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. Objective CAML for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists