Version française
Home     About     Download     Resources     Contact us    
Browse thread
[Caml-list] @, List.append, and tail recursion
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Diego Olivier Fernandez Pons <Diego-Olivier.FERNANDEZ-PONS@c...>
Subject: Linear systems (was Re: [Caml-list] @, List.append, and tail recursion)
    Bonjour,

Sorry for the buggy code, anyway you should be able to fix it.

> In production I wouldn't be surprised to see systems of 30,000+
> equations in 30,000+ variables.  The Jacobian matrix is going to be
> very sparse- I expect the average row to have 20 or fewer non-zero
> elements.  Thus the attraction to sparse vectors.  I'm going to be
> solving it via Gaussian elimination (the Jacobian is likely to be
> malformed in multiple ways, meaning I can't use any iterative method
> I know of.  And yes, I've looked at iterative methods as advanced as
> GMRES- they don't work).  I think that in general I can bound the
> size of vectors I'm producing.  But there are degenerate cases where
> I could get above 30K non-zero elements in a vector.

A 30 000 x 30 000 system of equations is not a toy program. Then, do
not expect to solve it straightforwardly. You will obviously hit all
system limits (stack, cache, ...). You have to understand that if you
really want a robuts program in all cases, you will have to work :

You will have to design several data structures for every operation
you want (to keep both efficiency and generality) and transformation
functions. That is the case for example in the Gröbner basis system I
pointed out : several monomial orderings with many transformation
functions ...

You will have to use all Caml properties (functional, imperative ...)

You also need to understand roughtly how does the Caml compiler work
(boxing/ unboxing, garbage collection, ...)

> But I want the rare case to *work* correctly, even if inefficiently.  With
> the "naive" non-tail-recursive implementation doesn't.  Somewhere above
> 32K elements in the list the recursion trips the stack overflow.  Change
> the problem in some minor way, and suddenly I'm not generating lists with
> 32K elements in them, maybe just 30K elements in them, and everything
> works OK.

Your main problem is that sometimes your lists may be too big for the
data structure you have chosen. Then, the best thing to do is to have
two data structures, one for small (most of all) systems, a second one
for huge (but rare) systems.

- lists for small sparse vectors
- (say) hashtables for huge sparse vectors

You just need to write you own hashtable data structure (because you
can then profit of Caml's float array unboxing optimisation by
separate collision resolution)

type size = int

type vector =
  | List of size * (int * float) list
  | Hashtable of myhashtbl

Most of your code will be purely functional and the program will not
break on large data.

One more point : floating arithmetic may produce incorrect value by
error propagation, even for small systems. If you do want you system
to work properly for all data (even not well scaled), you will have to
use specific algorithms (pivot choice rules, error bounding, ...).
Then, list representation may not be apropriate.

> O(1), or O(log n)?  Most tree operations are O(log n).

It is easy to design a data structure with O(log n) acces to any
element and O(1) acces to the first one : imagine a list of increasing
perfect trees of size 2^k

        Diego Olivier

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners