[Caml-list] ocaml-3.05: a performance experience
Date: 2002-08-03 (12:34)
From: Gerd Stolpmann <info@g...>
Subject: Re: [Caml-list] ocaml-3.05: a performance experience

On 2002.08.02 05:33 Alexander V. Voinov wrote:
> Hi All,
>
> I have an application, which parses a huge XML file and stores resulting
> records to a database.
>
> The file is parsed using PXP, but in a 'pulldom' manner, by extracting
> (to a Buffer) first level tags manually with pcre, then an array insert
> of 30000 recognized and accumulated records is performed. DB access
> takes a small fraction of the run time.
>
> Compiled with ocaml-3.04 it took 1h40m+-5m of 'user' process time and
> occupied about 340M in RAM. With 3.05 it took 2h40m+-5m and occupied
> 250M.
>
> Is this the consequence of the new GC strategy? Actually I'd tolerate
> large footprint for the sake of more speed.
>
> It's also interesting to note, that in the case of 3.04 the footprint of
> the application starts from 330M and slowly expands to 350M. With 3.05
> it starts with 250M and then almost does not expand till the end.
>
> Sparc Solaris 2.7, gcc 3.0.4.
>
> A previous version of this app, written in Python with PyXML, runs 3-4
> times slower than the 3.04 version and takes 20M in RAM.

I think you are observing GC compaction. You can turn it off by setting
OCAMLRUNPARAM="O=1000000" in the environment (or by calling Gc.set).

If XML validation is not needed, you could also rewrite your program to
use the new event-based parsing in PXP-1.1.90. That would completely avoid
representing the XML tree in memory (and would increase the speed, because
garbage collection with a large memory footprint is expensive).

Gerd
--
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 45
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany
----------------------------------------------------------------------------
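
For readers who prefer the Gc.set route over the environment variable, the suggestion above can be sketched as follows. This is only an illustration, not code from the thread: the names no_compaction_threshold and without_compaction are made up here, and the sketch assumes a runtime that honours the max_overhead field of Gc.control (the "O" parameter). Newer runtimes may accept the setting but ignore it.

```ocaml
(* Value of max_overhead at or above which the runtime never triggers
   heap compaction; it matches O=1000000 in OCAMLRUNPARAM. *)
let no_compaction_threshold = 1_000_000

(* Hypothetical helper: return a copy of the given GC settings with
   automatic compaction disabled, leaving all other fields unchanged. *)
let without_compaction (c : Gc.control) : Gc.control =
  { c with Gc.max_overhead = no_compaction_threshold }

let () =
  (* Apply the setting once at program start, before the heap grows. *)
  Gc.set (without_compaction (Gc.get ()));
  Printf.printf "requested max_overhead = %d\n" no_compaction_threshold
```

Calling Gc.set early in the program has the same intent as exporting OCAMLRUNPARAM before launching it; the environment variable has the advantage that the setting can be tried without recompiling.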