Browse thread
Parallelized parsing
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: | 2009-04-21 (07:19) |
From: | David MENTRE <dmentre@l...> |
Subject: | Re: [Caml-list] Parallelized parsing |
Hello Jon, On Mon, Apr 20, 2009 at 23:15, Jon Harrop <jon@ffconsultancy.com> wrote: > For example, Mathematica syntax for nested lists of integers looks like: > > {{{1, 2}}, {{3, 4}, {4, 5}}, ..} > > and there are obvious divide-and-conquer approaches to lexing and parsing that > grammar. You can recursively subdivide the string (e.g. memory mapped from a > file) to build a tree of where the tokens { , and } appear by index and then > recursively convert the tree into an AST. > > What other grammars can be lexed and/or parsed efficiently in parallel? Is it of any use? The overhead of parsing a single file in parallel is so high that you won't have any speedup, especially compared to the much simpler approach of parsing *several* files in parallel. It reminds me of parallel approaches used for 3D movies: a lot of research has been done to parallelize the rendering of a single picture[1] while companies like Pixar are using a much simpler approach in real life: render a whole picture per computer or core. And don't forget the Amdahl's law : http://en.wikipedia.org/wiki/Amdahl%27s_law Where is your real bottleneck? Yours, david [1] Hopefully, some of those algorithms have brought speedup in serialized setting, i.e. on a single core or computer, for example by optimizing cache use.