Version française
Home     About     Download     Resources     Contact us    
Browse thread
Parallelized parsing
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: David MENTRE <dmentre@l...>
Subject: Re: [Caml-list] Parallelized parsing
Hello Jon,

On Mon, Apr 20, 2009 at 23:15, Jon Harrop <jon@ffconsultancy.com> wrote:
> For example, Mathematica syntax for nested lists of integers looks like:
>
>  {{{1, 2}}, {{3, 4}, {4, 5}}, ..}
>
> and there are obvious divide-and-conquer approaches to lexing and parsing that
> grammar. You can recursively subdivide the string (e.g. memory mapped from a
> file) to build a tree of where the tokens { , and } appear by index and then
> recursively convert the tree into an AST.
>
> What other grammars can be lexed and/or parsed efficiently in parallel?

Is it of any use? The overhead of parsing a single file in parallel is
so high that you won't have any speedup, especially compared to the
much simpler approach of parsing *several* files in parallel.

It reminds me of parallel approaches used for 3D movies: a lot of
research has been done to parallelize the rendering of a single
picture[1] while companies like Pixar are using a much simpler
approach in real life: render a whole picture per computer or core.

And don't forget the Amdahl's law :
  http://en.wikipedia.org/wiki/Amdahl%27s_law
Where is your real bottleneck?

Yours,
david

[1] Hopefully, some of those algorithms have brought speedup in
serialized setting, i.e. on a single core or computer, for example by
optimizing cache use.