Browse thread
Parallelized parsing
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | David MENTRE <dmentre@l...> |
| Subject: | Re: [Caml-list] Parallelized parsing |
Hello Jon,
On Mon, Apr 20, 2009 at 23:15, Jon Harrop <jon@ffconsultancy.com> wrote:
> For example, Mathematica syntax for nested lists of integers looks like:
>
> {{{1, 2}}, {{3, 4}, {4, 5}}, ..}
>
> and there are obvious divide-and-conquer approaches to lexing and parsing that
> grammar. You can recursively subdivide the string (e.g. memory mapped from a
> file) to build a tree of where the tokens { , and } appear by index and then
> recursively convert the tree into an AST.
>
> What other grammars can be lexed and/or parsed efficiently in parallel?
Is it of any use? The overhead of parsing a single file in parallel is
so high that you won't have any speedup, especially compared to the
much simpler approach of parsing *several* files in parallel.
It reminds me of parallel approaches used for 3D movies: a lot of
research has been done to parallelize the rendering of a single
picture[1] while companies like Pixar are using a much simpler
approach in real life: render a whole picture per computer or core.
And don't forget the Amdahl's law :
http://en.wikipedia.org/wiki/Amdahl%27s_law
Where is your real bottleneck?
Yours,
david
[1] Hopefully, some of those algorithms have brought speedup in
serialized setting, i.e. on a single core or computer, for example by
optimizing cache use.