Version française
Home     About     Download     Resources     Contact us    
Browse thread
[Caml-list] Need NFA/DFA conversion help
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Diego Olivier Fernandez Pons <Diego.FERNANDEZ_PONS@e...>
Subject: Re: [Caml-list] Need NFA/DFA conversion help
    Bonjour,


It seems there are two different problems addressed in your request :
the first one is purely technical and the second one is more
theoretical.

> like Ocamllex, except you don't need any external tool or library,
> you can define everything 'in-caml' and dynamically, as shown.
> (The core of the matcher engine is also purely functional)

Tools like Lex or Yacc build an 'interpreter' (NFA, FDA, push-down
automata or transducers) which interprets your input language and
'does' something.

What you are asking is this set of tools to be done in a dynamic and
pure-caml way instead of a 2-stages mixed C/Caml. This is a technical
issue : just take the same principles that were used to build this
tools and replace the code generation by data structures construction
(sounds simple, doesn'it ?)

> Then you can either process some data, or you can save the compiled
> automaton (possibly as Ocaml or C sourcecode which can be compiled
> directly to executable code like (Ocaml)lex output).

Yes, one can imagine a generic framework that would allow you to
choose how your intepreter will be generated (source-code/data
structure, C/Caml, etc.)

> What doesn't work is sub-group matching, look ahead, etc -- things
> requiring actions in the middle of a regexp (rather than at the
> end).

Here is the second problem. You did exactly what was expected (taking
a textbook on formal languages and implement what is said). The
problem is that tools like PCRE, Lex, Yacc, etc. do not work as
textbook say. They do not accept rational or context-free languages,
they do not use DFA, NFA or LR(k) automata as claimed.

Yacc input language is not a context-free language but some kind of
restricted attribute grammar that lies somewhere between algebraic
transducers and contextual grammars. Hence tools like Bison will
contain a lot of strange hacks to handle all these extensions while
keeping the push-down automata machinery.

In the same way a grammar that admits 'actions in the middle of a
regexp [] rather than at the end' could hardly qualify as a 'rational
language' which is what regexp theoretically describe.

As far as I know, there is no simple way to handle all these useful
'extensions' and theoretically satisfactory ways tend to be very hard.

- general textbook

Parsing techniques (Dick Grune, Ceriel J.H. Jacobs) (electronic
version on free download)
http://www.cs.vu.nl/~dick/PTAPG.html

Dick Grune 'compilers' book has a chapter on attribute grammars

- attribute grammars

FNC2 (strongly non circular attribute grammars)
http://www-rocq.inria.fr/oscar/www/fnc2/

An attribute grammar system in Haskell
http://www.cs.uu.nl/groups/ST/twiki/bin/view/Center/AttributeGrammarSystem

- Yacc like tools

Jacke, Yacc-like parser generator with SML-consistent sytax
http://www.ps.uni-sb.de/~jan/jacke/jacke.html


        Diego Olivier

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners