Version française
Home     About     Download     Resources     Contact us    
Browse thread
[Caml-list] ocamllex problem
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: james woodyatt <jhw@w...>
Subject: Re: [Caml-list] ocamllex problem
On 04 Aug 2005, at 18:03, skaller wrote:
>
> Alain Frisch pointed me at some nasty papers on this, one with a  
> regexp -> NFA conversion and the other with a NFA-> DFA conversion,  
> but I couldn't figure out how to do the direct regexp->DFA  
> conversion, I'd sure like to find an algorithm for that..

In my OCaml NAE Core Foundation, there is a something you may find  
interesting.  See the [Cf_lex] module and its subordinate [Cf_dfa].   
Since it isn't trying to be a multi-stage programming tool like  
[ocamllex], it produces a parser monad that executes a Lazy-DFA,  
instead of a fully space-time optimized DFA.  At some point, I may  
implement a [study] function that fully evaluates the Lazy-DFA and  
optimizes it, but I don't yet see a compelling need for that.

One thing: the pattern [':'((letter|' ')* as s)] is interesting.   
You're definitely right that something non-trivial is happening  
inside the DFA.  My [Cf_dfa] module does not keep a stack of  
backtracking sequences because I did something else to resolve the  
problem.  Look at the ( $@ ) operators, which allow you to use a  
parser monad on the recognized input sequence to obtain the result of  
a lexical rule.  Using this, you can implement something like the  
feature you're interested in by defining a nested hierarchy of parsers.

I know.  This is probably not what you're looking for.  To get what  
you're looking for, I'd have to extend [Cf_dfa] to handle marker  
nodes in the NFA.  I thought that would be more appropriate for  
[ocamllex] and similar tools, so I didn't do it.  Nice to see  
[ocamllex] did.


-- 
j h woodyatt <jhw@wetware.com>
that's my village calling... no doubt, they want their idiot back.