English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    
Browse thread
Incremental, undoable parsing in OCaml as the general parser inversion
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: oleg@p...
Subject: Re: [Caml-list] Incremental, undoable parsing in OCaml as the general parser inversion

Paul Snively wrote:
> I've attempted to rewrite this in terms of your  
> monadic version of delimited continuations, as included in the  
> pa_monad syntax extension distribution.

I'm afraid that cannot be done. That is, short of re-writing
make_lexer and all other library functions in the monadic style. 

That is the trouble with the monadic style, well articulated by Greg
Morrisett in his invited talk at the ML Workshop 2005. In Haskell, we
essentially need two versions of every function. For example, we need
a pure map of the type (a->b) -> [a] -> [b] and the monadic version
mapM, of the type (a->m b) -> [a] -> m [b]. The claim goes that once
we went into trouble of writing a monadic version, we can instantiate
it with any monad and so `extend' it arbitrarily. For one thing, not
arbitrarily: the monadic transformer approach is of limited
expressiveness and there are practically significant computations that
cannot be expressed with monad transformers because they impose the
fixed order of layers. Mainly, monadic meta-language is of limited
expressiveness itself and is just plain ugly. One can't write guards
with monadic expressions (for a good reason), a lot of other
conveniences of Haskell become unavailable.  Therefore, people quite
often resort to unsafePerformIO -- even in reputable projects like
Lava. That is really as unsound as it sounds: one has to specifically
disable compiler optimizations (inlining); so-called lazy IO can lead
to problems like reading from a file after it was closed
and so one must _deeply_ force expressions and enforce evaluation
order, etc. I guess it is clear how much burden monadic notation
really is if even the high priests and most faithful of Haskell would
rather commit cardinal sins and forsake the static benefits of Haskell
-- only to avoid programming extensively in monadic style.

This reminds me of a funny event at the Haskell workshop 2006. One
participant stood up and sincerely proposed that Haskell' standard
would find a way to automatically derive a monadic version of a pure
expression. Chung-chieh Shan recommended that person to take a look at
OCaml... 

The beauty of delimited continuations is that we take any function
that accepts some kind of `callback' and pass a specifically
constructed callback (which is essentially shift f f). Thus we have
accomplished inversion. The function in question can be a map-like
traversal combinator (the callback is the function to map), a parser
(the callback here is the stream, the function that provides data to
the parser), a GUI system with its own event loop, etc.

An analogy with a debugger might make it easy to understand why it
all works. The operation (shift f f) is akin to the break point to the
debugger (INT 3 aka CC for x86 aficionados). When we pass this break
point as a callback to parse, for example, we drop into the debugger
whenever parse invokes our callback, that is, whenever it needs a new
character. Once in the debugger, we can examine our state and resume
the parser passing it the character of our choice. The form `reset'
(aka push_prompt) makes it possible for us to `script the debugger' in
our own program. Thus _delimited_ continuations offer us a
scriptable debugger. Unlike the free-wheeling debugger like gdb, the
debugger of delimited continuations respects abstractions and is
type sound, supporting static invariants of the type system.

John Skaller wrote:
> If i have an Ocamlyacc parser function f .. how do I control invert
> it?
That parser takes stream, which can be (made of) a function. We merely 
need to create a stream (lexbuf)

  type stream_req = ReqDone | ReqBuf of string * int * (int -> stream_req);;

  Lexing.from_function (fun buffer n -> 
		shift p (fun sk -> ReqBuf (buffer,n,sk)))

and pass the that stream to the parser. We need to write something
like the filler in the previous message, the handler for ReqBuf
requests. The handler will fill the buffer (the first argument of the
ReqBuf request) and invoke 'sk' passing it the number of placed
characters.


I would be remiss if I don't mention one of the main benefits of the
monadic style: monadic style converts control flow into data
flow. Therefore, types (which normally enforce data-flow
invariants like prohibiting the addition of an integer and a string)
can be used to _statically_ enforce invariants of control flow, e.g.,
all accessible file handles are open, all acquired locks are released,
no blocking operation is attempted while a lock is held, etc. We have
many working examples of statically enforcing such properties. Without
monadic style, we need the dual of type systems -- effect systems --
to provide similar static guarantees. Alas, despite Filinski's
prescient call for more attention to effect systems 13 years ago,
hardly any effect system is available in a mainstream functional
language.