[Caml-list] Query: email parser in ocamllex/ocamlyacc
Date: 2002-10-22 (22:31)
From: Gerd Stolpmann <info@g...>
Subject: Re: [Caml-list] Query: email parser in ocamllex/ocamlyacc

On 2002.10.22 23:16, Pierre Weis wrote:
> > French version at the end
> > ------------------------------
> > 
> > Hello,
> > 
> > I'm writing an ocamllex/ocamlyacc based application that extracts a
> > <string list> of emails embedded in a text/html file.
> > Would any of you know of an available implementation I could draw
> > inspiration from (and save some time!)?
> Really precise parsing of email messages requires implementing
> RFC822 (more precisely RFC2822 nowadays), which is not a trivial
> task. I started to do it but gave up due to the absence of a scanf
> facility. I launched a thread to implement scanf, and 5 years later I
> understood how to do it in the Caml system!
> Now that we have scanf, I could go on to implement RFC(2)822.
> But don't hold your breath: if you don't need a full parser for mail
> messages, the simplest way is to write a (false but trivial)
> approximation with a lexer...
> There may be such a program in Xavier's spamoracle?
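
To make the "false but trivial" approximation mentioned above concrete, here is a
minimal sketch. It is written as a hand-rolled scanner in plain OCaml rather than
as an ocamllex specification, so that it runs without any build step. The function
name extract_emails and the character classes are assumptions chosen for
illustration; they do not implement RFC 2822 and will both miss valid addresses
and accept some invalid ones.

```ocaml
(* Characters accepted in the (approximate) local part of an address. *)
let is_local_char c =
  match c with
  | 'a'..'z' | 'A'..'Z' | '0'..'9' | '.' | '_' | '+' | '-' -> true
  | _ -> false

(* Characters accepted in the (approximate) domain part. *)
let is_domain_char c =
  match c with
  | 'a'..'z' | 'A'..'Z' | '0'..'9' | '.' | '-' -> true
  | _ -> false

(* Hypothetical helper: return every substring of [text] that looks like
   local@domain, where the domain contains at least one dot. *)
let extract_emails (text : string) : string list =
  let n = String.length text in
  (* Walk left from [i] while the preceding character fits the local part. *)
  let rec left i =
    if i > 0 && is_local_char text.[i - 1] then left (i - 1) else i in
  (* Walk right from [j] while the character fits the domain part. *)
  let rec right j =
    if j < n && is_domain_char text.[j] then right (j + 1) else j in
  (* Drop trailing dots, such as the full stop that ends a sentence. *)
  let rec trim start stop =
    if stop > start && text.[stop - 1] = '.' then trim start (stop - 1)
    else stop in
  let rec scan i acc =
    if i >= n then List.rev acc
    else if text.[i] <> '@' then scan (i + 1) acc
    else
      let start = left i in
      let stop = trim (i + 1) (right (i + 1)) in
      let domain = String.sub text (i + 1) (stop - (i + 1)) in
      if start < i && String.contains domain '.'
      then scan stop (String.sub text start (stop - start) :: acc)
      else scan (i + 1) acc
  in
  scan 0 []
```

Because the scanner only looks at the characters around each '@', it works on
HTML as well as plain text: the quote and colon characters around a mailto: link
are not in either character class, so they delimit the address naturally.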

Well, O'caml programming is so much fun that everybody wants to
reinvent the wheel. I really understand that; I'm tempted myself
every day.

My wheel came into the world in the spring of 2000, and has grown
a lot since then. It is now called "ocamlnet" after the merger
with Patrick Doane's wheel, and includes not only a parser for RFC(2)822
messages, but also support for the MIME RFCs (2045-47), RFC 2231,
parsing of dates, the ability to parse from pipelines chunk by
chunk, and last but not least even printers for these (partly
brain-dead) formats. You will also find an HTML parser, and a lot of
other useful stuff. By now it is more of a mobile construction set than
a wheel.

By the way: if anybody has something to contribute, any addition
that is useful, works, and will be maintained is still accepted.

You can find it here:
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany