Re: [Caml-list] Again on pattern matching and strings
Date: -- (:)
From: Luc Maranget <luc.maranget@i...>
Subject: Re: [Caml-list] Again on pattern matching and strings
> 
> 
> 
> Jacques Garrigue wrote:
> 
> > That message was about polymorphic variants, which are encoded as
> > integers, and for which pattern-matching is a decision tree.
> > 
> > However, if you look in bytecomp/matching.ml, you will see that string
> > patterns are just checked sequentially (the ordering is not used).
> > Moreover, the match compiler seems to be clever enough to compile
> > properly the above style:...
> 
> Very strange. I thought the OCaml compiler should 
> precalculate the branch of pattern matching to be taken, and 
> then jump, thereby avoiding sequential checking. I'm sorry 
> for my mistake.

If you are interested in PM code, I would suggest that you have a look
at the produced code after pattern-matching compilation (option -dlambda),
before looking at the compiler sources.

The issue is not really PM but rather switches: how to compile
a search in an ordered list of constants?

To sum it up for strings: strings are atoms to the PM compiler, which
never looks into them; it only compares one string against another, for
equality only.  The match compiler does not take advantage of the known
pattern strings in any way, nor does it take advantage of the existence
of a lexical ordering on strings. In fact, many << optimizations >> are
possible here; none is performed.
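
As an illustration of the sequential strategy described above, here is a
minimal sketch. The function names and the keyword strings are made up for
the example, and the second function is only a hand-written approximation of
what the match compiler of that era is described as doing; the actual
generated code should be checked with -dlambda, and other compiler versions
may behave differently.

```ocaml
(* A match on string constants, as one would normally write it. *)
let keyword_code s =
  match s with
  | "let" -> 1
  | "match" -> 2
  | "with" -> 3
  | _ -> 0

(* Hand-written approximation of the sequential strategy: one equality
   test per pattern, tried in source order, until one succeeds. Neither
   the contents of the strings nor their ordering is exploited. *)
let keyword_code_sequential s =
  if s = "let" then 1
  else if s = "match" then 2
  else if s = "with" then 3
  else 0
```

Both functions return the same result for every input; with n string
patterns, the second form makes the worst-case cost of n string comparisons
explicit.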


For ``constants'' from other datatypes (that is, at the compiler
level, for machine integers), the switch compiler performs many optimizations.
Basically, the compiler mixes tests against constants
(= i, < i, etc., and a special x in [i1..i2] test) and
jump tables.
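
To make the contrast with strings concrete, here is a small sketch on
characters, which are machine integers at the compiler level. The type and
function names are invented for the example, and the second function is only
a hand-written picture of the kind of interval tests mentioned above, not the
code the compiler actually emits (compile with -dlambda to see that).

```ocaml
type kind = Digit | Upper | Lower | Space | Other

(* Matching on character constants and ranges: the constants are machine
   integers, so the switch compiler is free to use interval tests and
   jump tables. *)
let classify (c : char) : kind =
  match c with
  | '0' .. '9' -> Digit
  | 'A' .. 'Z' -> Upper
  | 'a' .. 'z' -> Lower
  | ' ' | '\t' | '\n' -> Space
  | _ -> Other

(* Hand-written interval tests on the character code, in the spirit of
   the "x in [i1..i2]" test. *)
let classify_by_hand (c : char) : kind =
  let i = Char.code c in
  if i >= 48 && i <= 57 then Digit
  else if i >= 65 && i <= 90 then Upper
  else if i >= 97 && i <= 122 then Lower
  else if i = 32 || i = 9 || i = 10 then Space
  else Other
```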

Strings hence remain ``the ghost in the machine'' as regards switch
efficiency.

All this can be explained by history and by the search for a compromise
between compiler complexity on the one hand, and code efficiency on
the other.


If you want efficient search in a set of strings, PM is not the
solution; a library solution is provided by Hashtbl or Map.
More efficient solutions can be obtained by coding, or provided by
third party libraries.
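
A minimal sketch of the library approach, using only the standard Hashtbl
and Map modules. The table contents, the default value 0 and all the names
below are assumptions made for the example.

```ocaml
(* Association list shared by both lookups (illustrative data). *)
let keywords = [ ("let", 1); ("match", 2); ("with", 3) ]

(* Hashtbl version: one hash computation plus a short bucket scan. *)
let keyword_table : (string, int) Hashtbl.t = Hashtbl.create 16

let () = List.iter (fun (k, v) -> Hashtbl.replace keyword_table k v) keywords

let keyword_code_hashtbl s =
  try Hashtbl.find keyword_table s with Not_found -> 0

(* Map version: a balanced tree keyed on the lexical ordering of strings,
   giving a logarithmic number of comparisons. *)
module StringMap = Map.Make (String)

let keyword_map =
  List.fold_left (fun m (k, v) -> StringMap.add k v m) StringMap.empty keywords

let keyword_code_map s =
  try StringMap.find s keyword_map with Not_found -> 0
```

The Map version is the one that uses the ordering on strings, which is
exactly what the match compiler is said above not to exploit.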

As to your original problem, I cannot resist proposing  a quick and
dirty solution, using cpp, still having meaningful line numbers.

yourfile.ml:
#define S1 "...."
#define S2 "xxxx"

...

let f x = match x with
| S1 -> ...
| S2 -> ...


Makefile:
CPP=cpp -E -P
#Some alternatives...
#CPP=/lib/cpp -E -P  (Old fashioned Unix)
#CPP=gcc -E -P -x c  (If you have gcc)
 ....

yourfile.cmo: yourfile.ml
        ocamlc -pp '${CPP}' -c yourfile.ml



Of course, this is quite dirty and the previous messages were giving
clues to much cleaner solutions. In particular, learning how to use
camlp4 is a worthy investment.



> Alex Baretta

--Luc Maranget