Version française
Home     About     Download     Resources     Contact us    
Browse thread
Search for the smallest possible possible Ocaml segfault....
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: pierre.weis@i...
Subject: Re: [Caml-list] STOP (was: Search for the smallest possible possible Ocaml segfault)
Hello world,

> Segfaults in OCaml are seldom, but nevertheless
> those seldom seen segfaults should be fixed.
> 
> The original poster stated out that the bug he
> posted was four months on status "new".
> 
> This was a littlebid astonishing, and possibly
> the reason why this thread was started.

I think this is my fault: I implemented Scanf in the first place.

May be some of you missed the point in the segfault example involving Scanf
that was given on this list: the example involves using positional parameters
in the string argument passed to sscanf and using meta format specifications
in the format string argument.

Positional parameters:
----------------------

Positional parameters are parameters number specifications that allows format
strings to refer to another parameter than the next in the presentation
order. This is supposed to be useful for internationalization where you can
change the printing order of the parameters to reflect the translation.

This new feature has only been introduced in the documentation for Printf
just after the successful correction of the long standing strange behaviour
of printf with respect to partial evaluation. Positional parameters have
never been mentioned in the Scanf documentation and Scanf is not supposed
to supported them.

To say the least, this feature is still experimental and not yet completely
implemented in the current sources of the compiler. In fact, the typechecker
abruptely rejects format strings with positional parameters, as exemplify
here:

# Printf.printf "%2$i %1$s" "toto" 1;;
Bad conversion %$, at char number 0 in format string ``%2$i %1$s''

As mentioned above, Scanf is not supposed to handle positional parameters,
and indeed rejects them at runtime in the format string (this could seem
overkill, given that the type-checker rejects positional parameters in the
first place, but well, 2 checks are better than one!).

Meta format specifications:
---------------------------

However, Scanf is still capable to read a format string lexem given in the
input, provided the format used to read this lexem involves a meta format
that properly describes the format string lexem to be read. For instance:
Scanf.sscanf s "%{ %i %f %}" is supposed to read in the input (the string s)
a format string that specify to read first an integer, then a floating point
number.

A more practical working example could be:

# let fmt =
    Scanf.sscanf "\"Reference: %i      Price : %f\"" "%{%i%f%}"
      (fun fmt -> fmt);;
val fmt : (int -> float -> '_a, '_b, '_c, '_d, '_d, '_a) format6 = <abstr>
# string_of_format fmt;;
- : string = "Reference: %i      Price : %f"

This features allows a procedure to read the format string it has to use to
read a file: the procedure just reads the format as the first line of the
file.

The example seg-fault analysis:
-------------------------------

So, the seg-faulting example given is quite involved, since it uses scanf's
capability to read a format string in the input, in order to create a format
string with positional specification. This was clearly unexpected, given the
``axiom'' that says "the type checker will prevent that in the first place
since it rejects any positional parameter!". In this case, the typechecker
cannot reject the format string given in the program, since it has no
positional specification; and it cannot reject the format read, since this
format is unknown at compile time!

This simply means that there is a bug in the type compatibility runtime test
for format strings that fails to properly reject positional parameters. This
is not difficult to correct, if not at all satifactory.

The corrections or problem suppression:
---------------------------------------

I once thought that introducing positional parameters in Scanf, Printf, and
Format, would be a piece of cake in comparison to correcting the hard bug of
printf's treatment of partial evaluation (this ``misfeature'' stood there for
more than 10 years, before a correction can be figured out). Unfortunately, I
was wrong, positional parameters are not at all easy, even when the printf
behaviour is corrected: admittedly, the runtime implementation for positional
parameters was not too hard for Printf and it has been done quickly; on the
other hand, the type checking of format strings with positional parameters
proved to be untrivial: you need a deep breath to dig into the old code, you
need once more understand it in depth, and basically you must rewrite it with
a new logic that supports positional parameters. This has not yet been done,
unfortunately.

In conclusion, I have to correct the runtime compatibility check for formats
to suppress the problem, and remove any mention of positional parameters from
the documentation, until I achieve the new type checking stuff for format
strings.

Or just give up on this complex feature that has already upset too many people.

I agree with any one who cares that I was wrong to introduce a not yet
properly baked feature in Caml. The natural over optimistic tendancy of my
researcher's enthousiasm caught me there ...

[...]
> I for myself have only stated that I hope the bugs will be fixed.

I hope them to be fixed as well. Sometimes it's difficult. Sometimes we lack
the time and concentration to find the solution. Even worse, sometimes we
never find the solution for years...

Best regards,


PS: You can check in the bug tracking system that there are more than one
message concerning positional parameters in format strings, and that I
already answered to a lot of them.

-- 
Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/