Test case crashes ocamllex #8026

Closed
vicuna opened this issue Feb 22, 2003 · 2 comments

vicuna commented Feb 22, 2003

Original bug ID: 1554
Reporter: administrator
Status: closed
Resolution: fixed
Priority: normal
Severity: minor
Category: ~DO NOT USE (was: OCaml general)

Bug description

The shell dialogue below shows that there is an input file that
crashes ocamllex. The ocamllex version in question was built from
fresh CVS sources this morning.

The shell dialogue also shows that in this case, ocamllex crashes when
it tries to compute a stack trace.

Then we go on to try ocamllex from the ocaml 3.06 distribution. It
works fine.

Then we show that the same problem was present with a version built
from November 2002 CVS sources, but a stack trace can be produced. I
wrote down some thoughts about the stack trace after the shell dialogue.

lobus:/huge/tim/unhacked-ocaml-anon-cvs> printenv OCAMLRUNPARAM
-b

>>>Current CVS fails.<<<
lobus:/huge/tim/unhacked-ocaml-anon-cvs/lex> ./ocamllex ~/s/conscious/ocaml/lex.mll
Fatal error: exception Invalid_argument("Array.get") >>>Bad.<<<
Segmentation fault >>>Crashing when computing a stack trace. Bad.<<<
>>>Here's the input file that breaks it.<<<
lobus:/huge/tim/unhacked-ocaml-anon-cvs/lex> cat ~/s/conscious/ocaml/lex.mll
{
module L = Lexeme
module S = Stream
}
(* eager_lexemes lexbuf returns an eager stream of lexemes. *)
rule eager_lexemes = parse
    '(' { S.icons L.LPAREN (lexemes lexbuf) }
  | eof { S.sempty }
(* lexemes lexbuf returns a lazy stream of lexemes. *)
and lexemes = parse
    (* empty *) { S.slazy (fun () -> eager_lexemes lexbuf) }
>>>This file works fine with ocamllex from 3.06 from Debian.<<<
lobus:/huge/tim/unhacked-ocaml-anon-cvs/lex> /usr/bin/ocamllex ~/s/conscious/ocaml/lex.mll
4 states, 257 transitions, table size 1052 bytes
lobus:/huge/tim/unhacked-ocaml-anon-cvs/lex> dpkg -S ocamllex
ocaml: /usr/bin/ocamllex
ocaml: /usr/share/man/man1/ocamllex.1.gz
ocaml: /usr/share/man/man1/ocamllex.opt.1.gz
ocaml: /usr/lib/ocaml/3.06/camlp4/pa_ocamllex.cma
lobus:/huge/tim/unhacked-ocaml-anon-cvs/lex> dpkg -l ocaml
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-============================================
ii  ocaml          3.06-15        ML language implementation with a class-base
>>>The November 2002 version fails the same way as the current CVS,
except you get a good stacktrace.<<<
lobus:/huge/tim/unhacked-ocaml-anon-cvs> ~/s/lex/ocamllex ~/s/conscious/ocaml/lex.mll
Fatal error: exception Invalid_argument("Array.get")
Raised from a C function
Called from file "lexgen.ml", line 1071, character 38
Called from file "lexgen.ml", line 947, character 18
Called from file "lexgen.ml", line 1151, character 67
Called from file "list.ml", line 57, character 23
Called from file "list.ml", line 57, character 39
Called from file "lexgen.ml", line 1158, character 16
Called from file "main.ml", line 53, character 65
Re-raised at file "main.ml", line 96, character 17
Called from file "main.ml", line 100, character 36
lobus:/huge/tim/unhacked-ocaml-anon-cvs>

Here are some lines from my copy of the November, 2002 version of
lexgen.ml, with a "^" inserted below character 38 of line 1071:

let translate_state shortest_match tags chars follow st =
  let (n,(_,m)) = st.final in
  if MemMap.empty = st.others then
    Perform (n,do_tag_actions n tags m)
                                      ^
  else if shortest_match then begin
    if n=no_action then
      Shift (No_remember,reachs chars follow st.others)
    else
      Perform(n, do_tag_actions n tags m)
  end else begin
    Shift (
      (if n = no_action then
        No_remember
      else
        Remember (n,do_tag_actions n tags m)),
      reachs chars follow st.others)
  end

One hypothesis is that the stack trace is garbage, since it reports an
out-of-bounds Array.get but points to a place that is not a call to
Array.get.

Another hypothesis is that the problem happened in the array reference
to env.(n) at the end of do_tag_actions, and the compiler decided to
inline do_tag_actions, and the code that generates stack traces isn't
kind enough to insert stack frames for inlined subroutines.
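
The second hypothesis can be explored in isolation. The following is a
minimal sketch, assuming a present-day OCaml in which
Printexc.record_backtrace is available (it postdates this report); lookup
and backtrace_of_failed_lookup are hypothetical names, with lookup standing
in for a helper whose last step is an array access such as env.(n).

```ocaml
(* Assumption: a modern OCaml; this API did not exist in the versions discussed here. *)
let () = Printexc.record_backtrace true

(* Hypothetical stand-in for a helper whose last step is env.(n). *)
let lookup (env : int array) (n : int) : int = env.(n)

(* Returns the recorded backtrace text if the lookup fails, None otherwise. *)
let backtrace_of_failed_lookup (env : int array) (n : int) : string option =
  try
    ignore (lookup env n);
    None
  with Invalid_argument _ -> Some (Printexc.get_backtrace ())
```

Whether lookup appears as its own frame in the returned text depends on the
compiler, its inlining settings, and whether the program was built with
debugging information, so the sketch makes no claim about the exact output.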

The problem may have been caused by me putting an empty regexp in the
lex.mll file. This is a sensible thing to do, but the documentation
for ocamllex says it is illegal. Nevertheless it should not crash
ocamllex.

--
Tim Freeman
tim@fungible.com
GPG public key fingerprint ECDF 46F8 3B80 BB9E 575D 7180 76DF FE00 34B1 5C78

vicuna commented Feb 24, 2003

Comment author: administrator

> The shell dialogue below shows that there is an input file that
> crashes ocamllex. The ocamllex version in question was built from
> fresh CVS sources this morning.

There are two bugs here: first the backtrace one, which I will not
discuss, then the ocamllex bug.

{
module L = Lexeme
module S = Stream
}
(* eager_lexemes lexbuf returns an eager stream of lexemes. *)
rule eager_lexemes = parse
    '(' { S.icons L.LPAREN (lexemes lexbuf) }
  | eof { S.sempty }
(* lexemes lexbuf returns a lazy stream of lexemes. *)
and lexemes = parse
    (* empty *) { S.slazy (fun () -> eager_lexemes lexbuf) }

To my surprise this is a design/parsing bug.
Your code is indeed incorrect and should be rejected at parsing.

In fact your << lexemes >> rule is not well formed; the correct code is
and lexemes = parse
    ""(* empty *) { S.slazy (fun () -> eager_lexemes lexbuf) }
    ^
    Empty pattern !

<< \epsilon >> (i.e. nothing) is not a valid ocamllex regular expression.
http://caml.inria.fr/ocaml/htmlman/manual026.html

At the moment the parser interprets your code as
  eager_lexemes -> 2 rules
  lexemes -> 0 rules (here is the parsing bug)
  { ... } -> Trailer

You can check the parser's view of your code by swapping your two lexer
definitions:
% cat a.mll
{
module L = Lexeme
module S = Stream
}
(* eager_lexemes lexbuf returns an eager stream of lexemes. *)
rule lexeme = parse
    (* empty *) { S.slazy (fun () -> eager_lexemes lexbuf) }
and eager_lexemes = parse
    '(' { S.icons L.LPAREN (lexemes lexbuf) }
  | eof { S.sempty }
(* lexemes lexbuf returns a lazy stream of lexemes. *)
% ./ocamllex a.mll
File "a.mll", line 8, character 1: syntax error.

The rest of ocamllex is not prepared for a lexer definition with no
rules, and so it crashes.
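
The general idea of rejecting such a definition early can be sketched as
follows. This is only an illustration under stated assumptions, not the fix
that was applied to ocamllex: the entry record and validate_entry are
hypothetical, simplified stand-ins rather than ocamllex's own types, and the
result type assumes a present-day OCaml.

```ocaml
(* Hypothetical, simplified model of a parsed lexer entry; not ocamllex's own types. *)
type entry = { name : string; clauses : string list }

(* Reject an entry point that has no clauses before any later stage sees it. *)
let validate_entry (e : entry) : (entry, string) result =
  match e.clauses with
  | [] -> Error (Printf.sprintf "entry %s has no rules" e.name)
  | _ :: _ -> Ok e
```

With a check of this kind, the << lexemes >> entry above would be reported
as an error instead of reaching code that assumes at least one rule.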

In conclusion, you did indeed discover a bug (thank you for reporting
it), but your test code should not compile.

All the best,

--Luc

vicuna commented Feb 24, 2003

Comment author: administrator

ocamllex bug: Fixed by Luc on 2003-02-22
backtrace bug: simply triggered by [| |].(1);;
fixed by DD 2003-02-24
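
The one-line trigger quoted above can be reproduced in isolation. The
following is a minimal sketch, assuming a present-day OCaml;
index_empty_array_raises is a hypothetical helper name, and the exception
payload is deliberately not matched because the message text (reported as
"Array.get" in this issue) may differ between OCaml versions.

```ocaml
(* Indexing an empty array at position 1 fails the bounds check. *)
let index_empty_array_raises () : bool =
  let empty : int array = [| |] in
  try
    ignore (empty.(1));
    false
  with Invalid_argument _ -> true  (* payload text is version-dependent *)
```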

vicuna closed this as completed Feb 24, 2003
vicuna added the bug label Mar 19, 2019