Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When a plugin is loaded in the toplevel, Token.Filter.define_filter has no effect before the first syntax error #5579

Closed
vicuna opened this issue Apr 9, 2012 · 5 comments

Comments

@vicuna
Copy link

vicuna commented Apr 9, 2012

Original bug ID: 5579
Reporter: @gasche
Assigned to: @diml
Status: closed (set by @xavierleroy on 2015-12-11T18:04:31Z)
Resolution: open
Priority: normal
Severity: minor
Version: 4.00.0+dev
Fixed in version: 4.00.0+dev
Category: -for Camlp4 use https://github.com/ocaml/camlp4/issues
Related to: #4811
Monitored by: @diml @hcarty

Bug description

As a reduced test case, I have written an example filter that just prints the token stream. The filter is not active at first, and the first erroneous phrase activates it. Observe:

$ ocaml dynlink.cma camlp4o.cma
Objective Caml version 3.12.1

Camlp4 Parsing version 3.12.1

#load "test_plugin.cmo";;

test case loaded

let () = ();;

let () = ;;

Error: Parse error: [expr] expected after "=" (in [cvalue_binding])

let () = ();;

KEYWORD "let"
BLANKS " "
KEYWORD "("
KEYWORD ")"
BLANKS " "
KEYWORD "="
BLANKS " "
KEYWORD "("
KEYWORD ")"
KEYWORD ";;"

This behavior does not happen when invoking directly with ocamlc dynlink.cma camlp4o.cma test_plugin.cmo.

"test case loaded" is a debug statement that is executed when the plugin loads.

Steps to reproduce

(*
ocamlc -pp camlp4r -I +camlp4 -c test_plugin.ml
*)

open Camlp4;

module Id : Sig.Id = struct
value name = "filter bug test case";
value version = "0.1";
end;

module Make (Syntax : Sig.Camlp4Syntax) = struct
open Sig;
include Syntax;

value rec debug_filter = parser
[ [: tok; rest :] -> let () = prerr_endline (Token.to_string (fst tok)) in [: tok; debug_filter rest :] ];

value () = Token.Filter.define_filter (Gram.get_filter ())
(fun filter stream -> filter (debug_filter stream));

value () = prerr_endline "test case loaded";
end;

let module M = Register.OCamlSyntaxExtension(Id)(Make) in ();

@vicuna
Copy link
Author

vicuna commented Apr 9, 2012

Comment author: @gasche

Note: I have reproduced the bug under 3.11.2, 3.12.1 and trunk.

@vicuna
Copy link
Author

vicuna commented Apr 9, 2012

Comment author: @diml

The reason is that the camlp4 toplevel module reuse the same filtered token stream for the same lexing buffer in Toploop.parse_toplevel_phrase.

I think it is safe to restart the parsing from scratch at each invocation. And it seems even wrong not to do it since Toploop discard everything after the first phrase.

@vicuna
Copy link
Author

vicuna commented Apr 10, 2012

Comment author: @diml

I made the change.

Commits 12336 and 12337.

@vicuna
Copy link
Author

vicuna commented Apr 11, 2012

Comment author: @gasche

Thanks for the quick bugfix!

Looking at the patch, I'm wondering why there was a stream caching technique in place. Does this change affect the performances much (I don't know where would toplevel parsing be performance-sensitive; maybe in HOL-light, which uses the OCaml toplevel as its workspace?)?

If that appeared to be a problem, you could restore the caching strategy, using something along the lines of (untested)

let get_and_cache_stream lb =
let not_filtered_token_stream = Lexer.from_lexbuf lb in
let token_stream = Gram.filter (not_filtered not_filtered_token_stream) in
let cached_stream = (token_stream, Gram.get_filter ()) in
do { token_streams.val := [ (lb,cached_stream) :: token_streams.val ];
token_stream } in
match lookup lb token_streams.val with
[ None -> get_and_cache_stream lb
| Some (token_stream, filter) ->
(* the cached filtered stream is only valid if the filter did not
change; we conservatively check for physical equality *)
if filter == Gram.get_filter () then token_stream
else do {
cleanup lb;
get_and_cache_stream lb
} ]

@vicuna
Copy link
Author

vicuna commented Apr 11, 2012

Comment author: @diml

Looking at the patch, I'm wondering why there was a stream caching technique in place. Does this change affect the performances much (I don't know where would toplevel parsing be performance-sensitive; maybe in HOL-light, which uses the OCaml toplevel as its workspace?)?

The difference in term of performances is very minimal (creation of a camlp4 lexer + creation of a stream), so nothing perceptible. I looked in the history: in the old camlp4 (and in the current camlp5) there was no caching. The new camlp4 used only one stream (stored in a ref), leading to #4495 and #4593, the fix was to cache streams this way.

Also not caching token streams fixes this kind of bugs:

1

^CInterrupted.

1;;

Characters 0-1:
1;;
^
Error: This expression is not a function; it cannot be applied

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant