Mantis Bug Tracker

ID: 0005579
Project: OCaml
Category: Camlp4
View Status: public
Date Submitted: 2012-04-09 13:29
Last Update: 2012-04-11 09:45
Reporter: gasche
Assigned To: dim
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: open
Platform:
OS:
OS Version:
Product Version: 4.00.0+dev
Target Version:
Fixed in Version: 4.00.0+dev
Summary: 0005579: When a plugin is loaded in the toplevel, Token.Filter.define_filter has no effect before the first syntax error
Description: As a reduced test case, I have written an example filter that just prints the token stream. The filter is not active at first, and the first erroneous phrase activates it. Observe:

$ ocaml dynlink.cma camlp4o.cma
        Objective Caml version 3.12.1

    Camlp4 Parsing version 3.12.1

# #load "test_plugin.cmo";;
test case loaded
# let () = ();;
# let () = ;;
Error: Parse error: [expr] expected after "=" (in [cvalue_binding])
# let () = ();;
KEYWORD "let"
BLANKS " "
KEYWORD "("
KEYWORD ")"
BLANKS " "
KEYWORD "="
BLANKS " "
KEYWORD "("
KEYWORD ")"
KEYWORD ";;"
#

This behavior does not happen when invoking directly with `ocamlc dynlink.cma camlp4o.cma test_plugin.cmo`.

"test case loaded" is a debug statement that is executed when the plugin loads.
Steps To Reproduce:
(*
  ocamlc -pp camlp4r -I +camlp4 -c test_plugin.ml
*)

open Camlp4;

module Id : Sig.Id = struct
  value name = "filter bug test case";
  value version = "0.1";
end;

module Make (Syntax : Sig.Camlp4Syntax) = struct
  open Sig;
  include Syntax;

  value rec debug_filter = parser
  [ [: `tok; rest :] ->
    let () = prerr_endline (Token.to_string (fst tok)) in
    [: `tok; debug_filter rest :] ];

  value () = Token.Filter.define_filter (Gram.get_filter ())
    (fun filter stream -> filter (debug_filter stream));

  value () = prerr_endline "test case loaded";
end;

let module M = Register.OCamlSyntaxExtension(Id)(Make) in ();
Tags: No tags attached.
Attached Files

- Relationships
related to 0004811 (resolved, dim): Ast filters are not applied in the toplevel

-  Notes
(0007301)
gasche (developer)
2012-04-09 13:37

Note: I have reproduced the bug under 3.11.2, 3.12.1 and trunk.
(0007302)
dim (developer)
2012-04-09 20:02

The reason is that the camlp4 toplevel module reuses the same filtered token stream for the same lexing buffer in Toploop.parse_toplevel_phrase.

I think it is safe to restart parsing from scratch at each invocation. It even seems wrong not to, since Toploop discards everything after the first phrase.
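
To make the explanation above concrete, here is a toy model in plain OCaml (standard library only, OCaml 4.08 or later, original syntax rather than the revised syntax of the test case). It is only a sketch: every name in it (stream, current_filter, cached_stream, fresh_stream, and so on) is hypothetical and none of it is Camlp4's actual code. It shows that a filtered stream cached per lexing buffer keeps using the filter that was current when the stream was created, while rebuilding the stream for each phrase picks up a filter defined in between. In the real toplevel a parse error presumably discards the cached stream, which would explain why the filter starts working after the first erroneous phrase; the model does not simulate that step.

```ocaml
(* Toy model: a stream is a pull function, a filter rewraps a stream. *)
type stream = unit -> string option
type filter = stream -> stream

(* The "current" token filter, initially the identity. *)
let current_filter : filter ref = ref (fun s -> s)

(* Rough analogue of Token.Filter.define_filter: wrap the existing filter. *)
let define_filter (wrap : filter -> filter) =
  current_filter := wrap !current_filter

(* A "lexing buffer" is modelled as a queue of pending tokens. *)
let raw_stream (lb : string Queue.t) : stream = fun () -> Queue.take_opt lb

(* Old behaviour: one filtered stream per buffer, looked up physically. *)
let cache : (string Queue.t * stream) list ref = ref []

let cached_stream lb =
  match List.assq_opt lb !cache with
  | Some s -> s
  | None ->
    let s = !current_filter (raw_stream lb) in
    cache := (lb, s) :: !cache;
    s

(* Behaviour after the fix as described: rebuild the stream every time. *)
let fresh_stream lb = !current_filter (raw_stream lb)

(* Read tokens up to and including ";;". *)
let read_phrase get_stream lb =
  let s = get_stream lb in
  let rec loop acc =
    match s () with
    | None -> List.rev acc
    | Some ";;" -> List.rev (";;" :: acc)
    | Some t -> loop (t :: acc)
  in
  loop []

(* Debug filter: records every token that passes through it. *)
let log : string list ref = ref []

let debug_filter : filter = fun s () ->
  match s () with
  | Some t as r -> log := t :: !log; r
  | None -> None

let feed lb phrase =
  List.iter (fun t -> Queue.add t lb) (String.split_on_char ' ' phrase)

(* One phrase is read, then the filter is defined, then another phrase. *)
let run get_stream =
  log := [];
  current_filter := (fun s -> s);
  cache := [];
  let lb = Queue.create () in
  feed lb "#load plugin ;;";
  ignore (read_phrase get_stream lb);
  define_filter (fun f s -> f (debug_filter s));
  feed lb "let () = () ;;";
  ignore (read_phrase get_stream lb);
  List.rev !log

let with_cache = run cached_stream     (* the debug filter sees nothing *)
let without_cache = run fresh_stream   (* the debug filter sees the phrase *)
```

With the cache, the second phrase is still read through the stream built before define_filter ran, so the log stays empty; without it, the log contains the tokens of the second phrase.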
(0007320)
dim (developer)
2012-04-10 23:18

I made the change.

Commits 12336 and 12337.
(0007323)
gasche (developer)
2012-04-11 06:09

Thanks for the quick bugfix!

Looking at the patch, I'm wondering why there was a stream caching technique in place. Does this change affect performance much? (I don't know where toplevel parsing would be performance-sensitive; maybe in HOL-light, which uses the OCaml toplevel as its workspace.)

If that turned out to be a problem, you could restore the caching strategy, using something along the lines of (untested):

  let get_and_cache_stream lb =
    let not_filtered_token_stream = Lexer.from_lexbuf lb in
    let token_stream = Gram.filter (not_filtered not_filtered_token_stream) in
    let cached_stream = (token_stream, Gram.get_filter ()) in
    do { token_streams.val := [ (lb,cached_stream) :: token_streams.val ];
         token_stream } in
  match lookup lb token_streams.val with
  [ None -> get_and_cache_stream lb
  | Some (token_stream, filter) ->
    (* the cached filtered stream is only valid if the filter did not
       change; we conservatively check for physical equality *)
    if filter == Gram.get_filter () then token_stream
    else do {
      cleanup lb;
      get_and_cache_stream lb
    } ]
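
The invalidation idea in the snippet above can be tried out in isolation with a small plain-OCaml model (standard library only, OCaml 4.08 or later). This is a sketch under assumptions, not Camlp4 code: stream, get_filter, define_filter, cleanup and token_stream are hypothetical stand-ins that merely borrow the names used above. Each cache entry remembers the filter that built its stream, and the entry is reused only while that filter is physically equal to the current one.

```ocaml
(* Hypothetical model: a stream is a pull function, a filter rewraps it. *)
type stream = unit -> string option
type filter = stream -> stream

let current_filter : filter ref = ref (fun s -> s)
let get_filter () = !current_filter

(* In this model, defining a filter REPLACES the filter value. *)
let define_filter (wrap : filter -> filter) =
  current_filter := wrap !current_filter

(* Each entry stores the stream together with the filter that built it. *)
let cache : (string Queue.t * (stream * filter)) list ref = ref []

let cleanup lb = cache := List.filter (fun (lb', _) -> lb' != lb) !cache

let get_and_cache_stream lb =
  let filter = get_filter () in
  let s = filter (fun () -> Queue.take_opt lb) in
  cache := (lb, (s, filter)) :: !cache;
  s

let token_stream lb =
  match List.assq_opt lb !cache with
  | None -> get_and_cache_stream lb
  | Some (s, filter) ->
    (* reuse the stream only if the filter is physically unchanged *)
    if filter == get_filter () then s
    else begin
      cleanup lb;
      get_and_cache_stream lb
    end

(* Usage: two lookups share a stream until a new filter is defined. *)
let lb : string Queue.t = Queue.create ()
let s1 = token_stream lb
let s2 = token_stream lb
let count = ref 0
let () = define_filter (fun f s -> f (fun () -> incr count; s ()))
let s3 = token_stream lb
```

Note that the physical-equality check only detects a change when defining a filter produces a new filter value, as it does in this model. If the filter object were instead mutated in place, the check would never fire; whether that is the case for the value returned by Gram.get_filter is not verified here.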
(0007324)
dim (developer)
2012-04-11 09:45

> Looking at the patch, I'm wondering why there was a stream caching technique in place. Does this change affect performance much? (I don't know where toplevel parsing would be performance-sensitive; maybe in HOL-light, which uses the OCaml toplevel as its workspace.)

The performance difference is minimal (creating a camlp4 lexer plus creating a stream), so nothing perceptible. I looked at the history: in the old camlp4 (and in the current camlp5) there was no caching. The new camlp4 used only one stream (stored in a ref), leading to 0004495 and 0004593; the fix was to cache streams this way.

Also, not caching token streams fixes this kind of bug:

# 1
  ^CInterrupted.
# 1;;
Characters 0-1:
  1;;
  ^
Error: This expression is not a function; it cannot be applied

- Issue History
Date Modified Username Field Change
2012-04-09 13:29 gasche New Issue
2012-04-09 13:37 gasche Note Added: 0007301
2012-04-09 19:48 dim Relationship added related to 0004811
2012-04-09 20:02 dim Note Added: 0007302
2012-04-09 20:04 dim Assigned To => dim
2012-04-09 20:04 dim Reproducibility have not tried => always
2012-04-09 20:04 dim Status new => assigned
2012-04-10 23:18 dim Note Added: 0007320
2012-04-10 23:19 dim Status assigned => resolved
2012-04-10 23:19 dim Product Version 4.01.0+dev => 4.00.0+dev
2012-04-10 23:19 dim Fixed in Version => 4.00.0+dev
2012-04-11 06:09 gasche Note Added: 0007323
2012-04-11 09:45 dim Note Added: 0007324

