Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006497OCamlOCaml generalpublic2014-07-25 07:232014-08-18 21:57
Reporterwhitequark 
Assigned Tofrisch 
PrioritynormalSeverityminorReproducibilityhave not tried
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version4.02.0+beta1 / +rc1 
Target Version4.02.0+devFixed in Version 
Summary0006497: Access to Clflags from ppx preprocessors
DescriptionI have ran several times into a need to access Clflags, which is obviously impossible, as the ppx preprocessor runs in a different process.

These include:

  * Getting the name of the current module, which is critical to 2 of 4 of the extensions I currently have. I'm currently using Location.input_name and deriving the module name from the file basename, but that would break with -for-pack.

  * Getting the list of -I flags. I have several extensions in mind that would do something with Typedtree of other modules, but it would currently be impractical to deploy them.

  * Getting the state of -g flag, to generate information only useful together with backtraces.

  * Getting the kind of compiler (ocamlc/ocamlnat), for conditional compilation.

So I think the Clflags.(for_pack, include_dirs, debug, native_code) should be exposed to ppx like Location.input_name currently is. I consider the lack of Clflags.for_pack a bug; it would be great to have it fixed in 4.02.
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0011909)
whitequark (developer)
2014-07-25 07:24

I forgot one thing in description; I do not think exposing any other Clflags fields makes any sense. Those two should satisfy any needs a preprocessor may have.
(0011910)
frisch (developer)
2014-07-25 09:30

Indeed, it could make sense to pass more contextual information from the caller (which can be ocamlc, ocamlopt, but also ocaml, ocamldoc, ocamldep, etc). Extra information could be passed in several ways:

 - on the ppx command-line: the API already supports passing extra parameters to ppx (in addition to the input/output filenames), and we would just need to tweak the compilers to pass those informations and existing ppx tools to ignore them. "ppx drivers" (which orchestrate several actual ppx mappers) would need to pass all such argument along.

 - as extra @@@attributes inserted in front of the AST by the compiler (and maybe directly interpreted by the Ast_mapper driver so that ppx have easily access to the information). This might be more robust than command-line arguments (length limitation under Windows, escaping issues, easier to pass structured data if needed and to ignore irrelevant attributes, automatic passing between nested ppx).

 - by changing the binary protocol around ppx, i.e. change the type of the marshaled data to include an extra record holding all the relevant values. A downside is that currently, the exact same format (and magic number) is also used by -pp (i.e. camlp4/camlp5 in practice), which makes it possible, for instance, to send the binary output of camlp4 directly to a ppx, on the command-line (without going through a compiler).

Do you have an opinion?

Concerning the actual set of information to pass, I'm wondering whether we should define exactly which options are passed, or whether we should pass the entire command-line (ppx would need to reparse it, taking OCAMLPARAMS into account, but we would automatically support all current and future flags).
(0011911)
whitequark (developer)
2014-07-25 09:56

Naturally, I would first reach for changing the binary protocol. It's actually simple to keep the ability to pipe camlp4 to ppx, you just need to match on magic number.

I think command-line is not a good idea for the reasons you have listed. I do not have a preference between changing binary protocol and @@@attributes, except for the fact that I really don't like the complexity of the serializer/deserializer associated with @@@attributes (but it could be hidden in the compiler).

I think parsing command-line should be done in exactly one place for consistency. Additionally, OCaml tools frequently have differents sets of valid arguments and it is not clear how Ast_mapper should deal with that; it would have to duplicate the option parsers of different tools or give up on "parsing all options" goal.

I don't see a point in passing all flags. I think the simplest solution is to change the protocol and magic number, and pass a record with the four flags I've mentioned, but there are certain advantages in passing that record via @@@attributes.
(0011912)
frisch (developer)
2014-07-25 10:11

If we don't pass the full command-line, we should at least send the tool name (either as a concrete sum type or as a string to be more extensible) to support the toplevel, ocamldoc, ocamldep in addition to ocamlc/ocamlopt.

I think that @@@attributes are more coherent with how we pass warnings/errors back from ppx to the compiler. Moreover, they would not require to change the magic number if we ever want to pass more information (e.g. from tools such as ocamldoc, or maybe merlin).

Do you want to give it a try?
(0011913)
whitequark (developer)
2014-07-25 10:18

I think sending tool name as a string is OK. In @@@attributes there isn't a choice anyway.

I think @@@attributes are worth a try. Would you write a patch, or should I?
(0011914)
whitequark (developer)
2014-07-25 10:20

Actually, if tool name is sent, there is no need to send native_code, as it is just [tool_name = "ocamlopt"].
(0011915)
frisch (developer)
2014-07-25 11:00

> Would you write a patch, or should I?

I won't have time to work on it in the next days, so feel free to hack happily and provide a patch if we wish!
(0011916)
frisch (developer)
2014-07-25 16:33

Should we have a single attributes, e.g.:

 [@@@ocaml.ppx.context {tool_name="ocamlc"; load_path=["...";...;"..."]; for_pack=None; debug=false}]

or rather multiple attributes:

 [@@@ocaml.ppx.tool_name "ocamlc"]
 [@@@ocaml.ppx.load_path ["...";...;"..."]]
 [@@@ocaml.ppx.for_pack None] (* or allow to omit in case default is used *)
 [@@@ocaml.ppx.debug false] (* idem *)


I don't have a strong preference.
(0011917)
whitequark (developer)
2014-07-25 16:35

I prefer single attributes. Easier to skip, less noise in debugging output, etc.

Where do we export tool_name? Ast_mapper?
(0011918)
frisch (developer)
2014-07-25 16:40

> Where do we export tool_name? Ast_mapper?

Yes, this makes it clear that the information is for ppx. I assume you have a internal reference in mind, with an accessor function, and the reference (together with other references in Clflags) would be set by Ast_mapper.apply by parsing the top of the AST to find the discussed attribute?
(0011919)
whitequark (developer)
2014-07-25 16:41

Yes, it is simple and consistent with existing interfaces.
(0011920)
whitequark (developer)
2014-07-25 16:42

What to do if Ast_mapper ~somehow~ receives a malformed ppx.context attribute? I like any solution that does not silently ignore that.
(0011921)
frisch (developer)
2014-07-25 16:44

What about having the ppx report an error (or a warning?) in the resulting AST?
(0011922)
whitequark (developer)
2014-07-25 16:47

Do you think replacing the @@@ocaml.ppx.context with @@@ocaml.error is appropriate?
(0011924)
frisch (developer)
2014-07-25 16:48

Probably rather with [%%ocaml.error] (i.e. the extension node recognized by the compiler).
(0011925)
whitequark (developer)
2014-07-25 16:49

Right, I meant [%%ocaml.error].
(0011926)
whitequark (developer)
2014-07-26 09:24

I just realized another (related) problem, there is no entry point to read Location.input_name from. You can't do it as:

   let mapper argv =
      foo := Location.input_name; (* foo = "_no" *)
      { default_mapper with ... }

and you can't do it in mapper.structure, because that can be called several times while nested.

Use case: I want to keep track of full module path for types in ppx_deriving. It's probably possible by adding more global state, but meh. Also, that'll break if ppx drivers will reuse whatever is returned by the mapper generator function applied to command-line options.

Any chance Ast_mapper.mapper will gather a `toplevel_structure` or a similar field, and signature counterparts?
(0011931)
frisch (developer)
2014-07-28 11:33

It would be possible to arrange so that Location.input_name is set before the function creating the mapper (from command-line arguments) is executed.

Concerning the "toplevel_structure", the current API can be used to have a specific behavior at toplevel:

 let mapper argv =
   let super = default_mapper in
   let structure m str = (* ... *) in
   let mapper = {super with structure} in
   {mapper with structure = (fun _ str ->
         (* do something special at toplevel, but recurse with mapper *)
         mapper.structure mapper str)
   }
(0011950)
whitequark (developer)
2014-07-30 22:58

@doligez: I'm currently working on this. Expect a PR in a few days.
(0011960)
whitequark (developer)
2014-07-31 18:43

@Alain: how would we *output* the tool_name field? I'm currently thinking of setting the Ast_mapper.tool_name reference in the corresponding drivers. Don't see a cleaner solution.
(0011963)
whitequark (developer)
2014-07-31 21:00

https://github.com/ocaml/ocaml/pull/85 [^]

Test code:

    open Ast_mapper
    open Parsetree

    let print_env () =
      Printf.printf "\ntool_name: %s\n" !Ast_mapper.tool_name;
      Printf.printf "include_dirs: %s\n" ([%derive.Show: string list] !Clflags.include_dirs);
      Printf.printf "open_module: %s\n" ([%derive.Show: string list] !Clflags.open_module);
      Printf.printf "for_package: %s\n" ([%derive.Show: string option] !Clflags.for_package);
      Printf.printf "debug: %B\n" !Clflags.debug

    let () =
      Ast_mapper.register "t" (fun argv ->
        { default_mapper with
          structure = (fun mapper items ->
            Pprintast.structure Format.std_formatter items; print_env (); items) })

How to build:

    ocamlfind c -package ppx_deriving -package compiler-libs.common -linkpkg testmap.ml -o testmap

How to test:

    $ ocamlfind c -g -for-pack Foo -package num -package unix -package ppx_tools -open Big_int -open Unix -ppx ./testmap foo.ml
    [@@@ocaml.ppx.context
      {
        tool_name = "ocamlc";
        include_dirs =
          ["/home/whitequark/.opam/local/lib/ppx_tools";
          "/home/whitequark/.opam/local/lib/ocaml/compiler-libs";
          "/home/whitequark/.opam/local/lib/num"];
        open_module = ["Unix"; "Big_int"];
        for_package = (Some "Foo");
        debug = true
      }]
    tool_name: ocamlc
    include_dirs: ["/home/whitequark/.opam/local/lib/ppx_tools"; "/home/whitequark/.opam/local/lib/ocaml/compiler-libs"; "/home/whitequark/.opam/local/lib/num"]
    open_module: ["Unix"; "Big_int"]
    for_package: Some ("Foo")
    debug: true
(0011985)
frisch (developer)
2014-08-07 10:55

Thanks! I've committed the patch to 4.02 (commit 15062), with some modifications (mostly cosmetics, except for tool_name which is now passed explicitly by the calling tool when it applies ppx processors instead of setting the global reference).
(0011986)
frisch (developer)
2014-08-07 11:29

Commit 15064 fixes the toplevel. It was broken because it received a typedtree containing the extra context attribute when it was expecting a simple Tstr_eval. The attribute is now removed in the same function which adds it (apply_rewriters) and it never flows to the type-checker.
(0011987)
frisch (developer)
2014-08-07 11:47

Also committed to trunk (15068).
(0011988)
whitequark (developer)
2014-08-07 12:06

Excellent, thanks! 15064 would also fix utop (regular toplevel seemed to work for me, but perhaps I am mistaken.)

- Issue History
Date Modified Username Field Change
2014-07-25 07:23 whitequark New Issue
2014-07-25 07:24 whitequark Note Added: 0011909
2014-07-25 09:30 frisch Note Added: 0011910
2014-07-25 09:56 whitequark Note Added: 0011911
2014-07-25 10:11 frisch Note Added: 0011912
2014-07-25 10:18 whitequark Note Added: 0011913
2014-07-25 10:20 whitequark Note Added: 0011914
2014-07-25 11:00 frisch Note Added: 0011915
2014-07-25 16:33 frisch Note Added: 0011916
2014-07-25 16:35 whitequark Note Added: 0011917
2014-07-25 16:40 frisch Note Added: 0011918
2014-07-25 16:41 whitequark Note Added: 0011919
2014-07-25 16:42 whitequark Note Added: 0011920
2014-07-25 16:44 frisch Note Added: 0011921
2014-07-25 16:47 whitequark Note Added: 0011922
2014-07-25 16:48 frisch Note Added: 0011924
2014-07-25 16:49 whitequark Note Added: 0011925
2014-07-26 09:24 whitequark Note Added: 0011926
2014-07-28 11:33 frisch Note Added: 0011931
2014-07-30 22:54 doligez Status new => feedback
2014-07-30 22:54 doligez Target Version => 4.02.0+dev
2014-07-30 22:58 whitequark Note Added: 0011950
2014-07-30 22:58 whitequark Status feedback => new
2014-07-31 18:43 whitequark Note Added: 0011960
2014-07-31 21:00 whitequark Note Added: 0011963
2014-08-07 10:55 frisch Note Added: 0011985
2014-08-07 11:27 frisch Assigned To => frisch
2014-08-07 11:27 frisch Status new => assigned
2014-08-07 11:29 frisch Note Added: 0011986
2014-08-07 11:47 frisch Note Added: 0011987
2014-08-07 12:06 whitequark Note Added: 0011988
2014-08-18 21:57 frisch Status assigned => resolved
2014-08-18 21:57 frisch Resolution open => fixed


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker