Mantis Bug Tracker

ID: 0007050
Project: OCaml
Category: ~DO NOT USE (was: OCaml general)
View Status: public
Date Submitted: 2015-11-19 17:30
Last Update: 2017-03-02 11:51
Reporter: gasche
Assigned To: administrator
Priority: normal
Severity: feature
Reproducibility: N/A
Status: resolved
Resolution: fixed
Platform, OS, OS Version: (empty)
Product Version: (empty)
Target Version: 4.03.1+dev
Fixed in Version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Summary: 0007050: An "-args <path>" option to pass command-line flags in a file
Description: The Coq team had to disable native_compute on Windows because the "-I .." flags passed to the OCaml compiler would overflow Windows' command-line size limit.

We removed general @args-file handling from the OCaml runtime, but there is a clear need for a similar size-limit-avoiding feature in the tools of the compiler distribution.
Additional Information: One option would be to add this feature to the Arg module in general. I'm fine with some sort of (Arg.from_file ...) call that would help OCaml programmers implement this in their projects, but I would like to restrict the scope of the present PR to a compiler-tools-only option, in the interest of minimality and in the hope of reaching trunk before the December feature freeze.
Tags: No tags attached.
Attached Files: readwords.ml (2,938 bytes) 2015-11-20 16:11

- Relationships
parent of 0007011 (resolved, shinwell): ocamldep: Argument list too long
related to 0005937 (acknowledged): Support for passing more parameters from external files

-  Notes
(0014734)
frisch (developer)
2015-11-19 17:32
edited on: 2015-11-19 17:33

flexdll would also probably need something similar (and the compiler will have to use the feature when calling it).

A similar topic is https://github.com/alainfrisch/flexdll/issues/7 (this is about calling external commands from flexdll, not its own CLI).

(0014736)
xleroy (administrator)
2015-11-20 13:50
edited on: 2015-11-20 13:52

Before we start coding like crazy, it would be good to agree on a format for the contents of response files. Several formats exist in the wild:

1- One argument = one line (without the terminating newline). Makes it trivial to have spaces in arguments.
1a- Variant: have an escape to represent newlines inside arguments (dubious)
1b- Variant: if there are null bytes in the file, split arguments on null bytes instead (for compatibility with find -print0).

2- One whitespace-delimited word = one argument. To handle space in arguments, we need a quoting mechanism:
2a- POSIX shell quoting (single quotes, double quotes, backslash escapes, all the work)
2b- Windows style quoting (creative use of double quotes and backslashes)
2c- Whatever is the inverse of Filename.quote on the current platform.

Each of these is a small matter of programming, but please let's agree on a spec first.
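
For illustration, a minimal sketch of formats 1 and 1b in OCaml. The function names are hypothetical and this is not the code that was eventually merged; it assumes the whole file has already been read into a string and that a trailing terminator does not introduce an extra empty argument.

```ocaml
(* Sketch of formats 1 and 1b. All names here are hypothetical. *)

(* Like String.split_on_char, except that a trailing separator does not
   produce a final empty argument. *)
let split_terminated sep s =
  match List.rev (String.split_on_char sep s) with
  | "" :: rest -> List.rev rest
  | _ -> String.split_on_char sep s

(* Tolerate Windows line endings (an assumption, not part of the note). *)
let strip_cr line =
  let n = String.length line in
  if n > 0 && line.[n - 1] = '\r' then String.sub line 0 (n - 1) else line

(* Format 1b: if the contents hold a null byte, split on null bytes;
   otherwise format 1: one argument per line, no quoting. *)
let args_of_contents s =
  if String.contains s '\000' then split_terminated '\000' s
  else List.map strip_cr (split_terminated '\n' s)
```

With this reading, an argument containing spaces needs no quoting at all, and an argument containing a newline can only be expressed in the null-byte variant.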

(0014737)
gasche (developer)
2015-11-20 14:04

I want the feature and am willing to be flexible on the details, so for me it's a "whatever the people with a strong opinion agree on".

Below are a few personal (mild) preferences.

a) I would avoid null bytes to make it easier for non-expert users to edit in a text editor etc. (so no 1b).

b) This is not a hard design goal, but it would be nice if we could have enough platform-independence to let users distribute args-files in their project and share them across systems (e.g. "my preferred warning settings are to be found in mywarning.args, ready to use with -args").

c) On the other hand, it is important to be able to generate args-files programmatically, in particular from an OCaml program (this is the Coq use-case: I want to synthesize a call to the OCaml compiler with a very long list of include flags). For filesystem paths in particular, there must be a simple, supported way to quote them from an OCaml program (preferably one already in the stdlib).

(Note that the use-cases I can think of for system-agnostic sharing of args-files (b) do not involve passing filesystem paths as options (which are not portable anyway), except maybe relative paths inside the project's source repository. Is there one relative path escaping convention that all Sys implementations support?)
(0014738)
xleroy (administrator)
2015-11-20 15:11

> Is there one relative path escaping convention that all Sys implementations support?

If the filename contains neither backslashes nor double quotes (implying that on Windows you use forward slashes as directory delimiters), you can put double quotes around it and the result reads back identically under POSIX shell conventions and under Windows conventions. Otherwise, the two conventions differ fundamentally.
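
A minimal sketch of that rule in OCaml. `portable_quote` is a hypothetical name, not a stdlib function, and it checks only the two conditions stated above. A real POSIX shell also treats `$` and backquotes specially inside double quotes, so those would need separate care if the result were ever handed to a shell.

```ocaml
(* Hypothetical helper: double-quote a path only when the quoted form
   reads back identically under both conventions, i.e. when the path
   contains neither a backslash nor a double quote. *)
let portable_quote path =
  if String.contains path '\\' || String.contains path '"'
  then None                          (* the two conventions would differ *)
  else Some ("\"" ^ path ^ "\"")
```

For example, `portable_quote "src/my lib"` yields the quoted string, while a Windows path written with backslashes is rejected.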
(0014739)
frisch (developer)
2015-11-20 15:16

I'd go for 1, which is the easiest one. What would be the practical interest of supporting newlines or null bytes in command-line arguments?
(0014740)
gasche (developer)
2015-11-20 15:25

I'm fine with 1 as it seems to support all my preferences (better: it requires no quoting, so file paths have an unambiguous meaning regardless of the OS).

Given that my feature request is only for the tools of the compiler distribution, and that none of those tools expect arbitrary text as an argument, the inability to represent \n will not be a problem. If we someday decide to generalize the feature to, say, users of the Arg module, we may regret that choice (but even then...), but I'd be happy with the simplest solution.

If someone is going to wince at the lack of generality of the approach, it's probably Damien.
(0014745)
xleroy (administrator)
2015-11-20 16:12

Attached is a proof-of-concept implementation of approach #2. It implements two flavors of double quoting, the POSIX-like one and the Win32-like one. I'm pretty sure that it correctly inverts the effect of Filename.quote on the respective platforms.
(0014747)
gasche (developer)
2015-11-20 16:37
edited on: 2015-11-20 16:52

(I removed the .unquote suggestion that was here, the signature and usage are too different.)

I'm worried people would find it overkill, but we could have three options, -args, -args-unix and -args-win, with the latter two fixing an explicit convention and the first defaulting to the convention of the current OS. This solves both my needs (b) and (c). If I can't get my three options, we can also drop (b: portable use) in favor of (c: script-friendly from the same machine) by just having -args.

(0014748)
frisch (developer)
2015-11-20 16:41

Has anyone ever called the compiler or related tools with arguments containing newline characters? (I could imagine this might be the case for ocamldoc, for some textual arguments, but I strongly doubt it.) Or is this discussion only justified by a desire to be generic enough to support inclusion in Arg?
(0014749)
gasche (developer)
2015-11-20 16:51

On a purely subjective level, I have a mild preference for telling users "just write in that file whatever you would write on the command line", which is allowed by approach (2) (Xavier's patch) but not by approach (1) (they have to remember to use a very-simple-but-different syntax).
(0014750)
xleroy (administrator)
2015-11-20 17:53

Alain, don't obsess over newlines in filenames, it's anecdotal. But consider this. We need at least to handle file and directory paths that contain spaces, so some form of quoting is needed. Then, we should handle flags like -pp and -ppx, which take possibly complex commands as arguments, hence nested quotes can occur.

At this point, we can either go for shell-like syntax that users are already familiar with, or with more ad-hoc approaches like "one argument per line" that they will struggle with.

And, yes, I'm shooting for something generic enough to be put in module Arg.
(0014760)
frisch (developer)
2015-11-20 18:44

> contain spaces, so some form of quoting is needed

Do you consider "one argument per line" as some form of quoting?

> "one argument per line" that they will struggle with.

What would be the difficulty? I expect these args to be most commonly generated (in which case the "one argument per line, no quoting" is rather straightforward), and rarely user-written (and even in that case, not having to use different quoting depending on the OS is beneficial to portability).

The direct correspondence between lines in the file and what goes into Sys.argv is nice and simple. This might be a developer-, not a user-centric perspective, but I tend to consider the need for quoting on the command-line as a superficial (and painful) detail of current shell syntax, not anything deeper. This is especially coherent with how arguments are passed to processes on Unix; much less so on Windows.
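
To make the "straightforward to generate" point concrete, a minimal sketch of a generator for the one-argument-per-line format, in the spirit of the Coq use-case of many include flags. `write_args_file` and `include_flags` are hypothetical names; rejecting arguments that contain a newline is a design choice of this sketch, since the format cannot represent them.

```ocaml
(* Hypothetical generator for the "one argument per line, no quoting"
   format. Arguments containing a newline cannot be represented, so they
   are rejected before anything is written. *)
let write_args_file path args =
  List.iter (fun a ->
    if String.contains a '\n' then
      invalid_arg "write_args_file: argument contains a newline") args;
  let oc = open_out_bin path in
  List.iter (fun a -> output_string oc a; output_char oc '\n') args;
  close_out oc

(* Turn a list of directories into "-I"; dir pairs, one argument each. *)
let include_flags dirs =
  List.concat (List.map (fun d -> ["-I"; d]) dirs)
```

Paths with spaces or backslashes are written verbatim, so the same generator works unchanged on Unix and Windows.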
(0015818)
gasche (developer)
2016-04-19 20:48

I think we missed the mark on 4.03. I would still be interested in having the feature in.

If there are good justifications for supporting both a "like on the command line" syntax and a "one argument per line" syntax, why not have both? We could have -args be the shell-quoted one, and -args1 be the one-per-line one (the 1 comes from the "ls -1" option; it may appear in other places).
(0015976)
schommer (reporter)
2016-06-07 16:36

I would be in favor of having some kind of response file support: GNU make for Windows seems to have an even more limited command-line length, and the dependency generation of CompCert, for example, seems to hit this limit.

I would suggest using the same convention as gcc does for @files:

"Read command-line options from file. The options read are inserted in place of the original @file option. If file does not exist, or cannot be read, then the option will be treated literally, and not removed.
Options in file are separated by whitespace. A whitespace character may be included in an option by surrounding the entire option in either single or double quotes. Any character (including a backslash) may be included by prefixing the character to be included with a backslash. The file may itself contain additional @file options; any such options will be processed recursively."
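
A minimal sketch of a splitter for the syntax quoted above, written in OCaml. It follows only the quoted description (whitespace separation, single or double quotes, backslash escaping any character) and does not expand nested @file options. `split_gcc_style` is a hypothetical name, and details such as honoring backslash escapes inside quotes are assumptions of this sketch rather than a statement about what GCC does.

```ocaml
let is_space c = c = ' ' || c = '\t' || c = '\n' || c = '\r'

(* Hypothetical splitter for the GCC-like response-file syntax. *)
let split_gcc_style (s : string) : string list =
  let buf = Buffer.create 64 in
  let args = ref [] in
  let in_word = ref false in   (* true once the current argument has started *)
  let quote = ref None in      (* Some q while inside a q-quoted region *)
  let finish () =
    if !in_word then begin
      args := Buffer.contents buf :: !args;
      Buffer.clear buf;
      in_word := false
    end in
  let n = String.length s in
  let i = ref 0 in
  while !i < n do
    let c = s.[!i] in
    begin match !quote with
    | _ when c = '\\' && !i + 1 < n ->
        (* Backslash: take the next character literally. *)
        incr i;
        Buffer.add_char buf s.[!i];
        in_word := true
    | Some q when c = q -> quote := None          (* closing quote *)
    | Some _ -> Buffer.add_char buf c             (* inside quotes *)
    | None when c = '\'' || c = '"' ->
        quote := Some c;                          (* opening quote *)
        in_word := true
    | None when is_space c -> finish ()           (* argument boundary *)
    | None -> Buffer.add_char buf c; in_word := true
    end;
    incr i
  done;
  finish ();
  List.rev !args
```

Note that under this syntax a Windows path must have every backslash doubled, which is the drawback for Windows users raised elsewhere in this discussion.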
(0016084)
xleroy (administrator)
2016-07-19 16:15

Bernhard Schommer and I discussed this issue in another context (CompCert also needs response files...). I realized that my readwords.ml sample implementation is not quite right for the Win32 case, but I know how to fix it.

The question remains: what syntax do we want for response files?
1- Something that resembles shell syntax on the target platform.
   Pros: it can invert Filename.quote; minimal surprise for the users?
   Cons: nontrivial code with two different implementations (POSIX+Win32).
2- The GCC syntax.
   Pros: we can say "it's just like GCC"; syntax is independent of platform.
   Cons: cannot invert Filename.quote; unfriendly to Windows users (backslashes in file names must always be escaped)
3- One word per line.
   Pros: trivial to parse, trivial to generate.
   Cons: not human-friendly.
(0016085)
gasche (developer)
2016-07-19 16:23

I think it would be very nice to have the property that a given arguments file works on any user system (e.g. I can check my default argument list into version control and have other devs use it), which makes (1) impractical.

I think either (2) or (3) is fine. We could easily support both with -args and one of -argsn, -argsln or -args1.
(0016086)
frisch (developer)
2016-07-19 16:31

Why is "one word per line" not human-friendly? I'd say that not needing any quoting is rather simpler and thus more human-friendly. It's also much easier to generate, and response files are often generated by build systems or other tools; having to replicate some non-trivial quoting logic (platform-dependent or not) in each such tool could be tedious.
(0016097)
schommer (reporter)
2016-07-20 09:13

For the compiler:
(2) seems most suitable to me, since the GNU tools as well as clang use this format (except for clang in CL compatibility mode). Also, writing such files from other tools is not that hard: the writeargv code from libiberty just adds \ before each [ \r\t\n\'\"\\].
One could additionally add support for Microsoft-style response files for compatibility.

Concerning the name of the new option:
I think the reason gcc uses the @ is that the expansion of response files can happen before any argument parsing happens. Another advantage of using @ is that it would not clash with other options.

For an addition to the Arg module:
One could add two functions, expandargv and writeargv, with an optional argument that chooses the quoting variant used.
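
As a rough illustration of the escaping rule described in this note, a minimal OCaml sketch that puts a backslash before each character of the class [ \r\t\n\'\"\\]. The names `escape_arg` and `response_file_contents` are hypothetical, and this is an approximation of the idea rather than a port of libiberty's writeargv; in particular, an empty argument is not representable this way.

```ocaml
(* Characters that get a backslash in front of them, per the note. *)
let needs_escape = function
  | ' ' | '\r' | '\t' | '\n' | '\'' | '"' | '\\' -> true
  | _ -> false

(* Hypothetical helper: escape one argument for a GCC-style response
   file. Limitation: the empty string escapes to nothing and is lost. *)
let escape_arg arg =
  let buf = Buffer.create (String.length arg + 8) in
  String.iter (fun c ->
    if needs_escape c then Buffer.add_char buf '\\';
    Buffer.add_char buf c) arg;
  Buffer.contents buf

(* One escaped argument per line; any whitespace would do as separator. *)
let response_file_contents args =
  String.concat "\n" (List.map escape_arg args) ^ "\n"
```

The generator needs no knowledge of the host platform, which is the portability argument made for this format.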
(0016261)
doligez (administrator)
2016-09-02 15:15

Option (3) is the most appealing to me. Its only drawback is that it's not surjective: you can't have a linefeed character in an argument.

So if we want to be complete (which is not obvious to me) we should probably have both -args and -args1 as suggested by Gabriel.

As for the @ syntax, unfortunately we cannot use it because of the warn-error syntax.
(0016262)
frisch (developer)
2016-09-02 15:23

Note: the discussion on the syntax for response files continued on:

   https://github.com/ocaml/ocaml/pull/748


The one about allowing "extra parameter injection in Args" on:

   https://github.com/ocaml/ocaml/pull/778
(0016978)
shinwell (developer)
2016-12-12 17:27

It looks to me as if response file support has been both implemented and documented.
(0016980)
gasche (developer)
2016-12-12 17:31

Yes, this was implemented in 4.05.0 by Bernhard Schommer -- I included the changelog entry below for reference. Thanks for the triaging.

> - PR#7050, GPR#748 GPR#843 GPR#864: new `-args/-args0 <file>` parameters to
> provide extra command-line arguments in a file -- see documentation.
> User programs may implement similar options using the new `Expand`
> constructor of the `Arg` module.
> (Bernhard Schommer, review by Jérémie Dimino, Gabriel Scherer
> and Damien Doligez, discussion with Alain Frisch and Xavier Leroy,
> feature request from the Coq team)

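A minimal sketch of how a user program might offer similar options with the `Expand` constructor mentioned in the changelog entry, assuming OCaml 4.05 or later. `Arg.Expand`, `Arg.read_arg`, `Arg.read_arg0` and `Arg.parse_and_expand_argv_dynamic` are stdlib names; the tool name `mytool`, the option set, and the `parse` wrapper are hypothetical.

```ocaml
(* Collected results; filled in by the option handlers below. *)
let includes = ref []
let files = ref []

let speclist = ref [
  "-I", Arg.String (fun d -> includes := d :: !includes),
    "<dir>  Add <dir> to the search path";
  (* Expand splices the arguments read from the file in place of the
     option. read_arg expects newline-terminated arguments, read_arg0
     null-terminated ones. *)
  "-args", Arg.Expand Arg.read_arg,
    "<file>  Read additional newline-terminated arguments from <file>";
  "-args0", Arg.Expand Arg.read_arg0,
    "<file>  Read additional null-terminated arguments from <file>";
]

(* Expand is only accepted by the *_expand_* parsing functions, so this
   sketch uses parse_and_expand_argv_dynamic rather than Arg.parse. *)
let parse argv =
  Arg.parse_and_expand_argv_dynamic (ref 0) (ref argv) speclist
    (fun f -> files := f :: !files)
    "usage: mytool [options] files"
```

A real program would typically call `Arg.parse_expand` on `Sys.argv` instead; the explicit `argv` parameter here only makes the sketch easy to exercise. A matching file can be produced with `Arg.write_arg`.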

- Issue History
Date Modified Username Field Change
2015-11-19 17:30 gasche New Issue
2015-11-19 17:30 gasche Status new => assigned
2015-11-19 17:30 gasche Assigned To => gasche
2015-11-19 17:32 frisch Note Added: 0014734
2015-11-19 17:33 frisch Note Edited: 0014734
2015-11-20 13:50 xleroy Note Added: 0014736
2015-11-20 13:51 xleroy Note Edited: 0014736
2015-11-20 13:52 xleroy Note Edited: 0014736
2015-11-20 14:04 gasche Note Added: 0014737
2015-11-20 15:11 xleroy Note Added: 0014738
2015-11-20 15:16 frisch Note Added: 0014739
2015-11-20 15:25 gasche Note Added: 0014740
2015-11-20 16:11 xleroy File Added: readwords.ml
2015-11-20 16:12 xleroy Note Added: 0014745
2015-11-20 16:37 gasche Note Added: 0014747
2015-11-20 16:39 gasche Note Edited: 0014747
2015-11-20 16:41 frisch Note Added: 0014748
2015-11-20 16:51 gasche Note Added: 0014749
2015-11-20 16:52 gasche Note Edited: 0014747
2015-11-20 17:53 xleroy Note Added: 0014750
2015-11-20 18:44 frisch Note Added: 0014760
2015-11-28 19:38 xleroy Relationship added parent of 0007011
2016-04-19 20:46 gasche Target Version 4.03.0+dev / +beta1 => 4.03.1+dev
2016-04-19 20:48 gasche Note Added: 0015818
2016-06-07 16:36 schommer Note Added: 0015976
2016-07-19 16:15 xleroy Note Added: 0016084
2016-07-19 16:23 gasche Note Added: 0016085
2016-07-19 16:31 frisch Note Added: 0016086
2016-07-20 09:13 schommer Note Added: 0016097
2016-09-02 15:15 doligez Note Added: 0016261
2016-09-02 15:23 frisch Note Added: 0016262
2016-12-12 17:27 shinwell Note Added: 0016978
2016-12-12 17:29 shinwell Status assigned => resolved
2016-12-12 17:29 shinwell Resolution open => fixed
2016-12-12 17:31 gasche Note Added: 0016980
2016-12-12 17:31 gasche Assigned To gasche => administrator
2016-12-12 17:31 gasche Fixed in Version => 4.05.0 +dev/beta1/beta2/beta3/rc1
2017-02-23 16:36 doligez Category OCaml general => -OCaml general
2017-03-02 11:51 doligez Relationship added related to 0005937
2017-03-03 17:55 doligez Category -OCaml general => -(deprecated) general
2017-03-03 18:01 doligez Category -(deprecated) general => ~deprecated (was: OCaml general)
2017-03-06 17:04 doligez Category ~deprecated (was: OCaml general) => ~DO NOT USE (was: OCaml general)

