An "-args <path>" option to pass command-line flags in a file #7050

Closed
vicuna opened this issue Nov 19, 2015 · 22 comments

vicuna commented Nov 19, 2015

Original bug ID: 7050
Reporter: @gasche
Assigned to: administrator
Status: resolved (set by @mshinwell on 2016-12-12T16:29:31Z)
Resolution: fixed
Priority: normal
Severity: feature
Target version: 4.03.1+dev
Fixed in version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Category: ~DO NOT USE (was: OCaml general)
Related to: #5937
Parent of: #7011
Monitored by: @gasche

Bug description

The Coq team had to disable native_compute on Windows because the "-I .." flags to be passed to the OCaml compiler would overflow Windows' command-line size limit.

We removed general @args-file handling from the OCaml runtime, but there is a clear need for a similar size-limit-avoiding feature in the tools of the compiler distribution.

Additional information

One option would be to try to add this feature to the Arg module in general. I'm fine with some sort of (Arg.from_file ...) call that would help OCaml programmers implement this in their projects, but I would like to restrict the scope of the present PR to a compiler-tools-only option in the interest of minimality, and in the hope of reaching trunk before the December feature freeze.

File attachments

vicuna commented Nov 19, 2015

Comment author: @alainfrisch

flexdll would also probably need something similar (and the compiler will have to use the feature when calling it).

A similar topic is ocaml/flexdll#7 (this is about calling external commands from flexdll, not its own CLI).

vicuna commented Nov 20, 2015

Comment author: @xavierleroy

Before we start coding like crazy, it would be good to agree on a format for the contents of response files. Several formats exist in the wild:

1- One argument = one line (without the terminating newline). Makes it trivial to have spaces in arguments.
1a- Variant: have an escape to represent newlines inside arguments (dubious)
1b- Variant: if there are null bytes in the file, split arguments on null bytes instead (for compatibility with find -print0).

2- One whitespace-delimited word = one argument. To handle space in arguments, we need a quoting mechanism:
2a- POSIX shell quoting (single quotes, double quotes, backslash escapes, all the work)
2b- Windows style quoting (creative use of double quotes and backslashes)
2c- Whatever is the inverse of Filename.quote on the current platform.

Each of these is a small matter of programming, but please let's agree on a spec first.
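
For illustration, format 1 together with variant 1b could be read along the following lines. This is a minimal sketch, not the attached proof of concept and not compiler code; `split_args` is a hypothetical name, and the file contents are assumed to be already loaded into a string.

```ocaml
(* Hypothetical helper: split the contents of a response file into
   arguments, following format 1 with variant 1b. *)
let split_args (contents : string) : string list =
  (* Variant 1b: if the file contains a null byte, split on null bytes. *)
  let sep = if String.contains contents '\000' then '\000' else '\n' in
  let fields = String.split_on_char sep contents in
  (* Every argument is terminated by the separator, so a well-formed file
     yields one trailing empty field; drop it. *)
  match List.rev fields with
  | "" :: rest -> List.rev rest
  | _ -> fields

let () =
  List.iter print_endline
    (split_args "-I\n/path with spaces/lib\n-w\n+a-4\n")
```

No quoting is involved, so spaces survive untouched; the price is that an argument cannot contain the separator itself.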

vicuna commented Nov 20, 2015

Comment author: @gasche

I want the feature and am willing to be flexible on the details, so for me it's a "whatever the people with a strong opinion agree on".

Below are a few personal (mild) preferences.

a) I would avoid null bytes to make it easier for non-expert users to edit in a text editor etc. (so no 1b).

b) This is not a hard design goal, but it would be nice to have enough platform-independence to let users distribute args files in their projects and share them across systems (e.g. "my preferred warning settings are to be found in mywarning.args, ready to use with -args").

c) On the other hand, it is important to be able to generate args files programmatically, in particular from an OCaml program (this is the Coq use-case: I want to synthesize a call to the OCaml compiler with a very long list of include flags). For filesystem paths in particular, there must be a simple, supported way to quote them from an OCaml program (preferably already in the stdlib).

(Note that the use-cases for system-agnostic sharing of args-file (b) I can think of do not involve passing filesystem paths as options (which are not portable anyway), except maybe relative paths inside the project's source repository. Is there one relative path escaping convention that all Sys implementations support?)

vicuna commented Nov 20, 2015

Comment author: @xavierleroy

> Is there one relative path escaping convention that all Sys implementations support?

If the filename contains no backslashes nor double quotes (implying: on Windows you use forward slashes as directory delimiters), you can put double quotes around it and the result reads back identically with POSIX shell conventions and with Windows conventions. Otherwise, the two conventions differ fundamentally.
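
The rule above can be sketched as a small check. `portable_quote` is a hypothetical name, and the code only encodes the condition as stated in this comment; a real POSIX shell would additionally treat `$` and backquotes specially inside double quotes, which this sketch assumes a response-file reader would not.

```ocaml
(* Hypothetical helper: quote a path so that it reads back identically under
   POSIX-style and Windows-style double quoting. Returns None when the path
   contains a backslash or a double quote, since the two conventions then
   disagree. *)
let portable_quote (path : string) : string option =
  if String.contains path '\\' || String.contains path '"'
  then None
  else Some ("\"" ^ path ^ "\"")

let () =
  match portable_quote "C:/Program Files/ocaml/lib" with
  | Some quoted -> print_endline quoted
  | None -> prerr_endline "path needs platform-specific quoting"
```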

vicuna commented Nov 20, 2015

Comment author: @alainfrisch

I'd go for 1, which is the easiest one. What would be the practical interest of supporting newlines or null bytes in command lines arguments?

vicuna commented Nov 20, 2015

Comment author: @gasche

I'm fine with 1 as it seems to support all my preferences (better: it requires no quoting, so file paths have an unambiguous meaning regardless of the OS).

Given that my feature request is only for the tools of the compiler distribution, and that none of those tools expect arbitrary text as argument, the inability to represent \n will not be a problem. If we someday decide to generalize the feature to, say, users of the Arg module, we may regret that choice (but even then...), but I'd be happy with the simplest solution.

If someone is going to wince at the lack of generality of the approach, it's probably Damien.

vicuna commented Nov 20, 2015

Comment author: @xavierleroy

Attached is a proof-of-concept implementation of approach #2. It implements two flavors of double quoting, the POSIX-like one and the Win32-like one. I'm pretty sure that it correctly inverts the effect of Filename.quote on the respective platforms.

vicuna commented Nov 20, 2015

Comment author: @gasche

(I removed the .unquote suggestion that was here, the signature and usage are too different.)

I'm worried people would find it overkill, but we could have three options, -args, -args-unix and -args-win, with the latter two fixing an explicit convention and the first defaulting to the convention of the current OS. This covers both my needs (b) and (c). If I can't get my three options, we can also drop (b: portable use) in favor of (c: script-friendly from the same machine) by just having -args.

vicuna commented Nov 20, 2015

Comment author: @alainfrisch

Has anyone ever called the compiler or related tools with arguments containing newline characters? (I could imagine this might be the case for ocamldoc, for some textual arguments, but I strongly doubt it.) Or is this discussion only justified by a desire to be generic enough to support inclusion in Arg?

vicuna commented Nov 20, 2015

Comment author: @gasche

On a purely subjective level, I have a mild preference for telling users "just write in that file whatever you would write in the command-line", which is allowed by approach (2) (Xavier's patch) but not by approach 1 (they have to remember to use a very-simple-but-different syntax).

vicuna commented Nov 20, 2015

Comment author: @xavierleroy

Alain, don't obsess over newlines in filenames, it's anecdotal. But consider this. We need at least to handle file and directory paths that contain spaces, so some form of quoting is needed. Then, we should handle flags like -pp and -ppx, which take possibly complex commands as arguments, hence nested quotes can occur.

At this point, we can either go for shell-like syntax that users are already familiar with, or with more ad-hoc approaches like "one argument per line" that they will struggle with.

And, yes, I'm shooting for something generic enough to be put in module Arg.

vicuna commented Nov 20, 2015

Comment author: @alainfrisch

> contain spaces, so some form of quoting is needed

Do you consider "one argument per line" as some form of quoting?

> "one argument per line" that they will struggle with.

What would be the difficulty? I expect these args to be most commonly generated (in which case the "one argument per line, no quoting" is rather straightforward), and rarely user-written (and even in that case, not having to use different quoting depending on the OS is beneficial to portability).

The direct correspondence between lines in the file and what goes into Sys.argv is nice and simple. This might be a developer-, not a user-centric perspective, but I tend to consider the need for quoting on the command-line as a superficial (and painful) detail of current shell syntax, not anything deeper. This is especially coherent with how arguments are passed to processes on Unix; much less so on Windows.
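
As an illustration of the generated case, here is a minimal sketch that renders a list of include directories as a one-argument-per-line file body. `include_args` is a hypothetical name; the only check needed is rejecting a newline inside a directory name, which this format cannot represent.

```ocaml
(* Hypothetical helper: render include directories as the text of a
   one-argument-per-line response file. Each "-I" and each directory goes on
   its own line, so each becomes one entry of Sys.argv. *)
let include_args (dirs : string list) : string =
  let buf = Buffer.create 256 in
  List.iter
    (fun dir ->
      if String.contains dir '\n' then
        invalid_arg "include_args: newline in directory name";
      Buffer.add_string buf "-I\n";
      Buffer.add_string buf dir;
      Buffer.add_char buf '\n')
    dirs;
  Buffer.contents buf

let () = print_string (include_args ["../lib"; "C:/my libs/coq"])
```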

vicuna commented Apr 19, 2016

Comment author: @gasche

I think we missed the mark on 4.03. I would still be interested in having the feature in.

If there are good justifications for supporting both a "like on the command line" syntax and a "one argument per line" syntax, why not have both? We could have -args be the shell-quoted one, and -args1 be the one-per-line one (the 1 comes from the "ls -1" option, it may appear in other places).

vicuna commented Jun 7, 2016

Comment author: schommer

I would be in favor of having some kind of response file support, since it seems that GNU make for Windows has an even more limited command-line length; for example, the dependency generation of CompCert seems to hit this limit.

I would suggest using the same convention as gcc does for @files:

"Read command-line options from file. The options read are inserted in place of the original @file option. If file does not exist, or cannot be read, then the option will be treated literally, and not removed.
Options in file are separated by whitespace. A whitespace character may be included in an option by surrounding the entire option in either single or double quotes. Any character (including a backslash) may be included by prefixing the character to be included with a backslash. The file may itself contain additional @file options; any such options will be processed recursively."
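
The splitting rules in the quoted text can be sketched as a small tokenizer. This is hypothetical code written from the description above, not taken from GCC or from the OCaml compiler; `split_gcc_style` is an invented name, recursive @file expansion is left out, and an unterminated quote or trailing backslash is silently accepted.

```ocaml
(* Sketch of the quoted convention: whitespace separates options, single or
   double quotes group characters, and a backslash makes the next character
   literal (inside or outside quotes). *)
let split_gcc_style (s : string) : string list =
  let buf = Buffer.create 64 in
  let args = ref [] in
  let in_word = ref false in   (* true once the current option has started *)
  let quote = ref None in      (* Some '\'' or Some '"' while inside quotes *)
  let escaped = ref false in   (* true right after a backslash *)
  let flush () =
    if !in_word then begin
      args := Buffer.contents buf :: !args;
      Buffer.clear buf;
      in_word := false
    end
  in
  String.iter
    (fun c ->
      if !escaped then begin
        Buffer.add_char buf c;
        escaped := false
      end else
        match c, !quote with
        | '\\', _ -> escaped := true; in_word := true
        | _, Some q when c = q -> quote := None
        | _, Some _ -> Buffer.add_char buf c
        | ('\'' | '"'), None -> quote := Some c; in_word := true
        | (' ' | '\t' | '\n' | '\r'), None -> flush ()
        | _, None -> Buffer.add_char buf c; in_word := true)
    s;
  flush ();
  List.rev !args

let () =
  List.iter print_endline
    (split_gcc_style "-I 'my dir' -pp \"cmd -x\" a\\ b")
```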

vicuna commented Jul 19, 2016

Comment author: @xavierleroy

Bernhard Schommer and I discussed this issue in another context (CompCert also needs response files...). I realized that my readword.ml sample implementation is not quite right for the Win32 case, but I know how to fix it.

The question remains: what syntax do we want for response files?
1- Something that resembles shell syntax on the target platform.
   Pros: it can invert Filename.quote; minimal surprise for the users?
   Cons: nontrivial code with two different implementations (POSIX+Win32).
2- The GCC syntax.
   Pros: we can say "it's just like GCC"; syntax is independent of platform.
   Cons: cannot invert Filename.quote; unfriendly to Windows users (backslashes in file names must always be escaped).
3- One word per line.
   Pros: trivial to parse, trivial to generate.
   Cons: not human-friendly.

vicuna commented Jul 19, 2016

Comment author: @gasche

I think it would be very nice to have the property that a given arguments file works on any user system (e.g. I can check my default argument list into version control and have other devs use it), which makes (1) impractical.

I think either (2) or (3) are fine. We could easily support both with -args and one of -argsn, -argsln or -args1.

vicuna commented Jul 19, 2016

Comment author: @alainfrisch

Why is "one word per line" not human-friendly? I'd say that not needing any quoting is rather simpler and thus more human-friendly. It's also much easier to generate and response files are often generated by build systems or other tools; and having to replicate some non-trivial quoting logic (platform dependent or not) in each such tool could be tedious.

vicuna commented Jul 20, 2016

Comment author: schommer

For the compiler:
(2) seems most suitable to me, since the GNU tools as well as clang use this format (except for clang in CL compatibility mode). Also, writing such files from other tools is not that hard: the writeargv code from libiberty just adds a \ before each [ \r\t\n'"\].
One could additionally add support for Microsoft-style response files for compatibility.

Concerning the name of the new option:
I think the reason GCC uses @ is that the expansion of response files can then happen before any argument parsing. Another advantage of using @ is that it would not clash with other options.

For an addition to the Arg module:
One could add two functions, expandargv and writeargv, with an optional argument that chooses the quoting variant used.
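
The escaping rule mentioned above (a backslash before each whitespace character, quote, or backslash) is short enough to sketch. `escape_arg` is a hypothetical name and the code is written from the description in this comment, not copied from libiberty; note that an empty argument has no representation under this rule alone.

```ocaml
(* Hypothetical writer-side helper: prefix whitespace, quotes and
   backslashes with a backslash so the argument reads back as one option. *)
let escape_arg (arg : string) : string =
  let buf = Buffer.create (String.length arg + 8) in
  String.iter
    (fun c ->
      (match c with
       | ' ' | '\r' | '\t' | '\n' | '\'' | '"' | '\\' ->
         Buffer.add_char buf '\\'
       | _ -> ());
      Buffer.add_char buf c)
    arg;
  Buffer.contents buf

let () = print_endline (escape_arg "C:\\my dir\\lib")
```

This also makes the Windows drawback concrete: every backslash in a path is doubled.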

vicuna commented Sep 2, 2016

Comment author: @damiendoligez

Option (3) is the most appealing to me. Its only drawback is that it's not surjective: you can't have a linefeed character in an argument.

So if we want to be complete (which is not obvious to me) we should probably have both -args and -args1 as suggested by Gabriel.

As for the @ syntax, unfortunately we cannot use it because of the warn-error syntax.

vicuna commented Sep 2, 2016

Comment author: @alainfrisch

Note: the discussion on the syntax for response files continued on:

#748

The one about allowing "extra parameter injection in Args" on:

#778

vicuna commented Dec 12, 2016

Comment author: @mshinwell

It looks to me as if response file support has been both implemented and documented.

vicuna closed this as completed Dec 12, 2016

vicuna commented Dec 12, 2016

Comment author: @gasche

Yes, this was implemented in 4.05.0 by Bernhard Schommer -- I included the changelog entry below for reference. Thanks for the triaging.
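
For readers looking for the outcome: the changelog entry is not reproduced here, but as far as I know the 4.05 standard library exposes `Arg.read_arg` and `Arg.write_arg` for newline-terminated files (with `Arg.read_arg0` and `Arg.write_arg0` for null-terminated ones). The sketch below assumes those functions and simply round-trips an argument list through a temporary file; `roundtrip` is a hypothetical name.

```ocaml
(* Round-trip arguments through a newline-terminated response file using
   Arg.write_arg and Arg.read_arg (assumed available, OCaml >= 4.05). *)
let roundtrip (args : string array) : string array =
  let file = Filename.temp_file "demo" ".args" in
  Arg.write_arg file args;
  let back = Arg.read_arg file in
  Sys.remove file;
  back

let () =
  let args = [| "-I"; "/path with spaces/lib"; "-w"; "+a-4" |] in
  assert (roundtrip args = args)
```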
