|Anonymous | Login | Signup for a new account||2018-05-25 11:09 CEST|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0007594||OCaml||runtime system and C interface||public||2017-07-21 14:51||2017-08-03 15:27|
|Target Version||4.06.0 +dev/beta1/beta2/rc1||Fixed in Version||4.06.0 +dev/beta1/beta2/rc1|
|Summary||0007594: Should String_val return 'const char *' when safe-string is globally enabled?|
|Description||Since GPR 687 (https://github.com/ocaml/ocaml/pull/687 [^]) it's possible to enable -safe-string globally at configure time.|
There's a second GPR open (https://github.com/ocaml/ocaml/pull/1252 [^]) that makes configuration-time -safe-string the default.
At present, -safe-string only affects OCaml code; C stubs that mutate OCaml strings will continue to compile without error. This may lead to subtle bugs, not least since the compiler now takes string immutability into account during optimizations (https://github.com/ocaml/ocaml/pull/703 [^]).
Should we update String_val to return 'const char *' when -safe-string is enabled, like this?
#define String_val(x) ((const char *) Bp_val(x))
#define String_val(x) ((char *) Bp_val(x))
We'd also need a new 'Bytes_val' that always returned 'char *'.
This change would be slightly backwards-incompatible, but probably not any more so than defaulting to -safe-string. And it seems likely to improve the quality of code, by making it more explicit which C code can actually modify OCaml string/bytes values.
|Tags||No tags attached.|
If we do this, the following common idiom
char * p = String_val(v);
and probably many others will warn or break compilation if -warn-error. That means breakage not only in .ml files but also on .c files.
If anyone wants to experiment with this approach and report on the breakage, that would be most welcome.
And of course we need
#define Bytes_val(x) ((char *) Bp_val(x))
On second thoughts, make that "(unsigned char *)" just to gain more machine independence. ("char" can be signed or unsigned.)
I ran some preliminary experiments, changing String_val in trunk to return 'const char *', and looking at what needed to be changed in the distribution. The branch is here:
* The changes in the OCaml distribution are relatively modest (66 additions, 53 deletions)
* The signatures of several public functions need additional 'const'. Example:
-extern void unix_error (int errcode, char * cmdname, value arg)
+extern void unix_error (int errcode, const char * cmdname, value arg)
This is unlikely to affect calls, since 'char *' arguments can be passed where 'const char*' is expected.
* Unsurprisingly, several internal uses of String_val need adjustment, either to
const char *c = String_val(...)
unsigned char *c = Bytes_val(...)
* Using 'unsigned char *' as the return value of Bytes_val occasionally needs extra casts in calling code. For example, caml_alloc_sprintf uses vsnprintf to write to a string, and vsnprintf's first argument is 'char *', not 'unsigned char *'
It'd also be helpful to look at some packages that use C extensions. I plan to look at a few such packages shortly.
Thanks for the experiment, these are interesting data points.
> Using 'unsigned char *' as the return value of Bytes_val occasionally needs extra casts in calling code. For example, caml_alloc_sprintf uses vsnprintf to write to a string, and vsnprintf's first argument is 'char *', not 'unsigned char *'
Just one remark: I think Bytes_val should be reserved for data that has type bytes on the Caml side, not to gain write access to a Caml string. For the latter use, "(char *) String_val(v)" makes the intent clearer, in my opinion.
> Just one remark: I think Bytes_val should be reserved for data that has type bytes on the Caml side, not to gain write access to a Caml string. For the latter use, "(char *) String_val(v)" makes the intent clearer, in my opinion.
Thanks for the clarification. I agree that what you suggest is better.
Ok, here are a few statistics about String_val and OPAM packages.
Of the OPAM packages that call String_val (and that aren't explicitly constrained to an earlier OCaml version)
- 130 install without problems
- 94 aren't installable on trunk because of version constraints in dependencies
- 10 are not installable on my system because of external dependency issues
- 17 don't compile with OCaml trunk due to problems in OCaml code or unrelated C problems
- 4 packages actually fail with String_val related-problems when String_val is changed to return 'const char *':
|I'm in favour of a change along these lines.|
|I've opened a pull request: https://github.com/ocaml/ocaml/pull/1274 [^]|
|2017-07-21 14:51||yallop||New Issue|
|2017-07-21 15:02||xleroy||Note Added: 0018108|
|2017-07-21 15:02||xleroy||Status||new => acknowledged|
|2017-07-21 15:19||xleroy||Note Added: 0018109|
|2017-07-21 16:20||yallop||Note Added: 0018111|
|2017-07-21 18:14||xleroy||Note Added: 0018115|
|2017-07-24 22:14||yallop||Note Added: 0018135|
|2017-07-31 17:24||shinwell||Note Added: 0018149|
|2017-08-02 12:22||yallop||Note Added: 0018158|
|2017-08-02 16:02||shinwell||Status||acknowledged => resolved|
|2017-08-02 16:02||shinwell||Resolution||open => duplicate|
|2017-08-02 16:02||shinwell||Assigned To||=> yallop|
|2017-08-03 15:27||yallop||Status||resolved => assigned|
|2017-08-03 15:27||yallop||Status||assigned => resolved|
|2017-08-03 15:27||yallop||Fixed in Version||=> 4.06.0 +dev/beta1/beta2/rc1|
|2017-08-03 15:27||yallop||Resolution||duplicate => fixed|
|Copyright © 2000 - 2011 MantisBT Group|