Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007594OCamlruntime system and C interfacepublic2017-07-21 14:512017-08-03 15:27
Reporteryallop 
Assigned Toyallop 
PrioritynormalSeverityminorReproducibilityN/A
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version 
Target Version4.06.0+devFixed in Version4.06.0+dev 
Summary0007594: Should String_val return 'const char *' when safe-string is globally enabled?
DescriptionSince GPR 687 (https://github.com/ocaml/ocaml/pull/687 [^]) it's possible to enable -safe-string globally at configure time.

There's a second GPR open (https://github.com/ocaml/ocaml/pull/1252 [^]) that makes configuration-time -safe-string the default.

At present, -safe-string only affects OCaml code; C stubs that mutate OCaml strings will continue to compile without error. This may lead to subtle bugs, not least since the compiler now takes string immutability into account during optimizations (https://github.com/ocaml/ocaml/pull/703 [^]).

Should we update String_val to return 'const char *' when -safe-string is enabled, like this?

   #ifdef CAML_SAFE_STRING
   #define String_val(x) ((const char *) Bp_val(x))
   #else
   #define String_val(x) ((char *) Bp_val(x))
   #endif

We'd also need a new 'Bytes_val' that always returned 'char *'.

This change would be slightly backwards-incompatible, but probably not any more so than defaulting to -safe-string. And it seems likely to improve the quality of code, by making it more explicit which C code can actually modify OCaml string/bytes values.
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0018108)
xleroy (administrator)
2017-07-21 15:02

If we do this, the following common idiom

  char * p = String_val(v);

and probably many others will warn or break compilation if -warn-error. That means breakage not only in .ml files but also on .c files.

If anyone wants to experiment with this approach and report on the breakage, that would be most welcome.


(0018109)
xleroy (administrator)
2017-07-21 15:19

And of course we need

#define Bytes_val(x) ((char *) Bp_val(x))

in <caml/mlvalues.h>.

On second thoughts, make that "(unsigned char *)" just to gain more machine independence. ("char" can be signed or unsigned.)
(0018111)
yallop (developer)
2017-07-21 16:20

I ran some preliminary experiments, changing String_val in trunk to return 'const char *', and looking at what needed to be changed in the distribution. The branch is here:

   https://github.com/yallop/ocaml/compare/trunk...yallop:const-string-val [^]

Summary:

   * The changes in the OCaml distribution are relatively modest (66 additions, 53 deletions)

   * The signatures of several public functions need additional 'const'. Example:

      -extern void unix_error (int errcode, char * cmdname, value arg)
      +extern void unix_error (int errcode, const char * cmdname, value arg)

     This is unlikely to affect calls, since 'char *' arguments can be passed where 'const char*' is expected.

   * Unsurprisingly, several internal uses of String_val need adjustment, either to

       const char *c = String_val(...)

     or to

       unsigned char *c = Bytes_val(...)

   * Using 'unsigned char *' as the return value of Bytes_val occasionally needs extra casts in calling code. For example, caml_alloc_sprintf uses vsnprintf to write to a string, and vsnprintf's first argument is 'char *', not 'unsigned char *'

It'd also be helpful to look at some packages that use C extensions. I plan to look at a few such packages shortly.
(0018115)
xleroy (administrator)
2017-07-21 18:14

Thanks for the experiment, these are interesting data points.

> Using 'unsigned char *' as the return value of Bytes_val occasionally needs extra casts in calling code. For example, caml_alloc_sprintf uses vsnprintf to write to a string, and vsnprintf's first argument is 'char *', not 'unsigned char *'

Just one remark: I think Bytes_val should be reserved for data that has type bytes on the Caml side, not to gain write access to a Caml string. For the latter use, "(char *) String_val(v)" makes the intent clearer, in my opinion.
(0018135)
yallop (developer)
2017-07-24 22:14

> Just one remark: I think Bytes_val should be reserved for data that has type bytes on the Caml side, not to gain write access to a Caml string. For the latter use, "(char *) String_val(v)" makes the intent clearer, in my opinion.

Thanks for the clarification. I agree that what you suggest is better.

Ok, here are a few statistics about String_val and OPAM packages.

Of the OPAM packages that call String_val (and that aren't explicitly constrained to an earlier OCaml version)

  - 130 install without problems

  - 94 aren't installable on trunk because of version constraints in dependencies

  - 10 are not installable on my system because of external dependency issues

  - 17 don't compile with OCaml trunk due to problems in OCaml code or unrelated C problems

  - 4 packages actually fail with String_val related-problems when String_val is changed to return 'const char *':
     delimcc
     ocaml-expat
     cairo
     milter
(0018149)
shinwell (developer)
2017-07-31 17:24

I'm in favour of a change along these lines.
(0018158)
yallop (developer)
2017-08-02 12:22

I've opened a pull request: https://github.com/ocaml/ocaml/pull/1274 [^]

- Issue History
Date Modified Username Field Change
2017-07-21 14:51 yallop New Issue
2017-07-21 15:02 xleroy Note Added: 0018108
2017-07-21 15:02 xleroy Status new => acknowledged
2017-07-21 15:19 xleroy Note Added: 0018109
2017-07-21 16:20 yallop Note Added: 0018111
2017-07-21 18:14 xleroy Note Added: 0018115
2017-07-24 22:14 yallop Note Added: 0018135
2017-07-31 17:24 shinwell Note Added: 0018149
2017-08-02 12:22 yallop Note Added: 0018158
2017-08-02 16:02 shinwell Status acknowledged => resolved
2017-08-02 16:02 shinwell Resolution open => duplicate
2017-08-02 16:02 shinwell Assigned To => yallop
2017-08-03 15:27 yallop Status resolved => assigned
2017-08-03 15:27 yallop Status assigned => resolved
2017-08-03 15:27 yallop Fixed in Version => 4.06.0+dev
2017-08-03 15:27 yallop Resolution duplicate => fixed


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker