Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006141OCamlOCaml documentationpublic2013-08-27 16:222014-07-25 16:48
Reporterjohnwhitington 
Assigned To 
PrioritynormalSeverityfeatureReproducibilityhave not tried
StatusnewResolutionopen 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0006141: Document Char.unsafe_chr
DescriptionRationale:

1) Char.unsafe_chr is widely used in third-party low level libraries which are not part of stdlib.

2) People who don't know Char.unsafe_chr exists replicate it using Obj.magic (this is what I used to do) This is bad.

3) It's no more unsafe that other unsafe_ functions in the stdlib which are documented.

As an example, in very fast lexing code, where we know an integer is actually a byte in the range 0...255, Char.chr can be quite a cost.
Tagsjunior_job, patch
Attached Filesdiff file icon unsafe_chr_documentation.diff [^] (581 bytes) 2013-12-29 16:58 [Show Content]

- Relationships

-  Notes
(0010769)
pminten (reporter)
2013-12-29 16:59

Here's a patch to add a bit of documentation to Char.unsafe_chr.
(0011906)
doligez (administrator)
2014-07-24 20:18

"the program might crash" is bad (and I don't believe it's true with the current implementation). We're trying to move away from unsafe functions and anything that can cause a crash if misused: for example we sometimes daydream about removing Array.unsafe_get and unsafe_set.

So I'm a bit unsatisfied with this version.
(0011907)
dbuenzli (reporter)
2014-07-24 20:53

I understand the philosophical point of view but when working on encoders/decoders it's still useful to be able to bypass the array bound checking on each access since you usually already checked sizes just before accessing a chunk of bytes and if I (once) measured correctly the performance improvements in doing so are non-negligible. So usually that's what I do:

https://github.com/dbuenzli/uutf/blob/master/src/uutf.ml#L14 [^]
https://github.com/dbuenzli/jsonm/blob/master/src/jsonm.ml#L12 [^]
https://github.com/dbuenzli/dicomm/blob/master/src/dicomm.ml#L23 [^]
https://github.com/dbuenzli/otfm/blob/master/src/otfm.ml#L11 [^]
etc.

But that's not really satisfactory, as you need to change the source code to make the module safe. Now we have the -unsafe option that turns *all* safe accesses into unsafe ones, which may be a little bit to coarse grained (not to mention people actually wrongly *relying* on the Invalid_argument exception raised by out of bound accesses). I would argue that we actually want the converse, i.e. a -safe option that turns unsafe accesses into safe ones. Do you think such a scheme could be implemented in the compiler ?
(0011923)
johnwhitington (reporter)
2014-07-25 16:48

The unsafe_ functions are, indeed, most dangerous because they can cause corruption which eventually causes a crash distant from the call to an _unsafe function. This makes for an even worse problem than a stack overflow from a non tail-recursive Standard Library function which crashes quickly, or a floating Not_found.

Perhaps the text in the patch needs finessing, but I'd like to see it documented, if only to prevent people using Obj.magic.

it would be a nice idea to grep through the whole of OPAM for uses of Obj.magic in third-party code, and see how many are actually required, and how many could be replaced with existing functions like Char.unsafe_chr.

- Issue History
Date Modified Username Field Change
2013-08-27 16:22 johnwhitington New Issue
2013-08-28 11:05 doligez Status new => confirmed
2013-08-28 11:05 doligez Target Version => 4.01.1+dev
2013-09-03 17:18 gasche Tag Attached: junior_job
2013-12-29 16:58 pminten File Added: unsafe_chr_documentation.diff
2013-12-29 16:59 pminten Note Added: 0010769
2014-01-17 17:57 doligez Tag Attached: patch
2014-05-25 20:20 doligez Target Version 4.01.1+dev => 4.02.0+dev
2014-07-24 20:18 doligez Note Added: 0011906
2014-07-24 20:18 doligez Severity minor => feature
2014-07-24 20:18 doligez Status confirmed => feedback
2014-07-24 20:18 doligez Target Version 4.02.0+dev =>
2014-07-24 20:53 dbuenzli Note Added: 0011907
2014-07-25 16:48 johnwhitington Note Added: 0011923
2014-07-25 16:48 johnwhitington Status feedback => new


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker