Read/write binary representation of int32, int64 and float #5494

Closed
vicuna opened this issue Jan 27, 2012 · 6 comments

vicuna commented Jan 27, 2012

Original bug ID: 5494
Reporter: @mjambon
Status: resolved (set by @xavierleroy on 2012-03-14T09:37:25Z)
Resolution: suspended
Priority: normal
Severity: feature
Platform: All
Version: 3.12.1
Category: standard library

Bug description

It would be convenient to have fast and reliable functions for reading
and writing the binary representation of int32, int64 and float
(without going through int64). Here is an interface suggestion:

All functions read or write a string at the specified position.

Modules Int32 and Int64:

val read_binary : string -> int -> t
val write_binary : string -> int -> t -> unit
val unsafe_read_binary : string -> int -> t
val unsafe_write_binary : string -> int -> t -> unit

Module Pervasives (or a Float module, if there were one):

val read_binary_float : string -> int -> float
val write_binary_float : string -> int -> float -> unit
val unsafe_read_binary_float : string -> int -> float
val unsafe_write_binary_float : string -> int -> float -> unit
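For illustration only, here is a minimal portable sketch of what the checked Int32 variants could look like, written with byte-wise shifts rather than any runtime trick. Big-endian byte order is an assumption (the proposal does not fix one), and the write function takes `Bytes.t` rather than `string` because strings are immutable in current OCaml. The `unsafe_` variants would be the same code without the bounds check.

```ocaml
(* Hypothetical sketch of the proposed Int32 functions; big-endian assumed. *)

(* Read 4 bytes of [s] starting at [pos] as a big-endian int32. *)
let read_binary (s : string) (pos : int) : int32 =
  if pos < 0 || pos > String.length s - 4 then
    invalid_arg "read_binary";
  let byte i = Int32.of_int (Char.code (String.unsafe_get s (pos + i))) in
  Int32.logor
    (Int32.logor (Int32.shift_left (byte 0) 24) (Int32.shift_left (byte 1) 16))
    (Int32.logor (Int32.shift_left (byte 2) 8) (byte 3))

(* Write [x] into 4 bytes of [b] starting at [pos], big-endian.
   Takes Bytes.t instead of string: a deviation from the proposed signature. *)
let write_binary (b : Bytes.t) (pos : int) (x : int32) : unit =
  if pos < 0 || pos > Bytes.length b - 4 then
    invalid_arg "write_binary";
  for i = 0 to 3 do
    let shifted = Int32.shift_right_logical x (8 * (3 - i)) in
    Bytes.unsafe_set b (pos + i)
      (Char.unsafe_chr (Int32.to_int (Int32.logand shifted 0xFFl)))
  done
```

This version is portable but not especially fast, which is part of why a primitive-backed implementation was being requested.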

For the record, I am currently using the following code for floats,
taken from https://github.com/mjambon/biniou/blob/master/bi_io.ml.
It is somewhat fragile, to say the least:

let float_endianness =
  match String.unsafe_get (Obj.magic 1.0) 0 with
      '\x3f' -> Big
    | '\x00' -> Little
    | _ -> assert false

let read_untagged_float64 ib =
  let i = Bi_inbuf.read ib 8 in
  let s = ib.i_s in
  let x = Obj.new_block Obj.double_tag 8 in
  (match float_endianness with
       Little ->
         for j = 0 to 7 do
           String.unsafe_set (Obj.obj x) (7-j) (String.unsafe_get s (i+j))
         done
     | Big ->
         for j = 0 to 7 do
           String.unsafe_set (Obj.obj x) j (String.unsafe_get s (i+j))
         done
  );
  (Obj.obj x : float)

let write_untagged_float64 ob x =
  let i = Bi_outbuf.alloc ob 8 in
  let s = ob.o_s in
  (match float_endianness with
       Little ->
         for j = 0 to 7 do
           String.unsafe_set s (i+j) (String.unsafe_get (Obj.magic x) (7-j))
         done
     | Big ->
         for j = 0 to 7 do
           String.unsafe_set s (i+j) (String.unsafe_get (Obj.magic x) j)
         done
  )

let () =
  let s = "\x3f\xf0\x06\x05\x04\x03\x02\x01" in
  let x = 1.00146962706651288 in
  let y = read_untagged_float64 (Bi_inbuf.from_string s) in
  if x <> y then
    assert false;
  let ob = Bi_outbuf.create 8 in
  write_untagged_float64 ob x;
  if Bi_outbuf.contents ob <> s then
    assert false

vicuna commented Mar 14, 2012

Comment author: @xavierleroy

It's always hard to draw a line between what should go in the OCaml stdlib and what is best done in external libraries. In this particular case, before considering inclusion in the stdlib, I'd like to see the interface and implementation worked out in an external library, perhaps Batteries? or even a specific "fast I/O" library?

vicuna closed this as completed Mar 14, 2012

vicuna commented Mar 14, 2012

Comment author: @mjambon

Agreed. These functions are probably not more useful than most requests for inclusion in the stdlib, but they are harder to maintain because they rely on undocumented features. Let's see what the Batteries team thinks of this.

vicuna commented Mar 19, 2012

Comment author: @mjambon

The question just popped up on Stack Overflow: http://stackoverflow.com/questions/9776245/ocaml-int-to-binary-string-conversion

This is the Batteries-devel thread: https://lists.forge.ocamlcore.org/pipermail/batteries-devel/2012-March/001600.html

vicuna commented Mar 20, 2012

Comment author: meyer

If I can just add my two cents: I have always seen the stdlib as just enough to bootstrap the OCaml system, plus a little bit of convenience. I personally think that is a good balance. However, the line around what counts as this additional convenience is blurry... Just a general thought...

On the other hand, there are some must-have modules that provide basic building blocks, for example for system programming, and that are not needed just to compile OCaml (e.g. the Unix module or Bignum). In these cases I think the stdlib serves the same purpose as the C standard library, for instance: a still lean "standard library".

Just two cents,
Wojciech

vicuna commented Nov 15, 2012

Comment author: warwick

I just wanted to add the opinion that I think it's good if the OCaml standard library does certain things in quite a powerful way, and other things not at all.

So for example the List module should (and does) have powerful features for manipulating lists. Otherwise programmers have to work with basic and extended versions of all the core modules: List and 'ListExtra', Array and 'ArrayExtra', String and 'StringExtra', etc. I think it's confusing and a bit inefficient to have this split between basic and extended modules handling the same data structures.

Then there can be other things that the standard library doesn't do at all, such as data compression, sending e-mail, etc ... and it's clear that you need an external library for these things.

Just my two cents as well!

Warwick

@JasonCreighton

Just wanted to add a clarifying note for future people who (like me) may find this issue through Google:

The requested capability exists in present-day OCaml. Since OCaml 4.08, the Bytes module has had a family of functions with names like "get_int32_le" and "set_uint16_be" to get and set integers of various sizes, signedness, and endianness (the unsigned variants cover 8 and 16 bits only).
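As a small sketch of these accessors (assuming OCaml 4.08 or later; the names `packet`, `stored_word` and `stored_tag` are only illustrative):

```ocaml
(* A 6-byte scratch area, zero-filled. *)
let packet = Bytes.make 6 '\000'

let () =
  (* 32-bit value, least significant byte first, at offset 0. *)
  Bytes.set_int32_le packet 0 0x01020304l;
  (* Unsigned 16-bit value, most significant byte first, at offset 4. *)
  Bytes.set_uint16_be packet 4 0xBEEF

(* Read both values back with the matching getters. *)
let stored_word : int32 = Bytes.get_int32_le packet 0
let stored_tag : int = Bytes.get_uint16_be packet 4
```

These functions check bounds and raise `Invalid_argument` when the offset does not leave room for the whole value.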

In addition, the Buffer module has similar functions to append integers to a Buffer.
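For example, a hypothetical `encode_header` function (the name and the 1-byte-version plus 4-byte-length layout are invented for illustration) could append fixed-width integers to a Buffer like this, again assuming OCaml 4.08 or later:

```ocaml
(* Build a 5-byte header: one unsigned version byte followed by a
   big-endian 32-bit length. The layout is made up for this example. *)
let encode_header (version : int) (length : int32) : string =
  let buf = Buffer.create 5 in
  Buffer.add_uint8 buf version;
  Buffer.add_int32_be buf length;
  Buffer.contents buf
```

Calling `encode_header 1 258l` yields the five bytes `01 00 00 01 02`.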

To handle floats, the Int32 and Int64 modules have functions "bits_of_float" and "float_of_bits" to convert between floating point numbers and their bit representations, which can then be used with the functions in Bytes and Buffer.
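Putting the two pieces together, a minimal sketch of 64-bit float I/O without `Obj` might look as follows. The function names are hypothetical and big-endian order is chosen only to match the test vector in the original report:

```ocaml
(* Write [x] as 8 big-endian bytes (its IEEE 754 bit pattern) at [pos]. *)
let write_float64_be (b : Bytes.t) (pos : int) (x : float) : unit =
  Bytes.set_int64_be b pos (Int64.bits_of_float x)

(* Read 8 big-endian bytes at [pos] and reinterpret them as a float. *)
let read_float64_be (b : Bytes.t) (pos : int) : float =
  Int64.float_of_bits (Bytes.get_int64_be b pos)
```

Because the byte order is stated explicitly in the function names, no runtime endianness probe is needed.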
