[patch] add primitives for directly reading 2, 4 or 8 bytes in strings and char bigarrays #5771
Comments
Comment author: gerd Supporting this. I think we should also add primitives for nativeint, and for plain int (to avoid boxing when going through int32/int64).
Comment author: @chambart There is no need to add a special primitive for int. If you write a function like this: let f s i = Int32.to_int (caml_bigstring_get_32 s i) there will be no allocation, because the compiler avoids boxing the value.
Comment author: @chambart Added a version updated for current trunk.
Comment author: @lefessan Integrated in trunk at revision r13087.
Comment author: @alainfrisch This commit breaks the MSVC port (in str.c and bigarray_stubs.c: it is not allowed to declare variables after the first statement in a function body).
Comment author: @alainfrisch Same question as #5774: what's the point of adding compiler support for those primitives if they are not exposed to user-land code? I understand this can be done in external libraries, but why not do it in stdlib?
Comment author: @lefessan The patch was applied a long time ago. The library ocplib-endian uses these primitives when they are available, and non-optimized primitives otherwise, while still allocating as little as possible in both cases.
Original bug ID: 5771
Reporter: @chambart
Assigned to: @lefessan
Status: closed (set by @xavierleroy on 2015-12-11T18:20:00Z)
Resolution: fixed
Priority: normal
Severity: feature
Fixed in version: 4.01.0+dev
Category: back end (clambda to assembly)
Monitored by: @ygrek @chambart @hcarty
Bug description
This patch provides primitives to speed up code that reads many values from the network or from files. When such reads are implemented with C stubs, a significant share of the time is spent in the calls themselves.
The provided primitives are:
type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t
external caml_bigstring_get_16 : bigstring -> int -> int = "%caml_bigstring_get16"
external caml_bigstring_get_32 : bigstring -> int -> int32 = "%caml_bigstring_get32"
external caml_bigstring_get_64 : bigstring -> int -> int64 = "%caml_bigstring_get64"
external caml_bigstring_set_16 : bigstring -> int -> int -> unit = "%caml_bigstring_set16"
external caml_bigstring_set_32 : bigstring -> int -> int32 -> unit = "%caml_bigstring_set32"
external caml_bigstring_set_64 : bigstring -> int -> int64 -> unit = "%caml_bigstring_set64"
and the equivalent ones on strings:
external caml_string_get_16 : string -> int -> int = "%caml_string_get16"
external caml_string_get_32 : string -> int -> int32 = "%caml_string_get32"
external caml_string_get_64 : string -> int -> int64 = "%caml_string_get64"
external caml_string_set_16 : string -> int -> int -> unit = "%caml_string_set16"
external caml_string_set_32 : string -> int -> int32 -> unit = "%caml_string_set32"
external caml_string_set_64 : string -> int -> int64 -> unit = "%caml_string_set64"
Unsafe versions of the primitives also exist.
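As a usage sketch, the declarations above can be bound under local names and called like ordinary functions. The names string_get_16, string_get_32, bigstring_get_32 and read_int below are illustrative choices, not part of the patch. The example assumes a compiler that provides these primitives and a Bigarray module that is available without extra linking. Multi-byte reads use the platform's native byte order, so the sample data repeats a single byte value to keep the result independent of endianness.

```ocaml
open Bigarray

type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t

(* Local bindings to the compiler primitives (the OCaml-side names are arbitrary). *)
external string_get_16 : string -> int -> int = "%caml_string_get16"
external string_get_32 : string -> int -> int32 = "%caml_string_get32"
external bigstring_get_32 : bigstring -> int -> int32 = "%caml_bigstring_get32"

(* Wrapper in the style suggested in the comments: converting to int
   right away gives the compiler the chance to skip boxing the int32. *)
let read_int (s : string) (i : int) : int = Int32.to_int (string_get_32 s i)

let sample = "\x00\x2a\x2a\x2a\x2a"

(* Offset 1 is not 4-byte aligned; the primitive handles that. *)
let from_string = read_int sample 1

let from_bigstring =
  let b : bigstring = Array1.create char c_layout (String.length sample) in
  String.iteri (fun i c -> Array1.set b i c) sample;
  Int32.to_int (bigstring_get_32 b 1)
```

The safe variants check bounds, so reading four bytes from a string shorter than i + 4 raises Invalid_argument.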
Additional information
Those primitives allow loading values that are not aligned.
On architectures that do not allow unaligned access, this is implemented by loading one byte at a time.
Unaligned access is allowed on x86 and x86-64, but for the other architectures I made the safe guess that it is forbidden. Is that actually the case on Power processors?
In a future version of the patch, I can implement a more efficient unaligned load that requires only two loads and some shifts.
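The two strategies mentioned here can be sketched in OCaml. This illustrates the general techniques and is not the code the patch generates. Both function names are hypothetical, little-endian byte order is assumed, and for the second function memory is modelled as an array of aligned 32-bit words, which is only one plausible reading of "2 loads and some shifts".

```ocaml
(* Strategy 1: assemble a 32-bit little-endian value from four
   single-byte reads. No multi-byte load is issued, so alignment
   never matters. *)
let get_32_le_bytewise (s : string) (i : int) : int32 =
  let byte k = Int32.of_int (Char.code s.[i + k]) in
  Int32.logor
    (Int32.logor (byte 0) (Int32.shift_left (byte 1) 8))
    (Int32.logor (Int32.shift_left (byte 2) 16) (Int32.shift_left (byte 3) 24))

(* Strategy 2: two aligned loads plus shifts.
   words.(k) stands for the aligned word holding bytes 4k .. 4k+3,
   stored little-endian; addr is a byte address. *)
let load32_unaligned (words : int32 array) (addr : int) : int32 =
  let k = addr lsr 2 in
  let shift = (addr land 3) * 8 in
  if shift = 0 then words.(k) (* already aligned: a single load suffices *)
  else
    (* The low part comes from the upper bytes of the first word,
       the high part from the lower bytes of the next word. *)
    let lo = Int32.shift_right_logical words.(k) shift in
    let hi = Int32.shift_left words.(k + 1) (32 - shift) in
    Int32.logor lo hi
```

With the words 0x44332211l and 0x08776655l, which stand for the byte sequence 11 22 33 44 55 66 77 08, a load at byte address 1 combines the top three bytes of the first word with the bottom byte of the second.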
File attachments