Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving marshaling of big values on 64 bit platforms. #6910

Closed
vicuna opened this issue Jun 19, 2015 · 5 comments
Closed

Improving marshaling of big values on 64 bit platforms. #6910

vicuna opened this issue Jun 19, 2015 · 5 comments
Milestone

Comments

@vicuna
Copy link

vicuna commented Jun 19, 2015

Original bug ID: 6910
Reporter: braibant
Status: closed (set by @xavierleroy on 2017-02-16T14:16:31Z)
Resolution: fixed
Priority: normal
Severity: minor
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: standard library
Monitored by: @yakobowski

Bug description

In byterun/extern.c, the extern_value function does not handle
large values whose size cannot be written in the header, and may
fail with "output_value: object too big". (From the code, I
gather that the header is made of (4) 32 bit words, hence it's
not possible to serialize object whose size is bigger than that.)

On the other hand, the Marhsal module contains an explicit flag
to ensure 32 bit compatibility on integers (type int). When
this flag is set, each int must fall in the range 32 bit range
[-2^30 -- 2^30 - 1] to ensure that the serialized object can be
unmarshalled on a 32 bit platform. This does not affect the size
of the object, which must fit on a 32 bit word for compatibility.

When using OCaml on a 64 bit platform, this can be a limitation,
superfluous if one is not interested in compatibility of the
output values with 32 bit platforms.

I wonder to what extent it would be acceptable to propose a flag
for Marshal that would drop the 32 bit compatibility on 64 bit
platforms. (This would be a dual of Compat_32.)

There is a potential issue, when unmarshaling values on a 64 bit
platform: one need to be able to distinguish the old headers and
the No_compat_32 headers. (I must confess that I do not know
whether Marshalled values are compatible from one version of
OCaml to the other and how hard it would be to change the header
format.)

@vicuna
Copy link
Author

vicuna commented Jun 19, 2015

Comment author: @damiendoligez

We could change the marshal format to have 64-bit words in the header, check that they are less than 2^32 when demarshaling on a 32-bit machine, and make the Compat_32 flag check that too.

That would mean no API change, but a format change. The format is not guaranteed to be compatible across OCaml versions, but there is always some unhappiness when we change it.

On the other hand, these days there are several stable-format serializers for OCaml (binio, sexplib, json) so changing the marshal format would probably not impact too many people.

@vicuna
Copy link
Author

vicuna commented Jun 22, 2015

Comment author: @xavierleroy

Applications that use marshaling for small data packets (e.g. the Ensemble protocol stack, if it still exists), might object to the increase in size of the header.

One way to have our cake and eat it too is the following:

  • If marshaled data fits in 2^32 bytes, use the old header format (with 32-bit sizes) and the old magic number.
  • Otherwise, if Compat_32 is set, fail.
  • Otherwise, use a new header format (with 64-bit sizes) and a new magic number.

This would complicate a bit the logic in byterun/extern.c, as the header size os not known until the data has been marshaled.

Also, in byterun/caml/intext.h, we will need new "codes": CODE_SHARED64, CODE_STRING64, CODE_DOUBLE_ARRAY64_{BIG,LITTLE}. We have enough free opcodes, though.

@vicuna
Copy link
Author

vicuna commented Jun 22, 2015

Comment author: @damiendoligez

We'll also have to hack something around Marshal.header_size, probably by making sure that the data size is stored at the beginning of the header in the new format.

@vicuna
Copy link
Author

vicuna commented Aug 3, 2015

Comment author: @xavierleroy

See proposed implementation and discussion at #224

@vicuna
Copy link
Author

vicuna commented Nov 28, 2015

Comment author: @xavierleroy

GPR 224 was merged, this will be in 4.03.

@vicuna vicuna closed this as completed Feb 16, 2017
@vicuna vicuna added the stdlib label Mar 14, 2019
@vicuna vicuna added this to the 4.03.0 milestone Mar 14, 2019
@vicuna vicuna added the bug label Mar 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant