Add Gc.value_{word,byte}_size : 'a -> int #6829

Closed
vicuna opened this issue Apr 2, 2015 · 4 comments

vicuna commented Apr 2, 2015

Original bug ID: 6829
Reporter: @dbuenzli
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2016-12-06T15:28:38Z)
Resolution: fixed
Priority: normal
Severity: feature
Version: 4.02.1
Fixed in version: 4.04.0
Category: standard library
Monitored by: @diml @hcarty @dbuenzli

Bug description

It would be nice to have a function in the standard library that returns the size of an arbitrary value. This is useful when trying alternate encodings and/or data structures, e.g. to minimise the in-memory representation of a particular data set.

For now, to get a quick estimate, I sometimes simply marshal the value to a string, or resort to ad-hoc functions that perform the count on the data structure at hand, e.g.:

https://github.com/dbuenzli/uucp/blob/master/src/uucp_cmap.ml#L57-L63
https://github.com/dbuenzli/uucp/blob/master/src/uucp_rmap.ml#L60-L66
https://github.com/dbuenzli/uucp/blob/master/src/uucp_tmap.ml#L39-L56
https://github.com/dbuenzli/uucp/blob/master/src/uucp_tmapbool.ml#L53-L65

But could we maybe instead have something like Jean-Christophe Filliatre's size functions [1] directly in the Gc module?

Thanks,

Daniel

[1] https://www.lri.fr/~filliatr/ftp/ocaml/ds/size.ml.html
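For readers unfamiliar with the marshalling workaround mentioned above, here is a minimal sketch of what it might look like. The name `marshalled_byte_size` is hypothetical. The result is the length of the serialized encoding, which only approximates the heap footprint, and `Marshal.to_string` with an empty flag list raises on closures and on some abstract values.

```ocaml
(* Hypothetical helper: number of bytes in the marshalled form of [v].
   This measures the serialized encoding, not the in-memory size. *)
let marshalled_byte_size (v : 'a) : int =
  String.length (Marshal.to_string v [])

let () =
  let small = [1; 2; 3] in
  let large = List.init 1000 (fun i -> i) in
  Printf.printf "small: %d bytes, large: %d bytes\n"
    (marshalled_byte_size small) (marshalled_byte_size large)
```

Because the marshal format stores small integers and short strings compactly, the figure can be well below the actual heap size, so it is mostly useful for comparing two candidate representations of the same data.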

vicuna commented Apr 6, 2015

Comment author: @ygrek

See also http://www.bytebucket.org/gds/objsize.git

vicuna commented Jun 11, 2015

Comment author: @damiendoligez

Any implementation will basically be a rewrite of the Marshal functions, so we probably should just modify the marshalling primitives to have an option to output only the size.

vicuna commented Jun 15, 2015

Comment author: @alainfrisch

I think it would be cleaner to create a new function instead. The body of the marshal function, i.e. extern_rec in byterun/extern.c, is quite small, and if we keep only the logic to count the size, it would be even smaller (e.g. we don't need to handle the various cases for "small integers" or "short strings", and one can merge the cases String_tag/Double_tag/Double_array_tag/Custom_tag).
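To illustrate the idea of a size-only traversal, here is a rough sketch written in OCaml with the `Obj` module rather than in C. It is not the code from extern.c or from any pull request; the names `Seen`, `reachable_words_sketch` and `reachable_bytes_sketch` are hypothetical. All no-scan blocks (strings, floats, float arrays, custom blocks) fall into one case via `Obj.no_scan_tag`, as suggested above. As simplifying assumptions, closures are counted but not traversed, out-of-heap pointers are not handled, and the recursion is not tail-recursive, so very deep structures may overflow the stack.

```ocaml
(* Table of already-visited blocks, keyed by physical identity.
   The hash is structural and bounded, so it does not depend on block
   addresses and stays valid if the GC moves blocks during the walk. *)
module Seen = Hashtbl.Make (struct
  type t = Obj.t
  let equal = ( == )
  let hash = Hashtbl.hash
end)

(* Hypothetical sketch: words (headers included) of all blocks reachable
   from [v], counting shared blocks once. *)
let reachable_words_sketch (v : 'a) : int =
  let seen = Seen.create 64 in
  let total = ref 0 in
  let rec visit (o : Obj.t) =
    if Obj.is_block o && not (Seen.mem seen o) then begin
      Seen.add seen o ();
      let tag = Obj.tag o in
      (* One header word plus the block's fields. *)
      total := !total + 1 + Obj.size o;
      (* Only scannable blocks contain pointers to follow; closures are
         skipped because their fields include code pointers. *)
      if tag < Obj.no_scan_tag
         && tag <> Obj.closure_tag
         && tag <> Obj.infix_tag
      then
        for i = 0 to Obj.size o - 1 do
          visit (Obj.field o i)
        done
    end
  in
  visit (Obj.repr v);
  !total

(* Same figure in bytes, using the platform word size. *)
let reachable_bytes_sketch v = reachable_words_sketch v * (Sys.word_size / 8)

let () =
  let shared = ref 0 in
  Printf.printf "list: %d words, pair with sharing: %d words\n"
    (reachable_words_sketch [1; 2; 3])
    (reachable_words_sketch (shared, shared))
```

In real code, `Obj.reachable_words` from the standard library (available since OCaml 4.04) is likely the better choice; the sketch above is only meant to show how little logic a counting-only traversal needs.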

vicuna commented Apr 25, 2016

Comment author: @dbuenzli

GPR #560
