Suggestion to add the "bytes" type string constants #7797
Comments
Comment author: @gasche
If we had bytes literals, the only safe general way to give them a meaning would be to define your proposed b"foo" as (Bytes.of_string "foo").
The reason why we would need a copy in the general case is that string literals in the code are allocated globally for the module, and are thus shared, while bytes cannot be shared, so each occurrence of a bytes literal must result in a fresh allocation. For example, consider
for i = 0 to n do f "foo" done
With string literals, "foo" is a piece of module-global data, and all invocations of (f) get a pointer to the same string. With a bytes literal b"foo", doing the same thing would be deeply unsound: if the first iteration of (f) modifies the string to "bar", you don't want the next iteration to be invoked with "bar" instead of "foo" as an argument. This is a bug that you could actually observe with older OCaml versions using mutable strings. So you need to allocate a fresh copy of the literal b"foo", and there is no faster way than to allocate the memory and then do a blit from a global string constant, which is exactly what (Bytes.of_string "foo") would do. (It would be possible to optimize slightly by statically computing the length of the new string to be allocated, but we could do this optimization for all applications of Bytes.of_string to a string literal.)
I argued that in the general case you always need to allocate a fresh value for a bytes literal, but note that (some_var = "foo") is a special case: there you don't actually need to allocate a new string, because no mutation will occur as part of the equality test. So you could actually write (Bytes.unsafe_to_string some_var = "foo"), which performs no copy, and which I find nicer than casting "foo" to bytes, as temporarily pretending that things are immutable is nicer than the other way around.
It's easy for the user to write this. The compiler could figure out by itself that a Bytes.of_string operation on "foo" is not necessary here and rewrite it to the unsafe, copy-free version, but then it could do this in all cases of comparison with strings, not just those arising from a bytes literal. Again, bytes literals bring you no benefit.
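The sharing problem and the copy-free comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the function f below is a hypothetical stand-in for the (f) of the loop example, and the names fresh_each_time, shared_buffer and is_foo are invented for this sketch.

```ocaml
(* Hypothetical stand-in for (f): it returns the contents it saw on entry,
   then overwrites its 3-byte argument with "bar" in place. *)
let f (b : bytes) : string =
  let seen = Bytes.to_string b in
  Bytes.blit_string "bar" 0 b 0 3;
  seen

(* A fresh copy per call: both calls see "foo". *)
let fresh_each_time () : string * string =
  let first = f (Bytes.of_string "foo") in
  let second = f (Bytes.of_string "foo") in
  (first, second)

(* One shared buffer, mimicking a globally allocated bytes literal:
   the second call sees the mutation made by the first. *)
let shared_buffer () : string * string =
  let shared = Bytes.of_string "foo" in
  let first = f shared in
  let second = f shared in
  (first, second)

(* Copy-free comparison: view the bytes as a string for the duration of the
   test only. Nothing mutates [b] meanwhile and the string view does not
   escape this function. *)
let is_foo (b : bytes) : bool =
  String.equal (Bytes.unsafe_to_string b) "foo"
```

Here fresh_each_time () evaluates to ("foo", "foo") while shared_buffer () evaluates to ("foo", "bar"), which is the unsoundness a shared bytes literal would introduce. The use of Bytes.unsafe_to_string in is_foo relies on the bytes value not being modified while the resulting string is in use.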
Comment author: @xavierleroy There are no array literals, strictly speaking. There is a construct [|e1; ...; eN|] that builds an array by enumeration from the values of the expressions e1, ..., eN. When all the expressions are compile-time constants, the compiler tries to implement this construct more efficiently, by copying a statically allocated array, as @gasche mentioned. The equivalent for bytes would be a construct that builds a byte sequence of length N from N expressions of type char. This is not what this feature request is about. I agree with @gasche that the use cases for bytes literals are probably too few to justify special syntax and semantics. It could make sense, however, to add some "mixed" bytes-and-strings operations to the Bytes module, such as "compare a bytes with a string".
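As a rough sketch of what such a "mixed" operation might look like in user code: the name bytes_equal_string is hypothetical and is not presented here as part of the Bytes module. The second definition only illustrates that an array expression built from constants still yields a fresh, independently mutable array on each evaluation.

```ocaml
(* Hypothetical helper: true when [b] and [s] have the same length and the
   same characters. It compares in place, without allocating or copying. *)
let bytes_equal_string (b : bytes) (s : string) : bool =
  let n = Bytes.length b in
  let rec same_from i =
    i >= n || (Bytes.get b i = String.get s i && same_from (i + 1))
  in
  n = String.length s && same_from 0

(* Each call evaluates the array expression again and returns a new array,
   so mutating one result does not affect another. *)
let make_array () : int array = [| 1; 2; 3 |]

let arrays_are_independent () : bool =
  let a = make_array () in
  let b = make_array () in
  a.(0) <- 0;
  b.(0) = 1
```

With a helper of this kind, the test from the bug description could be written as bytes_equal_string some_bytes_var "\x00", with no intermediate bytes value.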
Comment author: @xclerc
I misinterpreted the request as "a construct that builds a byte sequence of length N from the N characters of the quoted literal".
Re-reading this discussion, I am convinced that
So, I'll go ahead and close this issue.
Original bug ID: 7797
Reporter: xvilka
Status: acknowledged (set by @xavierleroy on 2018-05-21T10:08:56Z)
Resolution: open
Priority: normal
Severity: feature
Version: 4.06.1
Category: language features
Monitored by: @nojb @gasche
Bug description
Right now OCaml has support for string literals, but with the clear distinction between bytes and strings there is a need for literals of type "bytes". For example, when you need to compare:
if some_bytes_var = "\x00"
Since OCaml 4.06 this won't work anymore, and Bytes.of_string "\x00" would be overkill and an unneeded copy. What I am suggesting is to introduce a new kind of literal for bytes, with a syntax like b"\x00" or similar.
So instead of
if some_bytes_var = (Bytes.of_string "\x00")
it would be possible to write
if some_bytes_var = b"\x00"