New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ocaml REPL does not truncate value of type 'bytes' and 'string' #7127
Comments
Comment author: @gasche Indeed: if you do String.init 10_000 (fun i -> char_of_int (i mod 26 + int_of_char 'A'));; you get a rather long string shown in the toplevel, whereas Array.init 10_000 (fun i -> char_of_int (i mod 26 + int_of_char 'A'));;
cuts the array representation after the first few elements. It would be consistent to summarize the string using the same threshold as well. |
Comment author: junsli I am working on this bug now. |
Comment author: junsli Hi, How do you think this patch? The ellipsis is printed for Array because iteration reaches the max (max_printer_steps). And toplevel prints string using print_string. This patch refactors max_printer_steps, and defines max_string_length to be the threshold for both. Thanks. |
Comment author: @gasche The patch is ok, but it is a bit frustrating that you have to plumb the steps parameter this way. Another thing you could try is to remove the string printer from the list of default printers, and instead have it be another case of the catch-all case where Array is already handled, and the "steps" information in accessible. Would you be motivated enough to maybe try this as well, so that it is possible to compare the two approaches? (One advantage of the latter is that the information would take into account the steps already taken, not just a global fixed constant.) While reading your patch and the codebase, I wondered how bytes is handled when -safe-string is set. Do you know what happens? How come we don't need to have separate printers for string and bytes? |
Comment author: junsli Thanks gasche. I agree with you; I feel...odd about this solution. I'll try your suggestion. I have never used -safe-string before. Top-level gives me on Bytes with -safe-string. I'll see what's going on here. |
Comment author: @gasche Indeed, bytes are not printed when -safe-string is passed, because the type-directed lookup does not find anything for them. Whatever solution we end up choosing for restricting string size, it would be nice to have a second commit to support Bytes printing as well. With the current patch, that would just mean adding another simple printer just for bytes; if the suggested approach works, you can test for the equality of the type path against either Predef.ident_string or Predef.ident_bytes. |
Comment author: @gasche (Oh, and while you are at it, if your pseudonym is linked to a full name anyway, then it would be better to use your full name as an author name in the commit messages; you can use "git commit --amend --author=" to amend a commit's author name.) |
Comment author: junsli Thank you! I'll report back. |
Comment author: junsli OK. Simple fix. I simply remove the default string printer and print strings (and bytes) when it fails to find the default one. Without -safe-string, bytes is just an alias of string types, so printing worked before. With -safe-string, as a new type, bytes did not have a default printer, so it was printed as . To replace the literal "..." with Oval_ellipsis, we have to either change type Oval_string to "Oval_string of Oval_value list" or "Oval_string of string * bool" (bool indicates whether string is truncated) to encode the Oval_ellipsis. I think they are over designed and also put unnecessary overhead during constructing and printing Oval_string values. what do you think? |
Comment author: junsli Two concerns.
|
Comment author: @gasche I think the second patch is better because it is simpler. I wouldn't worry about Oval_ellipsis, as the solutions are too complex, but it could be nice to have a comment at this point to explain why "..." is used instead.
I would guess that it's the same argument as for unlimited printing in the general case: if the output is too long, it becomes harder to read.
There would not be a memory leak with the current implementation, but you could observe a difference if the bytes got mutated between this step and the final printing step. I like the fact of having two different code paths instead of relying on the same-representation detail to get shorter code. However, the current way you first Bytes.to_string and then String.sub is suboptimal (it will copy the bytes data in full, to sub it later), I think it would make sense to reduplicate the length-testing logic and use Bytes.sub_string in the bytes path. |
Comment author: junsli Thanks for your comment Gabriel. The change breaks tests/typing-extension-constructor/test.ml. I saw your editing history on this file. You may know what's going on. The test won't fail as long as strings are printed using print_string. The failing test has nothing to do with the truncation or bytes.
Agreed. Awesome. I'll update the code accordingly, and send a pull request on github. Thanks for your time. |
Comment author: junsli For future reference: #454. |
Comment author: @damiendoligez |
Original bug ID: 7127
Reporter: xyh
Assigned to: @lefessan
Status: resolved (set by @gasche on 2017-06-04T22:51:23Z)
Resolution: fixed
Priority: normal
Severity: feature
Version: 4.02.3
Fixed in version: 4.06.0 +dev/beta1/beta2/rc1
Category: toplevel
Tags: patch, junior_job
Monitored by: junsli @gasche @diml @hcarty
Bug description
ocaml REPL does not truncate value of type 'bytes' and 'string'
File attachments
The text was updated successfully, but these errors were encountered: