New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer.add_channel is unusable #6719
Comments
Comment author: gerd I'd say Buffer.add_channel isn't flawed because it raises an exception for a short read. There are numerous protocols where the number of bytes are prepended to payload data, and add_channel works fine for these. A short read is always an error in this case. The problem reported by c-cube sounds more like a feature request for a second channel reading function that also appends short reads to the buffer. |
Comment author: @damiendoligez I'd say it's a good feature request. Discarding data is almost never the right thing to do. Here is what I propose: change the documentation to say "reads at least [n] characters" instead of "reads exactly [n] characters", and change the implementation to put the bytes in the buffer instead of discarding them. But still raise End_of_file if there aren't enough bytes. |
Comment author: @c-cube sounds ok. I'll submit a PR. |
Comment author: @hcarty How would you identify the bytes making up the partial request? It may be necessary to specially handle an incomplete read. |
Comment author: @gasche Another option would be to add add_channel_full or add_full_channel that does what c-cube wants in the first place. It can be done instead of changing add_channel, or in conjunction with it. hcarty: if you know you need this, it's possible to workaround the fact that add_channel cannot return the number of read bytes by storing the length of the buffer before and after the request, and computing the difference. What I like with Damien's suggestion is that it strikes a subtle balance between preserving backward-compatibility (the behavior is almost unchanged) while satisfying a new use-case. I agree (with hcarty) that if we were doing it from scratch, we would probably have a different interface¹, but there is an art in evolving APIs that is admirable here. ¹: namely, "add_channel" that returns the number of bytes read, and raises End_of_file only if it knows the channel ends, and "add_channel_really" with the current interface. |
Comment author: @hcarty gasche: That's true - I was hoping for something cleaner, but as you said there is an art to API evolution. |
Comment author: @damiendoligez FTR I wrote "at least" when I meant "at most". You did it right in the GPR (#129). |
Comment author: @gasche Fixed using Damien's proposed behavior, thanks to c-cube's patch. |
Original bug ID: 6719
Reporter: @c-cube
Assigned to: @gasche
Status: closed (set by @xavierleroy on 2016-12-07T10:47:32Z)
Resolution: fixed
Priority: normal
Severity: feature
Fixed in version: 4.03.0+dev / +beta1
Category: standard library
Tags: junior_job
Has duplicate: #7136
Monitored by: @hcarty
Bug description
Buffer.add_channel behavior is flawed. In case the channel contains less than [n] bytes (where [n] is the last parameter to [Buffer.add_channel buf chan n]), some data is consumed but not read, and lost forever if the channel isn't seekable.
This makes it almost unusable in my opinion (and it looks like it's not used in trunk, as "grep -r Buffer.add_channel" shows).
Steps to reproduce
1/ echo "hello" > /tmp/some_file
2/ utop
let buf = Buffer.create 256 in
let ic = open_in "/tmp/some_file" in
(try Buffer.add_channel buf ic 256 with End_of_file -> ());
Buffer.contents buf
We obtain an empty string. In practice, unless we need how much bytes to read exactly beforehand, we just consume data from the in_channel and discard it. It makes writing a simple procedure to read all data from a channel into a Buffer impossible (except by reading char by char).
The text was updated successfully, but these errors were encountered: