New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate Stream and Genlex #6711
Comments
Comment author: @gasche I think this is a good idea. |
Comment author: @damiendoligez +1, but not on a minor release. Let's do it for 4.03. |
Comment author: @garrigue I'm using stream parsers in lablgtk, and the plan is to provide a preprocessed version of the source file, to allows compilation without camlp4. However, if we remove Stream and Genlex, this becomes impossible, since the preprocessed code will still depend on them. |
Comment author: @whitequark @garrigue, since lablgtk is an external library, it could simply depend on Stream/Genlex as another library. This will even be seamless with OPAM. |
Comment author: @johnwhitington Whilst I use Genlex, I think deprecating it in the greater cause of getting rid of Stream is fine. When I use Genlex, it's normally for little parsers for simple file formats where it would be too tedious to define one's own lexer and parser, and for which the lexical conventions of the file format happen to coincide with OCaml's. For example, this parser for Adobe Font Metrics files: https://github.com/johnwhitington/camlpdf/blob/master/pdfafm.ml I normally load a line at a time from the file, so the incremental nature of streams is not really being taken advantage of. It would be easy to write a simple Genlex module for OPAM which just knows how to lex from strings (this is what I will do if Genlex is removed), as well as keeping in OPAM a Stream version of Genlex identical to the current one for compatibility. |
Comment author: @alainfrisch "ocaml.deprecated" attributes can now be specified on compilation units (in the form of a floating attribute [@@@ocaml.deprecated "..."] before any other non-attribute item in the .mli). This would allow to mark Stream and Genlex, by undoing: @doligez You proposed to do it for 4.03. Do you confirm? |
Comment author: @alainfrisch Ping? Has anyone an opinion on whether we should mark these modules as deprecated now? |
Comment author: @gasche I worry that we lack an alternative to propose to Genlex users. Even for Stream, what would be recommend using instead? There are several libraries out there with various kind of lazy streams, so I'm sure people (if a bit puzzled at first) will find something they like (for example Simon's iterators+generators). For Genlex, what is our replacement solution? Build something on top of sedlex? Could we maybe say that we will actively deprecate it as soon as there exists another library we can recommend our users, that is adequately documented (I mean there are usage examples comparable in scope to what Genlex does easily)? |
Comment author: @alainfrisch @Drup mentions that Genlex is mostly useful for camlp4 users. But both @garrigue and @johnwhitington reported uses outside of this universe.
One would need to check with Xavier that moving these two modules to an external package would be ok, license-wise. Assuming this is ok, would someone volunteer to do it? |
Comment author: @gasche As long as we keep the modules in the standard library (which we should for obvious compatibility reason), an external package can be as simple as: include Genlex[@warning "-3"] and in particular there are no licensing needs. However, I'm not sure what is the purpose of deprecating Genlex if it is to encourage users to use instead the same module as an external dependency. I would vote to move the target version of this bug to 4.04. We have enough deprecated warnings already, and the dynamics of deprecation are not very well understood yet, maybe we can wait six more months to add more oil on the grill. |
Comment author: @alainfrisch Agreed. |
Comment author: @whitequark
Same as with camlp4: deprecate with intent to remove. No? |
Comment author: @gasche My understanding of this deprecation request is a bit different from Camlp4. The point is not maintainers hoping to decrease maintenance load (that's just an extra: it is true that Stream is a difficult codebase, but it hasn't moved much in years), but rather the idea that the implementation is too complex for its own good and that most users would probably be better advised to pick something else. Whether or not it remains in the compiler distribution is orthogonal to the question of which alternative to chose if we decide to indeed deprecate it¹. ¹: another approach would be to say that it has worked for people for years and the fact that less convenient but more solution may exist is not strong enough a reason to deprecate. I think there is a difference between "we would not advise new users to start writing code using this tool" (indeed, I would rather encourage new users to look at sedlex+menhir for their parsing needs, and whatever lazy list library they like instead of Stream) and "we want to show tons of painful warnings to existing codebases using this to force users to move to something else". I think that Stream would be a fine target for the first option, but it is not clear to me that the second is warranted. Given that adjustable deprecation warnings did not exist at the time Drup made his feature request, it is not clear which of the two variants he had in mind. It might be simply a case of submitting a pull request against Stream and Genlex .mli documentation, to add a sentence at the top saying "we discourage these modules for new uses, rather have a look at libraries Foo and Blah instead", and we're done. |
Comment author: @Drup I clearly had the first version in mind, but I'm not sure an addition to the documentation wouldn't be just ignored by every user. :) |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
Still an issue. |
Random thought: what about getting rid of |
I would still advocate for a solution where existing stream parsers still work (maybe using a small external library). On the other hand, having a ppx frontend to stream parsers would be even better for me, removing the dependency on camlp5 for preprocessing. Small syntactic changes maybe ok, but please, do not force using a different semantics for parsers. |
Genlex is a generic lexer, not a lexer generator, so it's not really comparable with ocamllex and sedlex, even modulo a ppx syntax. Genlex is useful to provide a lexer for small languages that have roughly the same lexical conventions as OCaml. These can be small languages for teaching (genlex's original uses), or DSLs. The current interface uses streams, because it was designed to work with stream parsers (from Caml Light, then provided by Camlp4/Camlp5), but we could add interfaces to produce sequences (type This doesn't mean that Genlex must absolutely be in the standard library; just that it is useful independently of Stream and of Camlp4/Camlp5. On the contrary, Stream has essentially no legitimate uses outside of Camlp4/Camlp5 stream parsers. The imperative nature is questionable, the implementation is tricky, and a simple sequence or lazy list is generally preferable. |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
camlp-streams and #10482 are the first step in fixing this. |
Addressed by #10482. |
Original bug ID: 6711
Reporter: @Drup
Status: acknowledged (set by @gasche on 2014-12-13T23:35:30Z)
Resolution: open
Priority: normal
Severity: feature
Target version: undecided
Category: standard library
Related to: #7064
Monitored by: @whitequark @Drup @gasche @yallop @dbuenzli @yakobowski @alainfrisch
Bug description
I propose to deprecate the Stream library, here are some reasons:
The implementation is complex, not very well documented, uses Obj.magic, Obj.set_fields and other arcane functions.
Most complementary functions provided on top of streams are implemented using functions that are considered "internal" (see https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/batStream.mlv). It's a sign that the public interface is not enough and one must use internal functions to do interesting things with steams.
It's not even a good implementation (slow, delicate semantics due to over-complicated implementation, lack of auxiliary functions and difficulties to implement them due to point 2).
It was used in camlp4, but it's not used at all in the compiler now (afaict).
There are much better alternatives (gen, BatSeq, Core.Sequence ...) that are now easily accessible, cover the same needs and with a simpler semantic. It should be advertised to newcomers that Stream is not a very good solution.
(About the "slow" argument, someone asked for details: I did a performance comparison some times ago. It was not intended as a very clean and general purpose benchmark, but it gives an idea of how implementation fares, and Stream doesn't fare well, to say the least. See this thread https://lists.forge.ocamlcore.org/pipermail/batteries-devel/2014-January/001917.html)
This also means deprecating Genlex. The documentation still mentions camlp4 and I think the module is pretty useless without it. As with Stream, there are other, better ways to define tokenizers (ocamllex, sedlex, ...). It could be put outside the compiler if need be, with its camlp4 extension.
The text was updated successfully, but these errors were encountered: