Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Stream and Genlex #6711

Closed
vicuna opened this issue Dec 12, 2014 · 22 comments
Closed

Deprecate Stream and Genlex #6711

vicuna opened this issue Dec 12, 2014 · 22 comments

Comments

@vicuna
Copy link

vicuna commented Dec 12, 2014

Original bug ID: 6711
Reporter: @Drup
Status: acknowledged (set by @gasche on 2014-12-13T23:35:30Z)
Resolution: open
Priority: normal
Severity: feature
Target version: undecided
Category: standard library
Related to: #7064
Monitored by: @whitequark @Drup @gasche @yallop @dbuenzli @yakobowski @alainfrisch

Bug description

I propose to deprecate the Stream library, here are some reasons:

  1. The implementation is complex, not very well documented, uses Obj.magic, Obj.set_fields and other arcane functions.

  2. Most complementary functions provided on top of streams are implemented using functions that are considered "internal" (see https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/batStream.mlv). It's a sign that the public interface is not enough and one must use internal functions to do interesting things with steams.

  3. It's not even a good implementation (slow, delicate semantics due to over-complicated implementation, lack of auxiliary functions and difficulties to implement them due to point 2).

  4. It was used in camlp4, but it's not used at all in the compiler now (afaict).

  5. There are much better alternatives (gen, BatSeq, Core.Sequence ...) that are now easily accessible, cover the same needs and with a simpler semantic. It should be advertised to newcomers that Stream is not a very good solution.

(About the "slow" argument, someone asked for details: I did a performance comparison some times ago. It was not intended as a very clean and general purpose benchmark, but it gives an idea of how implementation fares, and Stream doesn't fare well, to say the least. See this thread https://lists.forge.ocamlcore.org/pipermail/batteries-devel/2014-January/001917.html)

This also means deprecating Genlex. The documentation still mentions camlp4 and I think the module is pretty useless without it. As with Stream, there are other, better ways to define tokenizers (ocamllex, sedlex, ...). It could be put outside the compiler if need be, with its camlp4 extension.

@vicuna
Copy link
Author

vicuna commented Dec 13, 2014

Comment author: @gasche

I think this is a good idea.

@vicuna
Copy link
Author

vicuna commented Jan 9, 2015

Comment author: @damiendoligez

+1, but not on a minor release. Let's do it for 4.03.

@vicuna
Copy link
Author

vicuna commented Jan 17, 2015

Comment author: @garrigue

I'm using stream parsers in lablgtk, and the plan is to provide a preprocessed version of the source file, to allows compilation without camlp4. However, if we remove Stream and Genlex, this becomes impossible, since the preprocessed code will still depend on them.
Just wanted to mention it.

@vicuna
Copy link
Author

vicuna commented Jan 17, 2015

Comment author: @whitequark

@garrigue, since lablgtk is an external library, it could simply depend on Stream/Genlex as another library. This will even be seamless with OPAM.

@vicuna
Copy link
Author

vicuna commented Aug 25, 2015

Comment author: @johnwhitington

Whilst I use Genlex, I think deprecating it in the greater cause of getting rid of Stream is fine.

When I use Genlex, it's normally for little parsers for simple file formats where it would be too tedious to define one's own lexer and parser, and for which the lexical conventions of the file format happen to coincide with OCaml's. For example, this parser for Adobe Font Metrics files:

https://github.com/johnwhitington/camlpdf/blob/master/pdfafm.ml

I normally load a line at a time from the file, so the incremental nature of streams is not really being taken advantage of.

It would be easy to write a simple Genlex module for OPAM which just knows how to lex from strings (this is what I will do if Genlex is removed), as well as keeping in OPAM a Stream version of Genlex identical to the current one for compatibility.

@vicuna
Copy link
Author

vicuna commented Dec 2, 2015

Comment author: @alainfrisch

"ocaml.deprecated" attributes can now be specified on compilation units (in the form of a floating attribute [@@@ocaml.deprecated "..."] before any other non-attribute item in the .mli). This would allow to mark Stream and Genlex, by undoing:

alainfrisch@82702c6

@doligez You proposed to do it for 4.03. Do you confirm?

@vicuna
Copy link
Author

vicuna commented Dec 9, 2015

Comment author: @alainfrisch

Ping? Has anyone an opinion on whether we should mark these modules as deprecated now?

@vicuna
Copy link
Author

vicuna commented Dec 9, 2015

Comment author: @gasche

I worry that we lack an alternative to propose to Genlex users. Even for Stream, what would be recommend using instead? There are several libraries out there with various kind of lazy streams, so I'm sure people (if a bit puzzled at first) will find something they like (for example Simon's iterators+generators). For Genlex, what is our replacement solution? Build something on top of sedlex?

Could we maybe say that we will actively deprecate it as soon as there exists another library we can recommend our users, that is adequately documented (I mean there are usage examples comparable in scope to what Genlex does easily)?

@vicuna
Copy link
Author

vicuna commented Dec 11, 2015

Comment author: @alainfrisch

@Drup mentions that Genlex is mostly useful for camlp4 users. But both @garrigue and @johnwhitington reported uses outside of this universe.

It would be easy to write a simple Genlex module for OPAM which just knows how > to lex from strings (this is what I will do if Genlex is removed), as well as
keeping in OPAM a Stream version of Genlex identical to the current one for
compatibility.

One would need to check with Xavier that moving these two modules to an external package would be ok, license-wise. Assuming this is ok, would someone volunteer to do it?

@vicuna
Copy link
Author

vicuna commented Dec 11, 2015

Comment author: @gasche

As long as we keep the modules in the standard library (which we should for obvious compatibility reason), an external package can be as simple as:

include Genlex[@warning "-3"]

and in particular there are no licensing needs.

However, I'm not sure what is the purpose of deprecating Genlex if it is to encourage users to use instead the same module as an external dependency.

I would vote to move the target version of this bug to 4.04. We have enough deprecated warnings already, and the dynamics of deprecation are not very well understood yet, maybe we can wait six more months to add more oil on the grill.

@vicuna
Copy link
Author

vicuna commented Dec 11, 2015

Comment author: @alainfrisch

Agreed.

@vicuna
Copy link
Author

vicuna commented Dec 11, 2015

Comment author: @whitequark

However, I'm not sure what is the purpose of deprecating Genlex if it is to encourage users to use instead the same module as an external dependency.

Same as with camlp4: deprecate with intent to remove. No?

@vicuna
Copy link
Author

vicuna commented Dec 11, 2015

Comment author: @gasche

My understanding of this deprecation request is a bit different from Camlp4. The point is not maintainers hoping to decrease maintenance load (that's just an extra: it is true that Stream is a difficult codebase, but it hasn't moved much in years), but rather the idea that the implementation is too complex for its own good and that most users would probably be better advised to pick something else. Whether or not it remains in the compiler distribution is orthogonal to the question of which alternative to chose if we decide to indeed deprecate it¹.

¹: another approach would be to say that it has worked for people for years and the fact that less convenient but more solution may exist is not strong enough a reason to deprecate.

I think there is a difference between "we would not advise new users to start writing code using this tool" (indeed, I would rather encourage new users to look at sedlex+menhir for their parsing needs, and whatever lazy list library they like instead of Stream) and "we want to show tons of painful warnings to existing codebases using this to force users to move to something else". I think that Stream would be a fine target for the first option, but it is not clear to me that the second is warranted.

Given that adjustable deprecation warnings did not exist at the time Drup made his feature request, it is not clear which of the two variants he had in mind. It might be simply a case of submitting a pull request against Stream and Genlex .mli documentation, to add a sentence at the top saying "we discourage these modules for new uses, rather have a look at libraries Foo and Blah instead", and we're done.

@vicuna
Copy link
Author

vicuna commented Dec 11, 2015

Comment author: @Drup

I clearly had the first version in mind, but I'm not sure an addition to the documentation wouldn't be just ignored by every user. :)

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 11, 2020
@whitequark
Copy link
Member

Still an issue.

@nojb
Copy link
Contributor

nojb commented May 11, 2020

Random thought: what about getting rid of Genlex and instead bringing ocamllex into the standard library (with a suitable ppx syntax, like sedlex) ?

@garrigue
Copy link
Contributor

I would still advocate for a solution where existing stream parsers still work (maybe using a small external library).
I do not see what would justify forcing people to rewrite their code in a different library.

On the other hand, having a ppx frontend to stream parsers would be even better for me, removing the dependency on camlp5 for preprocessing. Small syntactic changes maybe ok, but please, do not force using a different semantics for parsers.

@xavierleroy
Copy link
Contributor

Genlex is a generic lexer, not a lexer generator, so it's not really comparable with ocamllex and sedlex, even modulo a ppx syntax.

Genlex is useful to provide a lexer for small languages that have roughly the same lexical conventions as OCaml. These can be small languages for teaching (genlex's original uses), or DSLs. The current interface uses streams, because it was designed to work with stream parsers (from Caml Light, then provided by Camlp4/Camlp5), but we could add interfaces to produce sequences (type token seq), for use with hand-written recursive-descent parsers; perhaps even an interface to work with Menhir-generated parsers.

This doesn't mean that Genlex must absolutely be in the standard library; just that it is useful independently of Stream and of Camlp4/Camlp5.

On the contrary, Stream has essentially no legitimate uses outside of Camlp4/Camlp5 stream parsers. The imperative nature is questionable, the implementation is tricky, and a simple sequence or lazy list is generally preferable.

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Jun 18, 2021
@damiendoligez
Copy link
Member

camlp-streams and #10482 are the first step in fixing this.

@github-actions github-actions bot removed the Stale label Jul 2, 2021
@xavierleroy
Copy link
Contributor

Addressed by #10482.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants