Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactoring: avoid long tuples in typing/typeclass.ml by using records with named fields #7927

Closed
vicuna opened this issue Feb 20, 2019 · 9 comments

Comments

@vicuna
Copy link

vicuna commented Feb 20, 2019

Original bug ID: 7927
Reporter: @gasche
Status: acknowledged (set by @gasche on 2019-02-20T15:43:53Z)
Resolution: open
Priority: low
Severity: feature
Category: typing
Tags: junior_job
Monitored by: @nojb

Bug description

typing/typeclass.ml uses long records with many components, only some of which are accessed each time. Example of not-so-nice-looking code includes:

let extract_type_decls
    (_id, _id_loc, clty, _ty_id, cltydef, obj_id, obj_abbr, _cl_id, cl_abbr,
     _arity, _pub_meths, _coe, _expr, required) decls =
  (obj_id, obj_abbr, cl_abbr, clty, cltydef, required) :: decls

let merge_type_decls
    (id, id_loc, _clty, ty_id, _cltydef, obj_id, _obj_abbr, cl_id, _cl_abbr,
     arity, pub_meths, coe, expr, req) (obj_abbr, cl_abbr, clty, cltydef) =
  (id, id_loc, clty, ty_id, cltydef, obj_id, obj_abbr, cl_id, cl_abbr,
   arity, pub_meths, coe, expr, req)

Using records with named fields would improve the readability of this part of the codebase. Different tuples of different length are used in different places of the code, so three things may be useful:

  • trying to factorize only some parts, it's fine if some tuples remain
  • trying to group components in sub-structures that make up a coherent unit (this may require a few attempts)
  • if you end up declaring several record types with closely related fields (it's fine), you may use type-directed field disambiguation to reuse field names among the records if it makes the work easier

(Note: one could think of using object types as extensible records here, but we are trying not to use objects directly in the compiler for bootstrapping reasons, so staying with less-flexible record types is the better idea.)

@ulugbekna
Copy link
Contributor

ulugbekna commented Mar 18, 2019

Hi! I started on this issue by figuring out which functions need refactoring first. Then I put those functions and their large tuples in a table below to see overlaps. We can see that extract_type_decls, merge_type_decls, final_env, check_coercions use same shaped tuples, so they can be combined for sure. I already tried to create a record type that has that tuples' shape but am having hard time giving a proper name.

  1. What's your opinion on a such approach? My objective now is to eliminate large tuples now and reorganize/reuse records later.
var name\function name extract_type_decls merge_type_decls final_env check_coercions initial_env final_decl class_infos
cl         cl cl cl
id _id id id id id id id
id_loc _id_loc id_loc _id_loc id_loc      
clty clty _clty clty clty   clty cl_ty
ty_id _ty_id ty_id ty_id ty_id ty_id ty_id ty_id
cltydef cltydef _cltydef cltydef cltydef   cltydef  
obj_id obj_id obj_id obj_id obj_id obj_id obj_id obj_id
obj_abbr obj_abbr _obj_abbr obj_abbr obj_abbr   obj_abbr  
cl_id _cl_id cl_id cl_id cl_id cl_id cl_id cl_id
cl_abbr cl_abbr _cl_abbr cl_abbr cl_abbr   cl_abbr  
cl_params           cl_params cl_params
arity _arity arity _arity arity   arity  
pub_meths _pub_meths pub_meths _pub_meths pub_meths   pub_meths  
coe _coe coe _coe coercion_locs   coe  
expr _expr expr _expr _expr   expr  
req req req _req req      
obj_params             obj_params
obj_ty             obj_ty
constr_type             constr_type
dummy_class             dummy_class
  1. Also, is there a read on the OCaml class type system to get a better understanding of it? There are few comments in the source file.

  2. Another thing: Am I correct to assume that the goal of this refactor is to get better type inference? Because the code, seemingly, will get only larger because of aliasing record fields in arguments.

Edit: btw, going through HACKING.adoc was very helpful.

@alainfrisch
Copy link
Contributor

Answering 3, the benefits will be:

  • Documenting the meaning of each field.
  • Soft-enforcing the use of standard identifiers, thanks to punning (which makes it more tedious to use ad hoc names).
  • Possibly using the {.. with ..} syntax and dot-notation, which would be more compact than their tuple-equivalent.
  • Making future extensions/refactoring easier.

@gasche
Copy link
Member

gasche commented Mar 18, 2019

Going for the low-hanging fruit, that is the four functions you identified to use the exact same shape, and converting them into a shallow record (instead of trying to reorganize via nesting), sounds like a good first step to me.

Also, is there a read on the OCaml class type system to get a better understanding of it? There are few comments in the source file.

For how to use the object-oriented parts of the language, see for example the Real World OCaml chapter on objects. For the implementation of the type system,

For the theory behind it, it would be the academic article Objective ML: An effective object-oriented extension to ML (journal).

I don't think any of these contain enough explanations to clarify the implementation, though. In any case, the nice thing with refactoring is that we don't really need to understand finely the code we are working with :-)

@muskangarg21
Copy link
Contributor

@Octachron
Working on it.
Thanks

@muskangarg21
Copy link
Contributor

@Octachron @gasche
Just to get me started and have a better idea,

  • Let say I want to refactor val_env met_env par_env somehow, as they are present together a lot of times in the code what approach should I follow ?
  • Another tuple is fields, val_sig, concr_meths, inher only seen twice in code so should I consider refactoring it ?
  • Should I try to increase the use of with as mentioned in a comment above
  • The code isn't much intuitive to understand, how do we increase its readability afterall ?
    Thanks

@Octachron
Copy link
Member

Generally, if a tuple is reused a lot of time, it may make sense to create a record for all elements of the tuple. For instance, here

type class_env = { var: Env.t; meth: Env.t; parents:Env.t }

Then the utility of the refactoring would depend on how well the record fits in the existing context. In particular, if it often needed to deconstruct and reconstruct the record often or not.
Using with is not an aim by itself, it is more an operation that is quite readable on records and not so much with tuples.
The code is not intuitive at all, the aim of this refactoring at all is also to reduce the syntactic noise to make it clearer when something meaningful happen. For instance, spotting where par_env is not so trivial in the current code.

@muskangarg21
Copy link
Contributor

@Octachron
I tried to reduce the tuple (val_env, met_env, par_env) to the record class_env.
For now I have only refactored the code upto line 824 and it needs a lot of work,
I am opening this PR to get your views on this approach and to ask whether the work looks promising or not.
Thanks

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@gasche
Copy link
Member

gasche commented Apr 23, 2021

Note: this was in fact fixed in #8527.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants