ocamlyacc with friendly dollar-symbols

From: Thierry Bravier (thierry.bravier@dassault-aviation.fr)
Date: Fri Dec 12 1997 - 15:36:27 MET


Message-Id: <34914BEB.3940@dassault-aviation.fr>
Date: Fri, 12 Dec 1997 15:36:27 +0100
From: Thierry Bravier <thierry.bravier@dassault-aviation.fr>
To: caml-list@inria.fr
Subject: ocamlyacc with friendly dollar-symbols

Dear ocamlers,

Here is a small extension to ocamlyacc.

It comes as a short patch to file reader.c in the standard
distribution of ocamlyacc.

This extension provides two features:

1/ `# <line> "file.mly"' directives are generated in file.ml

   This is useful when the ML code written in the header, the trailer,
   or the semantic actions fails to compile: the error message now
   reports the problem in the source, not in the generated code.

2/ In semantic actions, dollar-number values are still available (e.g. $1);
   they are now complemented by dollar-symbol values (e.g. $expr).

   This convenience makes the code clearer and more robust to changes
   in the grammar.

   The syntax is easy:
      `$symbol' refers to `symbol' in the right hand side of the rule,
         provided there is only one occurrence of `symbol'; this is
         most often the case, and it is checked.
      `$symbol#i' (where i is a positive integer literal) refers to the
         i-th occurrence of `symbol' in the rule.
      `$i' (where i is a positive integer literal) still refers to the
         i-th symbol of the rule.
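
The lookup rules above can be sketched as a small standalone C function. This is only an illustration, not the code from the patch: the name `resolve_symbol`, the plain array of symbol names, and the convention of returning 0 on failure are all assumptions made for the example.

```c
#include <string.h>

/* Hypothetical helper: resolve a `$symbol' or `$symbol#i' reference
   against the right hand side of a rule.
   rhs[0..n-1] holds the symbol names of the rule, in order.
   occurrence == 0 stands for plain `$symbol' (the name must be unique);
   occurrence >= 1 stands for `$symbol#occurrence'.
   Returns the 1-based position in the rule, or 0 when the reference is
   missing or ambiguous. */
int resolve_symbol(const char *rhs[], int n, const char *name, int occurrence)
{
    int found = 0;   /* position of the single match, 0 while none */
    int seen = 0;    /* number of matching symbols seen so far */
    int i;

    for (i = 1; i <= n; i++) {
        if (strcmp(name, rhs[i - 1]) != 0)
            continue;
        seen++;
        if (occurrence > 0) {
            if (seen == occurrence)
                return i;        /* the requested occurrence */
        } else if (seen == 1) {
            found = i;           /* first match for plain `$symbol' */
        } else {
            return 0;            /* second match: ambiguous reference */
        }
    }
    return found;                /* stays 0 if nothing suitable matched */
}
```

For a rule whose right hand side is `expr PLUS expr', this sketch resolves `$PLUS' to position 2 and `$expr#2' to position 3, and rejects a bare `$expr' as ambiguous.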

   BUG: Error messages should be adapted to the new `$symbol#i'/`$symbol'
   feature.

   SEE ALSO: This feature is provided in SML/NJ .grm syntax (equivalent
   of .mly files) and can also be found in camlp4 (where semantic values
   can even be named with free identifiers).

   EXAMPLE:
   Here is the modified calc parser from the ocaml user manual:
   /* File parser.mly */
   %token <int> INT
   %token PLUS MINUS TIMES DIV
   %token LPAREN RPAREN
   %token EOL
   %left PLUS MINUS /* lowest precedence */
   %left TIMES DIV /* medium precedence */
   %nonassoc UMINUS /* highest precedence */
   %start main /* the entry point */
   %type <int> main
   %%
   main:
       expr EOL { $expr }
   ;
   expr:
       INT { $INT }
     | LPAREN expr RPAREN { $expr }
     | expr PLUS expr { $expr#1 + $expr#2 }
     | expr MINUS expr { $expr#1 - $expr#2 }
     | expr TIMES expr { $expr#1 * $expr#2 }
     | expr DIV expr { $expr#1 / $expr#2 }
     | MINUS expr %prec UMINUS { - $expr }
   ;
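
To see what happens to an action such as `{ $expr#1 + $expr#2 }', here is a rough standalone sketch of the rewriting step. The patch prints `dollar__N' for a reference to position N; everything else here (the names `position_of' and `rewrite_action', the fixed-size name buffer, the -1 error return) is an assumption made for the example. Plain `$1' references are copied through unchanged.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: 1-based position of the k-th occurrence of name
   in rhs (k == 0 means the only occurrence); 0 if there is none or if
   a plain reference is ambiguous. */
static int position_of(const char *rhs[], int n, const char *name, int k)
{
    int i, seen = 0, found = 0;

    for (i = 1; i <= n; i++) {
        if (strcmp(name, rhs[i - 1]) != 0)
            continue;
        seen++;
        if (k > 0 && seen == k)
            return i;
        if (k == 0 && seen == 1)
            found = i;
    }
    return (k == 0 && seen == 1) ? found : 0;
}

/* Hypothetical helper: copy action into out (capacity size), replacing
   each `$name' or `$name#k' with dollar__<position>.
   Returns 0 on success, -1 on a bad reference or if out is too small. */
int rewrite_action(const char *rhs[], int n, const char *action,
                   char *out, size_t size)
{
    size_t len = 0;
    const char *p = action;

    while (*p != '\0') {
        if (p[0] == '$' && (isalpha((unsigned char)p[1]) || p[1] == '_')) {
            char name[64];
            size_t m = 0;
            int k = 0, pos, w;

            p++;                                   /* skip the dollar */
            while ((isalnum((unsigned char)*p) || *p == '_')
                   && m < sizeof name - 1)
                name[m++] = *p++;
            name[m] = '\0';
            if (p[0] == '#' && isdigit((unsigned char)p[1])) {
                for (p++; isdigit((unsigned char)*p); p++)
                    k = 10 * k + (*p - '0');       /* occurrence index */
            }
            pos = position_of(rhs, n, name, k);
            if (pos == 0)
                return -1;                         /* unknown or ambiguous */
            w = snprintf(out + len, size - len, "dollar__%d", pos);
            if (w < 0 || (size_t)w >= size - len)
                return -1;                         /* output truncated */
            len += (size_t)w;
        } else {
            if (len + 1 >= size)
                return -1;
            out[len++] = *p++;                     /* ordinary character */
        }
    }
    if (len >= size)
        return -1;
    out[len] = '\0';
    return 0;
}
```

With the right hand side `expr PLUS expr', the action text `$expr#1 + $expr#2' comes out of this sketch as `dollar__1 + dollar__3'.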

Hope you find it useful.

Thierry Bravier Dassault Aviation - DGT / DPR / DESA
78, Quai Marcel Dassault F-92214 Saint-Cloud Cedex - France
Telephone : (33) 01 47 11 53 07 Telecopie : (33) 01 47 11 52 83
E-Mail : mailto:thierry.bravier@dassault-aviation.fr

Here is a diff output of the modified reader.c file from version
/* $Id: reader.c,v 1.13 1997/11/13 09:57:52 xleroy Exp $ */
in the 1.06 distribution.

By default both features are enabled; they can be switched off by
compiling with -DNO_SOURCE_LOCATION and/or -DNO_DOLLAR_IDENT.
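
As an illustration of the first switch, the standalone sketch below reuses the two format strings from the patch and formats one location marker. The helper name `format_line_marker' and the buffer interface are assumptions made for the example; the patch itself writes directly to the output file with fprintf.

```c
#include <stdio.h>

/* Same two format strings as in the patch: compiling with
   -DNO_SOURCE_LOCATION selects the comment form instead of the
   line-number directive. */
#ifndef NO_SOURCE_LOCATION
static const char line_format[] = "\n# %d \"%s\"\n";
#else
static const char line_format[] = "(* Line %d, file %s *)\n";
#endif

/* Hypothetical helper: format the marker for (lineno, file) into buf.
   Returns what snprintf returns, i.e. the length that was needed. */
int format_line_marker(char *buf, size_t size, int lineno, const char *file)
{
    return snprintf(buf, size, line_format, lineno, file);
}
```

Built without the flag, line 12 of parser.mly gives an empty line followed by `# 12 "parser.mly"', which the OCaml lexer reads as a line-number directive, so later errors are reported against parser.mly. Built with -DNO_SOURCE_LOCATION, the same call gives `(* Line 12, file parser.mly *)'.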

% diff -C 1 reader.distrib.c reader.c
*** reader.distrib.c Thu Nov 13 10:57:52 1997
--- reader.c Fri Dec 12 14:31:43 1997
***************
*** 49,51 ****
--- 49,55 ----
  
+ #ifndef NO_SOURCE_LOCATION
+ char line_format[] = "\n# %d \"%s\"\n";
+ #else
  char line_format[] = "(* Line %d, file %s *)\n";
+ #endif
  
***************
*** 322,323 ****
--- 326,330 ----
  
+ #ifndef NO_SOURCE_LOCATION
+     fprintf(f, line_format, lineno+1, input_file_name);
+ #endif
      if (*cptr == '\n')
***************
*** 1187,1189 ****
--- 1194,1276 ----
  
+ #ifndef NO_DOLLAR_IDENT
+ int
+ get_ident(int n)
+ {
+     register int c = *cptr;
+     char *t_cptr = cptr;
  
+     if (IS_IDENT(c))
+     {
+         cinc = 0;
+         for (;;) {
+             if (IS_IDENT(c)) {
+                 cachec(c);
+             } else
+                 break;
+             c = *++cptr;
+         }
+         cachec(NUL);
+         {
+             int sharp;
+             char *ident = MALLOC(strlen (cache) + 1);
+             strcpy(ident, cache);
+             if (cptr[0] == '#' && isdigit(cptr[1])) {
+                 register int c;
+                 ++cptr;
+                 sharp = 0;
+                 for (c = *cptr; isdigit(c); c = *++cptr)
+                     sharp = 10*sharp + (c - '0');
+             } else {
+                 sharp = 0;
+             }
+             {
+                 int i;
+                 bucket *item = 0;
+                 if (sharp) {
+                     int s = 0;
+                     for (i = 1; i <= n; i++) {
+                         item = pitem[nitems + i - n - 1];
+                         if (!strcmp (ident, item->name)) {
+                             if (++s == sharp) {
+                                 /* found sharp-th ident occurrence */
+                                 break;
+                             }
+                         }
+                     }
+                 } else {
+                     int found = 0;
+                     for (i = 1; i <= n; i++) {
+                         item = pitem[nitems + i - n - 1];
+                         if (!strcmp (ident, item->name)) {
+                             if (found) {
+                                 /* more than one name matches ident: error */
+                                 found = 0;
+                                 break;
+                             } else {
+                                 /* first time a name matches ident */
+                                 found = i;
+                             }
+                         }
+                     }
+                     if (found) {
+                         i = found;
+                         item = pitem[nitems + i - n - 1];
+                     } else {
+                         i = 0;
+                     }
+                 }
+                 if (i <= 0 || i > n)
+                     unknown_rhs(i);
+                 if (item->class == TERM && !item->tag)
+                     illegal_token_ref(i, item->name);
+                 FREE (ident);
+                 return i;
+             }
+         }
+     }
+     syntax_error(lineno, line, t_cptr);
+     /*NOTREACHED*/
+     return 0;
+ }
+ #endif
  void copy_action(void)
***************
*** 1190,1191 ****
--- 1277,1281 ----
  {
+ #ifndef NO_SOURCE_LOCATION
+     int start_column = cptr - line + 1;
+ #endif
      register int c;
***************
*** 1227,1228 ****
--- 1317,1327 ----
      fprintf(f, "\tObj.repr((");
+ #ifndef NO_SOURCE_LOCATION
+     fprintf(f, line_format, lineno+1, input_file_name);
+     {
+         int i;
+         for (i = 0; i < start_column; i++) {
+             fprintf(f, " ");
+         }
+     }
+ #endif
  
***************
*** 1248,1249 ****
--- 1347,1356 ----
        }
+ #ifndef NO_DOLLAR_IDENT
+       if (isalnum(cptr[1]))
+       {
+           ++cptr;
+           fprintf(f, "dollar__%d", get_ident (n));
+           goto loop;
+       }
+ #endif
      }




This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:13 MET