A format specifier for bytes #6429

vicuna · 2014-05-17T05:47:45Z

Original bug ID: 6429
Reporter: @whitequark
Status: confirmed (set by @damiendoligez on 2014-05-21T15:32:19Z)
Resolution: open
Priority: normal
Severity: feature
Target version: undecided
Category: standard library
Monitored by: braibant @diml

Bug description

Without a %S-like format specifier for Bytes, debugging becomes extremely annoying--the code is littered with bogus conversions everywhere. Perhaps it's possible to provide one?

It would also make sense to provide its non-escaped equivalent ("%s").

vicuna · 2014-05-17T05:59:34Z

Comment author: @gasche

I have no idea what a good syntax would be. My only idea would be to reuse "%#s" (currently treated as "%s", but slated to be outlawed in 4.02 as it doesn't mean anything), and it's mediocre at best.

vicuna · 2014-05-17T06:03:40Z

Comment author: @whitequark

Well, you could disambiguate the conflict with %b/%B by using the second letter: %y/%Y, in the same way as e.g. options for Unix tools are disambiguated. This is what I would expect, at least.

vicuna · 2014-05-17T06:32:46Z

Comment author: @gasche

Note that in the meantime, we could make sure that

"%a" Bytes.print by
"%a" Bytes.to_string by

works (by adding the relevant functions if need be for some *printf function), which would already have reasonable readability.
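A minimal sketch of what such "%a" printers could look like today, written against the existing standard library. The names `output_bytes_escaped` and `bytes_to_escaped` are hypothetical (they are not stdlib functions), and escaping via `String.escaped` is only one possible choice of output syntax.

```ocaml
(* Hypothetical helpers; neither is part of the standard library. *)

(* For Printf.printf / Printf.fprintf, "%a" expects a printer of type
   out_channel -> 'a -> unit. *)
let output_bytes_escaped (oc : out_channel) (b : bytes) : unit =
  output_string oc (String.escaped (Bytes.to_string b))

(* For Printf.sprintf, "%a" expects a printer of type
   unit -> 'a -> string. *)
let bytes_to_escaped () (b : bytes) : string =
  String.escaped (Bytes.to_string b)

let () =
  let by = Bytes.of_string "hi\n" in
  Printf.printf "by = %a\n" output_bytes_escaped by;
  print_endline (Printf.sprintf "by = %a" bytes_to_escaped by)
```

Note that the printer type differs between the *printf families, which is why "adding the relevant functions" would likely mean one function per family.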

vicuna · 2014-05-21T15:31:48Z

Comment author: @damiendoligez

While we decide which letter to use, you should use this for debugging without too much pain:

let (!!) = Bytes.unsafe_to_string;;
Printf.printf "hello %s\n" !!my_byte_sequence;;

For the letter, I think Y is a good candidate. My first idea was Z but we might want to use that for bignums at some point in the future.

As for %#s, that would make the type depend on the format's flags rather than the letter. A very bad idea.

vicuna · 2014-09-21T14:12:01Z

Comment author: @gasche

I considered implementing this, but I'm not happy with having both %y and %Y.

The problem is that the usual semantics of the big-letter version is "as written in OCaml source code", so it would seem natural that whichever output syntax is chosen for %Y also produces valid OCaml literals; but we have no literal syntax for bytes.

On the other hand, the escaped-printing behavior of %S is certainly more useful than the non-escaped behavior of %s for bytes, for the "byte sequences" applications that have no reason to stay in the printable ASCII range; so if we had only one formatter for bytes, it should probably have the semantics of %S.
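To illustrate the difference for byte sequences outside the printable ASCII range, here is a small sketch using the existing "%S" conversion on a copy of the bytes. The helper name `show_escaped` is made up for this example; it only shows what %S-like semantics would print, not how a dedicated conversion would be implemented.

```ocaml
(* Hypothetical helper: render bytes the way "%S" renders a string,
   i.e. quoted and with non-printable bytes escaped. *)
let show_escaped (b : bytes) : string =
  Printf.sprintf "%S" (Bytes.to_string b)

let () =
  let b = Bytes.of_string "\x00\xffA" in
  (* With "%s" the raw bytes 0x00 and 0xff would be written as-is;
     with "%S" they come out as decimal escapes. *)
  print_endline (show_escaped b)
```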

vicuna · 2015-02-25T22:37:39Z

Comment author: @damiendoligez

I think the parallel with %s/%S is quite natural, so if you want to implement only one, it should be %Y...

vicuna · 2016-04-21T09:13:44Z

Comment author: @whitequark

I think implementing just %Y is a good idea.

github-actions · 2020-05-13T04:19:21Z

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

gasche · 2020-05-13T07:03:28Z

I believe this is still a relevant feature request.

Thinking about it again, it is not clear that there is no value in a %y format that would output the bytes literally, without escaping -- for example if people have bytes that contain terminal escapes or things like that.
We could also support %xy to print the bytes in hexadecimal.

gasche · 2020-05-13T07:05:01Z

Marking this as "newcomer job advanced": the Format machinery uses advanced types, one has to be familiar with GADTs to work on them, but then it is doable for a newcomer to add support for a new conversion by imitating the existing code.
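For readers unfamiliar with the style of code involved, here is a toy sketch of a GADT-indexed format type with a bytes conversion. It is deliberately not the standard library's definition: the type `fmt`, its constructors, and the functions `kprint` and `sprint` are all invented for illustration, and the real machinery is considerably larger.

```ocaml
(* Toy format type: the first parameter is the type of the printing
   function, the second is the final result type. *)
type (_, _) fmt =
  | End : ('r, 'r) fmt
  | Lit : string * ('a, 'r) fmt -> ('a, 'r) fmt
  | Str : ('a, 'r) fmt -> (string -> 'a, 'r) fmt
  | Byt : ('a, 'r) fmt -> (bytes -> 'a, 'r) fmt  (* the "new" conversion *)

(* Interpret a format, accumulating output in [acc] and passing the
   final string to the continuation [k]. *)
let rec kprint : type a r. (string -> r) -> string -> (a, r) fmt -> a =
  fun k acc fmt ->
    match fmt with
    | End -> k acc
    | Lit (s, rest) -> kprint k (acc ^ s) rest
    | Str rest -> fun s -> kprint k (acc ^ s) rest
    | Byt rest ->
      fun b -> kprint k (acc ^ String.escaped (Bytes.to_string b)) rest

let sprint fmt = kprint (fun s -> s) "" fmt

let () =
  print_endline (sprint (Lit ("payload=", Byt End)) (Bytes.of_string "a\n"))
```

Adding a conversion in this toy amounts to one new constructor plus one new match case, which is the "imitate the existing code" pattern described above.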

nojb · 2020-05-13T07:36:59Z

> We could also support %xy to print the bytes in hexadecimal.

This would actually be quite useful.
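Until such a conversion exists, hexadecimal output can be approximated with a small helper. This is only a sketch: `hex_of_bytes` is a hypothetical name, and the lowercase, separator-free layout is an assumption rather than anything decided in this thread.

```ocaml
(* Hypothetical helper: two lowercase hex digits per byte, no separators. *)
let hex_of_bytes (b : bytes) : string =
  let buf = Buffer.create (2 * Bytes.length b) in
  Bytes.iter (fun c -> Printf.bprintf buf "%02x" (Char.code c)) b;
  Buffer.contents buf

let () =
  Printf.printf "data = %s\n" (hex_of_bytes (Bytes.of_string "\x48\x85\xf6"))
```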

gasche · 2020-05-13T08:11:03Z

I should point out that the suggestion is inspired by @dra27's work on #9446.

shindere · 2022-08-29T09:12:21Z

Okay, I went through the whole conversation and, since the feature felt sensible to me, I removed the Stale label and assigned it to myself, without knowing precisely when I will be able to work on it. I also agree with @gasche that being able to do literal printing would be helpful, especially if the purpose is to output binary files, although I assume there are already other ways to achieve this. To make this job really doable for newcomers, would it be possible / easy to provide a few hints on which steps to follow / which files to modify to implement the feature, as @gasche had so nicely done in an issue or PR about adding extension points to let operators or something like that? Sorry for not being able to be more precise on that; locating issues and PRs is at the moment not that easy for me, but I'm sure somebody will see what I mean, at least @gasche himself!

XVilka · 2023-07-03T09:54:45Z

Just food for thought: for debugging, the sparse hexadecimal format might be more useful. At least, it's quite useful for reverse engineering tasks:

[0x0040bbe0]> px
- offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x0040bbe0  4885 f674 6b55 5348 83ec 0848 8b46 5048  H..tkUSH...H.FPH
0x0040bbf0  8b2d aa75 2200 488b 1d53 a422 0048 8905  .-.u".H..S.".H..
0x0040bc00  9c75 2200 488b 4620 4885 c074 0d48 8338  .u".H.F H..t.H.8
0x0040bc10  00ba 0000 0000 480f 44c2 4889 fe48 c7c2  ......H.D.H..H..
0x0040bc20  ffff ffff 31ff 4889 0523 a422 00e8 bef4  ....1.H..#."....
0x0040bc30  ffff 4889 2d67 7522 0048 891d 10a4 2200  ..H.-gu".H....".
0x0040bc40  4883 c408 5b5d c366 0f1f 8400 0000 0000  H...[].f........
0x0040bc50  4889 fe48 c7c2 ffff ffff 31ff e98f f4ff  H..H......1.....
0x0040bc60  ff0f 1f44 0000 662e 0f1f 8400 0000 0000  ...D..f.........
0x0040bc70  4155 4154 5553 4883 ec08 488b 2df7 7622  AUATUSH...H.-.v"
0x0040bc80  0048 8b1d f876 2200 48c7 05e5 7622 0000  .H...v".H...v"..
0x0040bc90  0000 0048 85f6 7478 488b 4650 4c8b 2dfd  ...H..txH.FPL.-.
0x0040bca0  7422 004c 8b25 a6a3 2200 4889 05ef 7422  t".L.%..".H...t"
0x0040bcb0  0048 8b46 2048 85c0 740d 4883 3800 ba00  .H.F H..t.H.8...
0x0040bcc0  0000 0048 0f44 c248 89fe 48c7 c2ff ffff  ...H.D.H..H.....
0x0040bcd0  ff31 ff48 8905 76a3 2200 e811 f4ff ff4c  .1.H..v."......L
[0x0040bbe0]> pxi
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  40bbe0: .H 85 f6 .t .k .U .S .H 83 ec 08 .H 8b .F .P .H
  40bbf0: 8b .- aa .u ."    .H 8b 1d .S a4 ."    .H 89 05
  40bc00: 9c .u ."    .H 8b .F .  .H 85 c0 .t 0d .H 83 .8
  40bc10:    ba             .H 0f .D c2 .H 89 fe .H c7 c2
  40bc20: ## ## ## ## .1 ## .H 89 05 .# a4 ."    e8 be f4
  40bc30: ## ## .H 89 .- .g .u ."    .H 89 1d 10 a4 ."
  40bc40: .H 83 c4 08 .[ .] c3 .f 0f 1f 84
  40bc50: .H 89 fe .H c7 c2 ## ## ## ## .1 ## e9 8f f4 ##
  40bc60: ## 0f 1f .D       .f .. 0f 1f 84
  40bc70: .A .U .A .T .U .S .H 83 ec 08 .H 8b .- f7 .v ."
  40bc80:    .H 8b 1d f8 .v ."    .H c7 05 e5 .v ."
  40bc90:          .H 85 f6 .t .x .H 8b .F .P .L 8b .- fd
  40bca0: .t ."    .L 8b .% a6 a3 ."    .H 89 05 ef .t ."
  40bcb0:    .H 8b .F .  .H 85 c0 .t 0d .H 83 .8    ba
  40bcc0:          .H 0f .D c2 .H 89 fe .H c7 c2 ## ## ##
  40bcd0: ## .1 ## .H 89 05 .v a3 ."    e8 11 f4 ## ## .L
  40bce0 ]
[0x0040bbe0]>

The first is the "normal" hex, the second is the sparse "hexII" format invented by Ange Albertini.

See more information at https://speakerdeck.com/ange/no-more-dumb-hex
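As a rough idea of how the first ("normal") layout could be produced from OCaml bytes, here is a sketch of a plain hex dump with an offset column and an ASCII column. The function name `hexdump` and the exact column layout are assumptions made for this example; it does not reproduce the tool output above byte for byte, and it does not attempt the hexII format.

```ocaml
(* Hypothetical helper: classic hex dump, [width] bytes per row
   (assumes width > 0). Non-printable bytes show as '.' in the
   ASCII column. *)
let hexdump ?(width = 16) (b : bytes) : string =
  let buf = Buffer.create 256 in
  let len = Bytes.length b in
  let rec row off =
    if off < len then begin
      let n = min width (len - off) in
      Printf.bprintf buf "0x%08x  " off;
      for i = 0 to width - 1 do
        if i < n then
          Printf.bprintf buf "%02x " (Char.code (Bytes.get b (off + i)))
        else
          Buffer.add_string buf "   "  (* pad a short final row *)
      done;
      Buffer.add_char buf ' ';
      for i = 0 to n - 1 do
        let c = Bytes.get b (off + i) in
        Buffer.add_char buf (if c >= ' ' && c <= '~' then c else '.')
      done;
      Buffer.add_char buf '\n';
      row (off + width)
    end
  in
  row 0;
  Buffer.contents buf

let () = print_string (hexdump (Bytes.of_string "H\x85\xf6tkUSH\x83\xec\x08H"))
```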

vicuna added stdlib feature-wish labels Mar 14, 2019

github-actions bot added the Stale label May 13, 2020

gasche added newcomer-job-advanced and removed Stale labels May 13, 2020

github-actions bot added the Stale label Jul 21, 2021

damiendoligez removed the Stale label Aug 17, 2021

wyn mentioned this issue Nov 24, 2021

[WIP] Added format specifier %y for bytes #10791

Open

github-actions bot added the Stale label Aug 19, 2022

shindere removed the Stale label Aug 29, 2022

shindere self-assigned this Aug 29, 2022

shindere removed their assignment Jul 11, 2023
