Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String.compare not specialized #4965

Closed
vicuna opened this issue Jan 25, 2010 · 5 comments
Closed

String.compare not specialized #4965

vicuna opened this issue Jan 25, 2010 · 5 comments
Assignees
Labels

Comments

@vicuna
Copy link

vicuna commented Jan 25, 2010

Original bug ID: 4965
Reporter: @mmottl
Assigned to: @mshinwell
Status: closed (set by @xavierleroy on 2015-07-24T08:39:02Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 3.11.2
Fixed in version: 4.00.0
Category: ~DO NOT USE (was: OCaml general)
Monitored by: rdouglass @ygrek dvaillancourt furuse yminsky @Chris00 pzimmer @mmottl

Bug description

Unlike e.g. Int32.compare, the String.compare function calls the polymorphic comparison function instead of the one specialized for strings. This can be fixed quickly by adding a type annotation.

@vicuna
Copy link
Author

vicuna commented Jan 27, 2010

Comment author: @mmottl

Btw., it seems that optimizations are not performed even in the presence of type annotations unless the function is eta expanded, i.e. function parameters with type annotations have to be specified explicitly. There doesn't seem to be any good reason why these kinds of optimizations couldn't be done otherwise so we hope this could be fixed, too.

Note that this also extends to the use of higher-order functions. E.g. in this example "compare" will not be optimized to compare integers:

let foo lst = List.fold_left compare 0 lst

But here it will be:

let foo lst = List.fold_left (fun a b -> compare a b) 0 lst

@vicuna
Copy link
Author

vicuna commented Mar 30, 2010

Comment author: @mshinwell

I have done some quick experiments on an x86-64 Intel box and it seems to be marginally slower to add the type annotation and hence use caml_string_compare directly. I think this is probably due to the fact that it uses a different comparison method from compare_val. The former uses memcmp, which on this architecture is I believe expanded by gcc into a repz cmpsb instruction, whereas compare_val uses a simple loop that is not transformed in this way. Do you have performance numbers that indicate any significant performance increase?

I think any optimization in this area that yields anything other than marginal results with high variance across architectures is going to need more careful consideration. One thing that does come to mind is that it might be possible to significantly speed up string equality checking by always padding strings to some boundary (which will admittedly waste a little space) and then using vector instructions to test equality.

@vicuna
Copy link
Author

vicuna commented Mar 30, 2010

Comment author: @mmottl

I guess the comparison method could then be improved, too. String.compare could be implemented to use the fastest generic function available on a platform. This may not necessarily be the best for most use cases (e.g. small strings), but unless we have a statistical study of what works best for most people, this may be a good start.

Concerning padding: strings always occupy full machine words so equality checking should be trivial. The value header stores the number of words, the last word stores a "correction" to words*word_size in its last byte that needs to be subtracted. Since all OCaml strings are NULL-terminated, this always works out. I think the only thing that may need to be changed to make your proposed equality check work directly is to zero out the last word in the string before setting the correction (in caml_alloc_string). Or just check all words but the last one, and handle that one separately.

@vicuna
Copy link
Author

vicuna commented Mar 30, 2010

Comment author: @mshinwell

This is still pretty hypothetical, but it might be that you want to use 128-bit vector operations, so the machine word size might not be sufficient for the padding. But you might gain some speedup without going to the full vector width. As regards the contents of the end of the string, I was thinking of having the alloc function zero the last part.

@vicuna
Copy link
Author

vicuna commented Aug 2, 2012

Comment author: @xavierleroy

I don't know when this PR was addressed, but as of release 4.00, String.compare does call caml_string_compare.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants