Array operations not specialized during inlining. #7441

Closed
vicuna opened this issue Dec 25, 2016 · 2 comments
vicuna commented Dec 25, 2016

Original bug ID: 7441
Reporter: markghayden
Status: acknowledged (set by @xavierleroy on 2017-01-14T15:24:19Z)
Resolution: open
Priority: normal
Severity: feature
Platform: AMD
OS: MacOS
OS Version: 10.12.1
Target version: later
Category: middle end (typedtree to clambda)
Duplicate of: #7442
Has duplicate: #7440

Bug description

It appears that the Array module cannot be used to produce optimal code, even for simple array summation.

let stdlib_sumf v =
  Array.fold_left (+.) 0.0 v
;;

This allocates 2 floating-point values (32 bytes on 64-bit) per iteration.

Experiments were run on the 4.05 trunk with -O3 and -unbox-closures. To sum an array with Array.fold_left without allocating, it appears necessary to hand-write a version of Array.fold_left that uses typecasts to specialize it for floating-point arrays, or to use some similar method.
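As a point of comparison for the paragraph above, here is a minimal sketch of a hand-written float summation. It is not the code from the attached file (which is not reproduced in this thread), and the name `sumf_loop` is made up for illustration. It uses a type annotation rather than typecasts, on the assumption that a statically known `float array` plus a local accumulator is enough for ocamlopt to keep the running sum unboxed; that assumption has not been checked against the generated assembly here.

```ocaml
(* Hypothetical hand-specialized sum; NOT the code from the attached file.
   The [float array] annotation fixes the array kind at compile time, and
   the local [ref] never escapes, so the native compiler can usually keep
   the accumulator unboxed (an assumption, not verified here). *)
let sumf_loop (v : float array) : float =
  let acc = ref 0.0 in
  for i = 0 to Array.length v - 1 do
    acc := !acc +. v.(i)
  done;
  !acc
```

The trade-off is that the generic higher-order call is gone, so this only illustrates the kind of code the inliner would ideally produce from `Array.fold_left (+.) 0.0 v`.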

The same applies to summing an array of integers. With Array.fold_left no allocation occurs, but the assembly code generated for the loop includes checks on the type of the array and includes code (never executed) for allocating a floating-point value. As with floats, creating a specialized version of Array.fold_left removes the checks on the type of the array.
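A specialized fold for integers could look roughly like the sketch below. The names `fold_left_int` and `sumi_specialized` are hypothetical and do not come from the attached file. The idea is that annotating the argument as `int array` gives the compiler a static array kind, so the element access should no longer need the runtime float-array check described above; whether the checks actually disappear depends on the compiler version and flags.

```ocaml
(* Hypothetical copy of Array.fold_left restricted to int arrays;
   NOT the code from the attached file. *)
let fold_left_int (f : int -> int -> int) (init : int) (v : int array) : int =
  let acc = ref init in
  for i = 0 to Array.length v - 1 do
    (* The index is always within bounds here, so unsafe_get is safe. *)
    acc := f !acc (Array.unsafe_get v i)
  done;
  !acc

(* Integer summation built on the specialized fold. *)
let sumi_specialized (v : int array) : int =
  fold_left_int (+) 0 v
```

Compiling with `-S`, as in the build output in this report, and comparing the loop body against the `Array.fold_left` version is the way to confirm the difference.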

Steps to reproduce

Use the attached file. The output below shows each test case and the number of bytes allocated when summing an array of 10,000 floats. All but the inline2 case allocate 32 bytes (2 floats) per iteration. For integers, review the resulting assembly code.

Output from running the program:

make -w -k -j4
make: Entering directory `/Users/mhayden/proj/ocaml/flambda'
/Users/mhayden/.opam/macos.dev/bin/ocamlopt -O3 -unbox-closures -c -S a.ml
/Users/mhayden/.opam/macos.dev/bin/ocamlopt -O3 -unbox-closures -o a a.cmx
./a
stdlib 320096
inline0 320096
inline1 320096
inline2 112
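Since the attached file is not included in this thread, the following is only a guess at how figures like those above can be collected. It assumes the measurement is a difference of `Gc.allocated_bytes ()` taken before and after the call; the helper names `allocated_by` and `report` are invented for this sketch. The measurement itself allocates a little, which would be one plausible source of a small constant offset in the totals.

```ocaml
(* Function under test, as given in the bug description. *)
let stdlib_sumf v = Array.fold_left (+.) 0.0 v

(* Hypothetical helper: bytes allocated while running [f v].
   Gc.allocated_bytes returns a float, so reading it adds a small
   constant amount of allocation to every measurement. *)
let allocated_by f v =
  let before = Gc.allocated_bytes () in
  (* opaque_identity keeps the optimizer from discarding the call. *)
  ignore (Sys.opaque_identity (f v));
  let after = Gc.allocated_bytes () in
  after -. before

(* Print one line in the "name bytes" shape used in the report. *)
let report name f v =
  Printf.printf "%s %.0f\n" name (allocated_by f v)

let () =
  let v = Array.make 10_000 1.0 in
  report "stdlib" stdlib_sumf v
```

The exact numbers depend on the compiler version, the flags, and whether the program is built as native code or bytecode, so no expected output is given.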

vicuna commented Jan 14, 2017

Comment author: @xavierleroy

See my comment in #7440

github-actions bot commented May 9, 2020

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.
