Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005731OCamlOCaml backend (code generation)public2012-08-16 20:442013-06-14 22:19
Reporterjeffsco 
Assigned Tomeurer 
PrioritynormalSeveritymajorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformARMOSDebianOS Versionarmel 2.6.32-5
Product Version4.00.0 
Target VersionFixed in Version4.00.1+dev 
Summary0005731: Register d7 trampled by float_of_int
DescriptionCode generation error. A value is loaded into d7. A call to float_of_int generates code that loads s14, which overwrites the value in d7 (low order mantissa). The value in d7 is then used in a later computation, but the value has been destroyed.

If I make small changes to the code, the load of d7 happens *after* the float_of_int computation, and things work OK. Possibly some constraints on instruction orderings aren't being applied properly.
Steps To ReproduceNeed a version of ocamlopt with support for armv7 with hardware floating point (VFPv3).

Then compile the following code:

let rate_pos scounts : float =
    let m_MIN = -999.0
    in let max1s = Array.make 14 m_MIN
    in let max2s = Array.make_matrix 14 14 m_MIN
    in let try_build (k1: int) (m: float) : unit =
        let denom = 12
        in let try1b (sawk1, xct) k =
            let () =
                if max2s.(k1).(k) > m then
                    let adjm = if m <= m_MIN then 0.0 else m
                    in let numer =
                        if k = k1 then 48
                        else if sawk1 then 36
                        else 24
                    in let f = float_of_int numer /. float_of_int denom
                    in let () =
                        if max1s.(k1) <= m_MIN then max1s.(k1) <- 0.0
                    in
                        max1s.(k1) <-
                            max1s.(k1) +. (max2s.(k1).(k) -. adjm) *. f
            in
                if k = k1 then
                    (true, xct)
                else
                    (sawk1, xct + scounts.(k))
        in
            ignore (List.fold_left try1b (false, 0) [])
    in let () = Array.iteri try_build max1s
    in
        0.0

My compile line looks like this:

    $ ocamlopt -ffpu vfpv3 -c -S rate.ml

I'm attaching the full assembly code output. The section with the error contains the code for the following lines:

   in let f = float_of_int numer /. float_of_int denom
   in let () =
       if max1s.(k1) <= m_MIN then max1s.(k1) <- 0.0

The erroneous code looks like this (with added annotations):

       ldr r12, [r2, 0000016] @ r12 <- m_MIN block
       mov r0, r7, asr #1
       ldr r7, [r2, 0000020]
       movs r6, #0xc @ r6 <- denom
       fmsr s14, r6
       fsitod d10, s14 @ d10 <- float_of_int denom
       ldr r6, [r7, #-4]
       fldd d7, [r12, #0] @ d7 <- m_MIN
       ldr r12, [r2, 0000028]
       fmsr s14, r0 @ *** d7 is destroyed here ***
       fsitod d9, s14 @ d9 <- float_of_int numer
       cmp r12, r6, lsr 0000010
       bcs .L111
       add r6, r7, r12, lsl #2
       fldd d6, [r6, #-4] @ d6 <- max1s.(k1)
       fdivd d8, d9, d10
       fcmpd d6, d7 @ *** This comparison fails ***
       fmstat
       bhi .L104
Additional InformationI ran my tests on a stock OCaml 4.00.0 compiler in an emulated Linux ARMEL environment under QEMU on OS X. I don't think this affects the results; I see exactly the same instruction sequence in my experimental cross-compiler builds (compiling for iOS under OS X).

Here is my configure line:

    $ ./configure --host armv5tejl-unknown-linux-gnueabihf

(Need to use gnueabihf to get VFPv3 support.)

Then build ocamlopt as usual.
TagsNo tags attached.
Attached Files? file icon rate.s [^] (8,532 bytes) 2012-08-16 20:44

- Relationships

-  Notes
(0007949)
meurer (developer)
2012-08-17 08:53

Jep, the issue is in the scheduling code, that doesn't care about the destroyed_at_oper constraint. There are several possible fixes for this. Either we adjust schedgen to respect destroyed_at_oper, or don't use scheduling at all on arm (just like amd64/i386), or we adjust the arm code generation to avoid the hidden s14 usage for inttofloat, floattoint and single Load/Store.

Xavier, any opinions on this?
(0007956)
xleroy (administrator)
2012-08-20 09:06

Tentative fix to the scheduler committed in SVN trunk, revision 12865. Needs review and testing before inclusion in 4.00 bugfix branch.
(0007958)
meurer (developer)
2012-08-21 08:39

Jeff, can you test the fix with your iOS backend?
(0007959)
jeffsco (reporter)
2012-08-21 08:41

Yes, I'll test first thing tomorrow (Tuesday morning Seattle time).
(0007974)
jeffsco (reporter)
2012-08-22 19:48

I installed the new schedgen.ml into my modified version of 4.00.0+beta2 and rebuilt the app that was failing before. By eye, the spot that was broken before is fixed (d7 is loaded after the float_of_int calculation). I ran the app, and it works perfectly on the device. I also built and ran our other app (another card game), and it also ran perfectly. The whittled-down example above also looks perfect by eye (as I would expect). So, things are looking excellent to me.

One limitation of this test is that I installed Xavier's scheduling change into my worktree. If you like I can instead apply my iOS patches to the INRIA trunk and try that instead. The results will almost certainly be the same, but it feels slightly more rigorous (though a little more trouble for me to set up). Let me know if you'd like me to run this test (or any other test).

(Also let me know if I'm supposed to click a button somewhere in Mantis to say I've given feedback! I'm hoping these notes are sufficient.)
(0007976)
meurer (developer)
2012-08-23 08:47

Ok, so this is fixed in trunk then. Xavier, do you want to merge that into the 4.00 branch?

Concerning the iOS merge, I already started work on this here:

https://github.com/bmeurer/ocaml-arm/tree/bm/ios [^]

You may want to clone that and contribute the missing bits and pieces. I'll merge everything into upstream once working and time permits.
(0007977)
xleroy (administrator)
2012-08-24 10:15

Fix to schedgen.ml commited in 4.00 bugfix branch, r12876. Will be released with 4.00.1.

- Issue History
Date Modified Username Field Change
2012-08-16 20:44 jeffsco New Issue
2012-08-16 20:44 jeffsco File Added: rate.s
2012-08-17 08:30 meurer Assigned To => meurer
2012-08-17 08:30 meurer Status new => assigned
2012-08-17 08:53 meurer Note Added: 0007949
2012-08-20 09:06 xleroy Note Added: 0007956
2012-08-21 08:39 meurer Note Added: 0007958
2012-08-21 08:39 meurer Status assigned => feedback
2012-08-21 08:41 jeffsco Note Added: 0007959
2012-08-21 08:41 jeffsco Status feedback => assigned
2012-08-22 19:48 jeffsco Note Added: 0007974
2012-08-23 08:47 meurer Note Added: 0007976
2012-08-23 08:47 meurer Status assigned => resolved
2012-08-23 08:47 meurer Fixed in Version => 4.01.0+dev
2012-08-23 08:47 meurer Resolution open => fixed
2012-08-24 10:15 xleroy Note Added: 0007977
2012-08-24 10:15 xleroy Fixed in Version 4.01.0+dev => 4.00.1+dev


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker