From: Kuba Ober <ober.14@o...>
Subject: Re: [Caml-list] Re: Google summer of Code proposal
On Mar 21, 2009, at 5:28 PM, Michael Ekstrand wrote:

> Joel Reymont <joelr1@gmail.com> writes:
>> On Mar 21, 2009, at 1:38 PM, Jon Harrop wrote:
>>
>>> . You will succumb to ocamlopt's current run-time representation
>>> which is
>>> objectively inefficient (e.g. boxing floats, tuples, records) and
>>> was only
>>> chosen because the compiler lacks capabilities that LLVM already
>>> provides for
>>> you (primarily JIT compilation).
>>
>> This is probably a stupid suggestion but why not have OCaml directly
>> generate machine code, without the use of assembler and linker?

This won't help with anything -- why would it? How is this suggestion
relevant to the current discussion?

> Because that would duplicate the code and logic provided by the  
> system's
> assembler and linker (esp. linker).  For every platform (and there are
> many possible combinations!).

The only problem is that the usual notion of a "linker" is somewhat broken,
even when what we're after is an embedded platform where all of the linking
is done before the code hits the target (no run-time linking!).

I will show a trivial example where it fails badly. The example is in C.

Suppose you have two platform-specific registers used to set the DMA address.
The platform has 12-bit addresses.

#define DMAL (*((volatile unsigned char*)0xFFA))
#define DMAH (*((volatile unsigned char*)0xFF0))

DMAL takes the whole least significant byte of the address. DMAH takes the
most significant nibble (bits 11:8) of the address, and the nibble must be
left-aligned (it occupies bits 7:4 of DMAH).

Now, in your code, you want to point the DMA at a static buffer. Thusly:

void foo(void)
{
   static char buffer[128];
   DMAL = (unsigned char)(unsigned int)&buffer;    /* low 8 bits of the address */
   DMAH = (((unsigned int)&buffer) >> 4) & 0xF0;   /* bits 11:8, left-aligned into bits 7:4 */
...
}

Now, while all of the addresses are known constants, there's usually no way,
in the object file, to tell the linker the expression for the value of DMAH!

Thus, instead of what amounts to two "load immediate" instructions, you have
one immediate load, followed by a lot of brouhaha to shift and mask what
amounts to constants known at compile/link time. That's what's usually called
premature pessimization.
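
To make the contrast concrete, here is a rough C rendering of the two code
shapes; the exact instruction sequences obviously depend on the target, so
take this as a sketch only (the register macros are repeated from above so it
stands alone):

/* Sketch of the two code shapes for a hypothetical 12-bit target. */
#define DMAL (*((volatile unsigned char*)0xFFA))
#define DMAH (*((volatile unsigned char*)0xFF0))

static char buffer[128];

void dma_setup_as_emitted_today(void)
{
   /* The object format can only ask the linker to patch in the raw
      address of buffer, so the truncation, shift and mask all survive
      into the generated code and run every time. */
   unsigned int addr = (unsigned int)&buffer;
   DMAL = (unsigned char)addr;
   DMAH = (unsigned char)((addr >> 4) & 0xF0);
}

/* What two "load immediate" stores would amount to, if the linker
   could evaluate expressions over symbol addresses (hypothetical):
      DMAL <- buffer & 0xFF          ; constant computed at link time
      DMAH <- (buffer >> 4) & 0xF0   ; constant computed at link time */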

That's one issue with contemporary compile/assemble/link systems. Never mind
that even if the assemblers did support such "elaborate" expressions using
link-time constants, the compilers don't generate them anyway!

So, writing the code in assembly won't help you! It's only at link time that
you know where buffer[] will end up... You can of course hack around it and
put the buffer at a fixed address -- some C implementations even have special
ways of doing that (say, via gcc's __attribute__ mechanism). That will
backfire as soon as you have to interface more pieces of code: you'll be
spending your time moving stuff around just to keep the memory regions from
overlapping -- and that's the linker's job, really.
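
For reference, a sketch of that hack with gcc's section attribute; the
section name, the 0x0200 address, and the linker-script line are all made up
for illustration:

/* Pin the buffer by putting it in a dedicated section... */
static char dma_buffer[128] __attribute__((section(".dma_buf")));

/* ...and place that section at a fixed address in the linker script
   (illustrative fragment only):

      .dma_buf 0x0200 : { *(.dma_buf) }

   With the address fixed by hand, the two writes become constants:

      DMAL = 0x00;   // low byte of 0x0200
      DMAH = 0x20;   // bits 11:8 of 0x0200, left-aligned

   ...but now every other piece of code that wants its own fixed region
   has to be juggled around this one by hand, which is exactly the
   bookkeeping the linker exists to do. */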

Heck -- many, many assemblers will silently generate utterly wrong code for
the load into DMAH, *if* you code this in assembly rather than C!! I've got
at least a dozen production, shipping assemblers that silently trip and fall
on code like the above. Of course, they only fail if you code it in assembly,
as the C compiler won't even attempt such, um, "trickery". Silly stuff,
really, requiring no advanced optimization theories, just doing one's darn
job well...

You have a choice: either put some ASTs into the object file, for whenever
expressions involving link-time constants come up, or you get rid of the
whole compile-assemble-link separation and get everything into one place.
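
To sketch what the first option might look like: an object-file relocation
record that carries a little expression tree instead of the usual
(symbol + addend) pair. All the names and fields below are made up; no
existing object format is implied:

/* Hypothetical "expression relocation". */
enum expr_op { EXPR_SYM, EXPR_CONST, EXPR_ADD, EXPR_SHR, EXPR_AND };

struct expr_node {
    enum expr_op op;
    unsigned long value;             /* constant, or symbol-table index for EXPR_SYM */
    struct expr_node *lhs, *rhs;     /* operands for the binary ops, else NULL */
};

/* The linker evaluates expr once the final address of every referenced
   symbol is known, then writes the result into the section. For DMAH
   above, expr would encode (sym(buffer) >> 4) & 0xF0 and width would
   be 1. */
struct expr_reloc {
    unsigned long offset;            /* where in the section to patch */
    unsigned char width;             /* how many bytes to write */
    struct expr_node *expr;
};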

The latter, incidentally, is what I ended up doing in my godawful
LISP-on-its-way-to-ML platform for Z8 Encore! and SX48.

This would be, "of course", taken care of by a JIT: it would figure out that
a whole lot of nothing is done on constant memory addresses, and would
replace all the operations by a final load. But on a platform where the code
is statically linked on the host, there's no need for any of that, nor for a
JIT. This applies to a whole lot of hard-realtime systems where a lot of
reasoning can be made trivial by only using preallocated memory and not doing
any runtime memory allocation (or at least limiting it well).

> If you use the existing linker, then you can depend on the expertise  
> of
> the authors for each system getting all the logic right for loading
> libraries (which may be arbitrary libraries, when you're using C
> extensions) and producing a binary in the correct format for that
> system.

The "logic" present in many linkers is either pretty trivial, or is an ugly
hack for the lack of expressiveness in object-file records. Then you have
link-time optimizations, which are really trivial to do in a whole-project
compiler but require a lot of extra effort in a linker, etc.

Heck, many linkers use horrible ad-hoc algorithms with quadratic-or-worse
running time that backfire severely once the project gets sufficiently big.
Just follow the evolution of GNU ld in the face of C++. A farce in multiple
acts, at least.

Cheers, Kuba