Half Precision Floating Point
-----------------------------
July 19, 2009

People periodically mention half-precision floating point support and how it should be introduced in LLVM, e.g. by introducing a new fp16 type. An introduction to half float is here: http://en.wikipedia.org/wiki/Half_precision

Unlike other floating point formats like the ppc128 and x86 fp80 data types, "half precision floating point" is a "storage only" format. While you could do properly rounded half float arithmetic, and there are software implementations that do this, there is almost no hardware that does. It is simply more efficient to store half-floats in memory as 16-bit values, but convert them to real "float" values and do arithmetic in the hardware FPU, converting back to the 16-bit encoding when doing a store. This sort of thing is particularly useful for images on a GPU, which are large and read-only.

There is a broad variety of hardware support for half-precision floating point, including VCVT in NEON, VFP, the AMD SSE5 proposal, etc. However, all of these are of the form "convert an encoded 16-bit value to float" and "convert a float to an encoded 16-bit value".

There is some variety in how to expose this in the source language, but one common way is to provide a __fp16 data type. Like "char", operands of this type are implicitly promoted in arithmetic (char promotes to int; __fp16 promotes to float or double).

Implementing this in LLVM is really easy. Just add two intrinsics:

    i16 @llvm.convert.to.halffloat(float)
    float @llvm.convert.from.halffloat(i16)

and add a software implementation of them to compiler_rt.
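
As a concrete sketch of what that compiler_rt fallback could look like, here are the two conversion routines in C. The entry-point names __extendhfsf2 and __truncsfhf2 follow the libgcc naming convention and are an assumption here, as are the local bit-cast helpers; float-to-half rounds to nearest-even, and half-to-float is always exact.

    #include <stdint.h>
    #include <string.h>

    /* Reinterpret helpers: float <-> raw IEEE 754 bit pattern. */
    static uint32_t float_to_bits(float f) {
        uint32_t u; memcpy(&u, &f, sizeof u); return u;
    }
    static float bits_to_float(uint32_t u) {
        float f; memcpy(&f, &u, sizeof f); return f;
    }

    /* float @llvm.convert.from.halffloat(i16): widen binary16 to float.
       Half layout: 1 sign, 5 exponent (bias 15), 10 significand bits. */
    float __extendhfsf2(uint16_t h) {
        uint32_t sign = (uint32_t)(h & 0x8000) << 16;
        uint32_t exp  = (h >> 10) & 0x1F;
        uint32_t frac = h & 0x3FF;

        if (exp == 0x1F)               /* Inf or NaN: max out the exponent. */
            return bits_to_float(sign | 0x7F800000 | (frac << 13));
        if (exp == 0) {
            if (frac == 0)             /* +/- zero. */
                return bits_to_float(sign);
            /* Denormal: renormalize, since float can represent it exactly. */
            while (!(frac & 0x400)) { frac <<= 1; exp--; }
            frac &= 0x3FF; exp++;
        }
        /* Rebias exponent from 15 to 127, widen significand 10 -> 23 bits. */
        return bits_to_float(sign | ((exp + 112) << 23) | (frac << 13));
    }

    /* i16 @llvm.convert.to.halffloat(float): narrow float to binary16,
       rounding to nearest, ties to even. */
    uint16_t __truncsfhf2(float f) {
        uint32_t u    = float_to_bits(f);
        uint16_t sign = (u >> 16) & 0x8000;
        uint32_t fexp = (u >> 23) & 0xFF;          /* float exponent field */
        uint32_t frac = u & 0x7FFFFF;
        int32_t  exp  = (int32_t)fexp - 127 + 15;  /* rebias to half */

        if (fexp == 0xFF)              /* Inf or NaN (keep NaN nonzero). */
            return sign | 0x7C00 | (frac ? 0x0200 | (frac >> 13) : 0);
        if (exp >= 0x1F)               /* Too large for half: overflow to Inf. */
            return sign | 0x7C00;
        if (exp <= 0) {                /* Half denormal or zero. */
            if (exp < -10)             /* Below half of the smallest denormal. */
                return sign;
            uint32_t sig   = frac | 0x800000;   /* restore implicit leading 1 */
            int      shift = 14 - exp;          /* 14..24 */
            uint16_t h     = sig >> shift;
            uint32_t rest  = sig & ((1u << shift) - 1);
            uint32_t half  = 1u << (shift - 1);
            if (rest > half || (rest == half && (h & 1)))
                h++;                   /* may carry into the normal range */
            return sign | h;
        }
        /* Normal: drop 13 significand bits with round-to-nearest-even.
           A carry out of the significand correctly bumps the exponent,
           and can roll 0x7BFF up to 0x7C00 (infinity). */
        uint16_t h = sign | (exp << 10) | (frac >> 13);
        uint32_t rest = frac & 0x1FFF;
        if (rest > 0x1000 || (rest == 0x1000 && (h & 1)))
            h++;
        return h;
    }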
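
Tying this back to the __fp16 promotion model above: since the type is storage-only, what a frontend emits for arithmetic on __fp16 values is a pair of widening conversions, the operation in float, and a narrowing conversion on the store. In terms of the sketch above (hypothetical names), "h3 = h1 + h2" means roughly:

    /* Arithmetic happens in float; only the storage is 16 bits wide. */
    uint16_t h3 = __truncsfhf2(__extendhfsf2(h1) + __extendhfsf2(h2));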