OCaml boxes pretty much everything that isn't an integer or already a pointer (though float arrays and records whose fields are all floats store the floats inline, so those don't suffer an additional level of boxing).
The result is that any type can be represented in a single machine word, making it easy to support polymorphism without specializing code (allowing, e.g., for shared generics).
Keep in mind that OCaml's backend was designed decades ago when machines were much smaller, at a time when C++ was still reluctant to add generics with specialization because of code bloat.
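To make the one-word representation concrete, here is a small sketch. It assumes the standard native or bytecode compiler; `pair` is a made-up name for illustration, and `Obj` is used only to peek at the runtime representation, not something to rely on in real code.

```ocaml
(* One compiled body serves every instantiation, because each argument
   arrives as a single machine word (an immediate or a pointer). *)
let pair (x : 'a) : 'a * 'a = (x, x)

let ints = pair 42          (* int * int *)
let floats = pair 3.14      (* float * float *)
let strings = pair "boxed"  (* string * string *)

(* Inspect the representation. Sys.opaque_identity only stops the
   optimizer from treating the argument as a known constant. *)
let int_is_immediate = Obj.is_int (Obj.repr (Sys.opaque_identity 42))

let float_is_boxed =
  let r = Obj.repr (Sys.opaque_identity 3.14) in
  Obj.is_block r && Obj.tag r = Obj.double_tag
```

The int travels inside the word itself, while the float is a pointer to a heap block tagged `Obj.double_tag`. That pointer is what lets `pair` handle a float without a specialized copy.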
OCaml's GC uses bit 0 of each machine word to tell pointers from immediate values: bit 0 is clear when the word holds an address the GC may follow, and set when it holds an integer. Hence, on a 64-bit machine integers have only 63 bits for the value, with bit 0 saying that this is not a GC-able address. IEEE doubles, however, need all 64 bits. You can't chop off bit 0 for a different use.
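A quick way to see both halves of that argument, as a sketch using only the standard library (the binding names are made up for illustration):

```ocaml
(* One bit of every word is the tag, so an int is one bit narrower than
   the machine word: 63 bits on a 64-bit system, where max_int is
   4611686018427387903 (2^62 - 1). *)
let one_bit_short = (Sys.int_size = Sys.word_size - 1)

(* Overflow wraps around at the narrower width. *)
let wraps = (max_int + 1 = min_int)

(* A double uses all 64 bits of its encoding, including bit 0. *)
let bits_of_one = Int64.bits_of_float 1.0

(* Setting only bit 0 of 1.0's encoding yields a different float: the
   next representable double after 1.0. *)
let next_after_one = Int64.float_of_bits (Int64.logor bits_of_one 1L)

let bit0_matters =
  next_after_one <> 1.0 && next_after_one = 1.0 +. epsilon_float
```

Since bit 0 of a double is the last bit of the mantissa, stealing it for a tag would throw away precision and break IEEE semantics.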
Let me summarize: academic language, it's easy for the GC that way, and "if you thought that float boxing would kill performance you wouldn't have started down this path".
AFAIK flambda can do some float unboxing, and float unboxing stays possible as long as the value is not used generically (i.e. passed to polymorphic code). And no, OCaml is not "academic" but a very pragmatic language.
To allow IEEE-compliant floats alongside tagged pointers.
However, note that floats are not boxed in OCaml's arrays. Boxing floats is a pain because it makes defining subroutines that take float arguments expensive.
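A sketch of both points, assuming a compiler built with the default flat float array layout (a switch configured with `--disable-flat-float-array` would report a different array tag). The types `point` and `labeled` and the functions `scale` and `sum_scaled` are hypothetical names for illustration.

```ocaml
type point = { x : float; y : float }            (* all-float record *)
type labeled = { name : string; value : float }  (* mixed record *)

let arr = Sys.opaque_identity [| 1.0; 2.0; 3.0 |]
let p = Sys.opaque_identity { x = 1.0; y = 2.0 }
let l = Sys.opaque_identity { name = "a"; value = 1.0 }

(* Float arrays and all-float records hold raw doubles inline. *)
let array_is_flat = Obj.tag (Obj.repr arr) = Obj.double_array_tag
let record_is_flat = Obj.tag (Obj.repr p) = Obj.double_array_tag

(* In a mixed record the float field is a pointer to a boxed double. *)
let mixed_field_is_boxed =
  let field = Obj.field (Obj.repr l) 1 in
  Obj.is_block field && Obj.tag field = Obj.double_tag

(* [@inline never] keeps this a real call. Without inlining, a typical
   native build passes the float arguments and the result as boxed
   values, so each call can cost an allocation. *)
let[@inline never] scale (k : float) (v : float) : float = k *. v

let sum_scaled (k : float) (a : float array) : float =
  let acc = ref 0.0 in
  Array.iter (fun v -> acc := !acc +. scale k v) a;
  !acc
```

Reading `arr.(i)` gives a raw double, but handing it to a non-inlined function such as `scale` generally means boxing it first. That is the cost being complained about, and inlining or flambda can remove it in some cases.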