It works like this: hard-float means floating-point instructions get compiled straight into the binary, no matter whether your CPU has an FPU or not. On a CPU without one, stumbling over such an instruction raises an exception; the kernel steps in, emulates the instruction (in place of a "real" FPU) and hands the result back to the application. This is slow, because every single floating-point instruction costs a trap into the kernel.
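soft-float more or less means that gcc and its runtime libraries (libgcc, together with glibc) work together: at compile time, gcc replaces each floating-point operation with a call to the appropriate emulation routine, which then runs as ordinary integer code in user space. This is faster because no exception and no trip through the kernel are involved.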
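
To make the soft-float idea a bit more concrete, here is a rough sketch in C of what such an emulation routine does: a single-precision multiply done with integer operations only. The names `soft_mul`, `float_bits` and `bits_float` are made up for illustration; the real routine in libgcc is called `__mulsf3` and is far more complete. The sketch assumes `float` is IEEE 754 binary32, handles only normal finite numbers, and truncates instead of rounding properly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern and back (no FP math involved). */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Hypothetical stand-in for a soft-float multiply routine.
 * Assumptions: IEEE 754 binary32 floats, normal finite inputs and result,
 * truncation instead of round-to-nearest. */
float soft_mul(float a, float b)
{
    uint32_t ua = float_bits(a), ub = float_bits(b);

    /* The sign of the product is the XOR of the two sign bits. */
    uint32_t sign = (ua ^ ub) & 0x80000000u;

    /* Add the biased exponents and remove one bias (127). */
    int32_t exp = (int32_t)((ua >> 23) & 0xFF) + (int32_t)((ub >> 23) & 0xFF) - 127;

    /* 24-bit significands with the implicit leading 1 restored. */
    uint64_t ma = (ua & 0x7FFFFFu) | 0x800000u;
    uint64_t mb = (ub & 0x7FFFFFu) | 0x800000u;

    /* 48-bit product; its top set bit is either bit 46 or bit 47. */
    uint64_t prod = ma * mb;
    if (prod & (1ULL << 47)) {
        prod >>= 24;   /* significand product was >= 2.0: renormalise */
        exp += 1;
    } else {
        prod >>= 23;
    }

    /* This sketch does not handle overflow, underflow, zero, Inf or NaN. */
    assert(exp > 0 && exp < 255);

    return bits_float(sign | ((uint32_t)exp << 23) | ((uint32_t)prod & 0x7FFFFFu));
}
```

With a soft-float build, a plain `a * b` on floats in your source ends up as a call to a routine of this kind, so the CPU only ever sees integer instructions and ordinary function calls.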