Author Topic: Software floating point as a library versus kernel emulation (Read 4780 times)

ikm · « **on:** April 27, 2004, 01:32:47 pm »

We all know that Z has no floating point unit. Therefore, all floating point operations have to be emulated.

The current pdaxrom kernel still uses Netwinder floating point emulation. It works fine and all, but it is quite slow. While there is an option to use an alternative in-kernel floating point emulation, which is faster but doesn't always work, there is a better way to cope with floating point issues.

It is possible to tell gcc to use library functions to emulate floating point operations, rather than emitting floating point instructions which are then emulated by the kernel. This approach would be much faster. The relevant gcc option is '-msoft-float'.
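To make the difference concrete, here is a minimal sketch of the kind of code affected. The function name `fp_kernel` is hypothetical, and the helper names in the comment (`__muldf3`, `__adddf3`) are examples of gcc's soft-float library routines, not something taken from the pdaXrom toolchain.

```c
#include <stddef.h>

/*
 * Hypothetical FP-heavy kernel: sums i * 0.5 for i in [0, n).
 *
 * Built without -msoft-float, the compiler emits floating point
 * instructions for the multiply and the add; on a CPU with no FPU
 * each one traps and is emulated by the kernel.
 *
 * Built with -msoft-float, the same operations become ordinary
 * calls to library routines (for example __muldf3 and __adddf3),
 * so no trap into the kernel is needed.
 */
double fp_kernel(size_t n)
{
    double acc = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        acc += (double)i * 0.5;

    return acc;
}
```

Calling `fp_kernel` with a large `n` from a small `main` and timing a build with and without '-msoft-float' would be one way to compare the two approaches, assuming the toolchain can link both variants.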

Unfortunately, when I try using it with the SDK, gcc complains about crtn.o / crtend.o using hardware FP, while the program itself uses software FP. It looks like the current environment needs to be reconfigured and recompiled by the pdaXrom team to make this work.

As for whether it really works or not: it does. Once upon a time I had a native Debian running on my Zaurus. I tried that trick there with a simple program and got a 2x floating point speedup. The '-msoft-float' option just worked there. The only thing it needed was the 'libfloat' library (it was apt-get'able).

Unfortunately, it doesn't seem to work in pdaXrom. It would be very good to move to the in-library software floating point entirely, as it is the most natural way of execution, and, most notably, it is significantly faster.

This is a call to pdaXrom developers. I would very much like you people to look at this issue. This may be a great improvement to the much beloved pdaXrom. That would help us all.

///

p.s. I have since abandoned all attempts to get native Debian working acceptably on the Z -- the main issue was the poor X server. I held my breath when I found the pdaXrom project. Now I have switched to it entirely and am enjoying it. I would just very much like to see floating point support improved.

Zazz · « **Reply #1 on:** May 12, 2004, 12:55:33 pm »

I looked a little into the matter. It seems mainly a gcc2 vs gcc3 problem. gcc-2.95.4 works fine with -msoft-float and seems to be able to mix and match between soft and hard fp emulation without problems. I was not able to get -msoft-float to work under gcc-3.x.x, any version, with any set of configure options (compiles but does not link). OTOH, gcc-2.95.4 is not such a bad compiler, so it would make sense to keep it around and use it with -msoft-float for those fp intensive tasks.

The following is for gcc-2.95.4, native or cross, with a libfloat.a compiled from debian sources and a libm called libmsoft.a compiled with -msoft-float from standard glibc-2.2.5 sources. Compiling libc or parts thereof is of course no fun, and I was not yet able to get a shared libmsoft.so (which would be desirable since libm is huge, e.g. doubles gnuplot's binary size).

Note especially how gnuplot.hard wastes all the time in sys kernel fp emulation while gnuplot.soft does things so much faster in user space. Note also the slightly lower accuracy of the soft fpe which seems an acceptable trade-off for many applications.
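The user/sys split that `time` reports can also be read from inside a program. The following is a minimal sketch using the POSIX `getrusage` call; the function names `cpu_user_seconds`, `cpu_sys_seconds` and `sys_share` are hypothetical helpers introduced for illustration only.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: convert a struct timeval to seconds. */
static double tv_to_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* CPU seconds this process has spent in user space, or -1.0 on error. */
double cpu_user_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return tv_to_seconds(ru.ru_utime);
}

/* CPU seconds the kernel has spent on behalf of this process,
 * or -1.0 on error. Kernel FP emulation is accounted here. */
double cpu_sys_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return tv_to_seconds(ru.ru_stime);
}

/* Fraction of total CPU time spent in the kernel (0.0 if none recorded). */
double sys_share(double user, double sys)
{
    double total = user + sys;

    return total > 0.0 ? sys / total : 0.0;
}
```

Applied to the gnuplot timings in this post, `sys_share(6.030, 51.000)` comes out at roughly 0.89 for gnuplot.hard, while `sys_share(12.270, 0.090)` for gnuplot.soft is below 0.01.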

Does anyone have any idea how to get things to work in gcc3 (without recompiling the complete libc and thereby losing binary compatibility to existing apps)?

root@zaurus(pts1):/tmp# head -3 whetstone.c
/*
* C Converted Whetstone Double Precision Benchmark
* Version 1.2 22 March 1998
root@zaurus(pts1):/tmp# gcc-2.95 -mhard-float -O2 whetstone.c -o whetstone.hard -lm && ./whetstone.hard

Loops: 1000, Iterations: 1, Duration: 84 sec.
C Converted Double Precision Whetstones: 1.2 MIPS
root@zaurus(pts1):/tmp# gcc-2.95 -msoft-float -O2 whetstone.c -o whetstone.soft -lmsoft && ./whetstone.soft

Loops: 1000, Iterations: 1, Duration: 47 sec.
C Converted Double Precision Whetstones: 2.1 MIPS

root@zaurus(pts1):/tmp# cat test.gp
#!/usr/bin/gnuplot
set term png; set out "test.png"
set data sty lin
set xr [-3:3]; set yr [-3:3]
set samp 41; set isosamp 41
set hid
set cont base
set cntrpar lev inc -1,0.25,1
set ticsl 1.5
splot sin(x)*cos(y)
root@zaurus(pts1):/tmp# time ./gnuplot.hard test.gp

real 0m57.047s
user 0m6.030s
sys 0m51.000s
root@zaurus(pts1):/tmp# time ./gnuplot.soft test.gp

real 0m12.365s
user 0m12.270s
sys 0m0.090s
root@zaurus(pts1):/tmp# echo 'print 2.0/3.0' | ./gnuplot.hard
0.666666666666667
root@zaurus(pts1):/tmp# echo 'print 2.0/3.0' | ./gnuplot.soft
0.666666666666717
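The two printed results differ from the 13th decimal place on. As a rough way to quantify that, here is a small sketch that computes a relative error; `rel_error` is a hypothetical helper, and treating the gnuplot.hard value as the reference is an assumption made only for this comparison.

```c
/*
 * Hypothetical helper: relative error of `approx` against `reference`.
 * `reference` must be nonzero. Absolute values are taken by hand so
 * the snippet does not depend on libm.
 */
double rel_error(double approx, double reference)
{
    double diff = approx - reference;

    if (diff < 0.0)
        diff = -diff;
    if (reference < 0.0)
        reference = -reference;

    return diff / reference;
}
```

For the values above, `rel_error(0.666666666666717, 0.666666666666667)` is a little under 1e-13, which is small but well above what double precision normally delivers for a single division.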

ikm · « **Reply #2 on:** May 12, 2004, 03:15:38 pm »

Thanks for the investigation, Zazz. Now I recall I was using 2.95.4 on Debian, so that's right, it seems to be a 3.x problem. I tried to find a quick solution before I posted the original message and googled for quite some time, but failed. If the problem arises from differences in calling conventions, it would seem that the whole thing really does need to be recompiled, but I'm not sure about this... I really do not know how floating point parameters are passed, e.g. whether they can be passed in FPU registers.

Btw, speaking of recompiling libc, is it really such a major issue? I mean, about 95% of all software for pdaxrom is currently in one place, in the feed. It can all be recompiled, I guess. In any case, if recompilation is really needed, it is better to do it now than to put it off until it's just way too hard.

Zazz · « **Reply #3 on:** May 12, 2004, 03:39:33 pm »

Recompiling libc is not the issue, losing binary compatibility is. Generic arm binaries would no longer run on the new system. Under gcc-2.95, libfloat.a and libmsoft.a are drop-in replacements into an otherwise completely unaltered and compatible setup. Why should it be different for gcc-3.x? We must be missing something here... :?
