Hello zap,
Please also try testing atty's build without the '-lavdopts idct=16' option (it forces the armv5te-optimized idct from ffmpeg, but atty's build should be able to use a more efficient iwmmxt-optimized idct from IPP).
Anyway, as already mentioned in this thread, there is something wrong with mplayer running on Zaurus devices (or devices with an XScale core). For example, even the Nokia 770 with its 252MHz ARM9E cpu appears to be faster than the Zaurus when playing this matrix video clip (decoding it takes ~158 seconds). Intuitively it should be quite the opposite: the Zaurus has a much higher cpu clock frequency and supports iwmmxt SIMD instructions in addition to armv5te.
TCPMP might be an interesting option (for somebody else to try), but I'm satisfied with mplayer/ffmpeg on Nokia 770 and N800 at the moment. Translating mplayer performance on Nokia 770 to 'TCPMP percents', it would be something like 118%, and if we scale that linearly with clock frequency to estimate how it would theoretically run at 624MHz, that would be ~290%. I know that this approximation is wrong, as memory speed also matters a lot, but anyway, it looks like TCPMP and ffmpeg should provide at least comparable performance.
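For anyone who wants to redo the estimate with other numbers, here is a rough sketch of the arithmetic. It assumes performance scales only with cpu clock (which, as said, ignores memory speed), and the function name is just something made up for illustration:

```python
# Naive linear clock scaling of a TCPMP-style percentage.
# Assumption: performance depends only on cpu clock; memory speed is ignored.

def scale_by_clock(percent, from_mhz, to_mhz):
    """Linearly extrapolate a benchmark percentage to another clock frequency."""
    return percent * to_mhz / from_mhz

nokia_770_percent = 118.0  # mplayer on Nokia 770, in 'TCPMP percents'
nokia_770_mhz = 252.0      # Nokia 770 cpu clock
target_mhz = 624.0         # clock used for the theoretical estimate

estimate = scale_by_clock(nokia_770_percent, nokia_770_mhz, target_mhz)
print(f"{estimate:.0f}%")  # prints 292%
```

That gives about 292%, which is where the ~290% figure comes from.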
In order to get optimal mplayer performance on the Zaurus, somebody just needs to profile it there (doing it with gprof is quite simple), find the performance bottlenecks and try to fix them. I might have a look at what's wrong if I got an XScale device to experiment with (I had plans to buy a Motorola EZX phone, A1200 or E6, but these plans are on hold now).