The stuff I was talking about was ARM-specific, not XScale-specific; however, I am not sure if gcc can optimise to that level.
What I like about the XScales is that they sacrificed some power saving for raw speed. Word on the net is that the other ARM manufacturers aren't happy with this, which is the reason behind the ARM11 chips: ARM tried to release a chip to stay competitive with Intel.
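Some other cool stuff from the ARM chips is the 16 registers (R0 to R15) that can also act as index registers, or even the PC if you wanted. In fact R15 is the PC, which allows you to add 16 to R15 and jump ahead 16 bytes, i.e. four 32-bit instructions (plus the usual offset, since in ARM state the PC reads as 8 bytes ahead of the current instruction)!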
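To make the PC arithmetic concrete, here is a rough Python sketch of where an `add pc, pc, #imm` would land in ARM state. The function names are made up for illustration, and it assumes classic 32-bit ARM mode (4-byte instructions, PC reading as the current instruction address plus 8), not Thumb.

```python
ARM_INSN_BYTES = 4   # every ARM-state instruction is 4 bytes wide
PC_READ_OFFSET = 8   # in ARM state, reading the PC yields current address + 8


def add_pc_target(insn_addr, imm):
    """Address reached by a hypothetical `add pc, pc, #imm` located at insn_addr."""
    return (insn_addr + PC_READ_OFFSET + imm) & 0xFFFFFFFF


def instructions_ahead(insn_addr, imm):
    """How many instructions past the add itself execution resumes."""
    return (add_pc_target(insn_addr, imm) - insn_addr) // ARM_INSN_BYTES


if __name__ == "__main__":
    addr = 0x8000
    print(hex(add_pc_target(addr, 16)))   # target address of the jump
    print(instructions_ahead(addr, 16))   # instruction slots skipped forward
```

So adding 16 moves the PC by 16 bytes, but because of the read offset the landing spot is 24 bytes (six instruction slots) past the add itself.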
What Intel has really done is polish the entire ARM package by adding insane amounts of cache compared to the standard chip and heavy pipelining to bump up the clock speeds. In fact Intel even tests the low-power tech on the XScales before it goes in the P4s, and as far as I know the XScales are the only ARM chips manufactured at 90nm.
The iWMMXt (Wireless MMX) is nice as well: all the MMX instructions AND the integer SSE ones too (not sure how many there were though).
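For anyone who hasn't played with MMX-style instructions, here is a small Python sketch of what one of them does: a packed unsigned saturating byte add, modelled on MMX's PADDUSB. It is only an illustration of the idea on plain 64-bit integers, not iWMMXt code, and the function name is just borrowed from the x86 mnemonic.

```python
def paddusb(a, b):
    """Packed unsigned saturating add of eight bytes held in 64-bit ints."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        x = (a >> shift) & 0xFF          # extract byte lane from a
        y = (b >> shift) & 0xFF          # extract byte lane from b
        result |= min(x + y, 0xFF) << shift  # clamp at 255 instead of wrapping
    return result


if __name__ == "__main__":
    print(hex(paddusb(0x01FF, 0x0101)))
```

The point is that one instruction handles all eight lanes at once and clamps each lane separately, which is what makes it handy for pixel and audio work.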
Not sure about the optimisation; I don't think so, but I am currently trying to find out. I believe it's because the gcc toolchain is so closely aligned with the x86 arch that they might not think of something like this, however I might be wrong.
Hopefully I should get the CPUs up some time next year, but if you are looking for real power there is another Intel chip that's been doing the CELL mini-CPU thing for a couple of years now. It's a 1.1GHz XScale core with 4 900MHz mini CPUs, and it supports up to 12GB of RAM in total and PCI-e. See:
http://www.intel.com/design/network/produc...ily/ixp2350.htm
I will have to look into the ENIAC chip; was it a VLIW CPU?
I've gone off topic already; if you want more info, email me.