Mplayer Development And Optimization For Arm 
Dec 5 2006, 02:43 PM
Post
#1


Group: Members Posts: 51 Joined: 8 October 06 Member No.: 11,724
It is probably a good idea to consolidate efforts and try to submit some of the useful ARM-related patches upstream:
http://lists.mplayerhq.hu/pipermail/ffmpeg...ust/014460.html
http://lists.mplayerhq.hu/pipermail/mplaye...ber/046207.html

I can only test MPlayer on the Nokia 770, so I can't be sure whether optimizations specific to ARM9E (the core used in the Nokia 770) are also good for the Zaurus. People who are able to compile MPlayer from source and test it on a Zaurus are welcome in this thread. One example is the new armv5te-optimized idct in MPlayer 1.0rc1: can anybody benchmark it on a Zaurus?

This is not strictly related to the ARM architecture, but the libmad-based decoder in MPlayer seems to have trouble with variable-bitrate audio (it loses sync with the video). Some more details can be found here http://lists.mplayerhq.hu/pipermail/mplaye...ust/045017.html and in the follow-up messages. Is there any volunteer to investigate this problem?

All in all, the ffmpeg optimizations for ARM are not nearly as good as those for x86, so investing some time in them may yield a performance improvement.


Dec 27 2006, 12:36 AM
Post
#2


Group: Members Posts: 682 Joined: 26 December 05 From: Rochdale, Lancashire Member No.: 8,789
Hi Serge!
I ran a bunch of benchmark tests on a Zaurus C3000 running pdaXii13 build4 full, which includes Meanie's build of mplayer 1.0rc1 (he has named the binary mplayer3). I used the same Doom divx clip that you linked in all the tests, with the same command you used.

For these first four sets of benchmarks the Z was running at the standard 416MHz setting and the commands were run in an X11 terminal:

idct7:
BENCHMARKs: VC: 58.484s VO: 0.088s A: 0.000s Sys: 2.460s = 61.032s
BENCHMARKs: VC: 57.614s VO: 0.070s A: 0.000s Sys: 0.848s = 58.531s
BENCHMARKs: VC: 57.865s VO: 0.075s A: 0.000s Sys: 0.842s = 58.781s
BENCHMARKs: VC: 57.753s VO: 0.078s A: 0.000s Sys: 0.851s = 58.682s
BENCHMARKs: VC: 57.837s VO: 0.074s A: 0.000s Sys: 0.835s = 58.746s

idct10:
BENCHMARKs: VC: 59.045s VO: 0.072s A: 0.000s Sys: 2.366s = 61.483s
BENCHMARKs: VC: 59.071s VO: 0.070s A: 0.000s Sys: 0.989s = 60.130s
BENCHMARKs: VC: 59.188s VO: 0.071s A: 0.000s Sys: 0.859s = 60.118s
BENCHMARKs: VC: 59.163s VO: 0.071s A: 0.000s Sys: 0.855s = 60.089s
BENCHMARKs: VC: 59.157s VO: 0.070s A: 0.000s Sys: 0.838s = 60.065s

idct16:
BENCHMARKs: VC: 54.462s VO: 0.124s A: 0.000s Sys: 2.615s = 57.201s
BENCHMARKs: VC: 57.047s VO: 0.078s A: 0.000s Sys: 2.020s = 59.145s
BENCHMARKs: VC: 56.930s VO: 0.072s A: 0.000s Sys: 1.586s = 58.588s
BENCHMARKs: VC: 53.739s VO: 0.072s A: 0.000s Sys: 0.859s = 54.670s
BENCHMARKs: VC: 53.948s VO: 0.070s A: 0.000s Sys: 1.672s = 55.690s

idct2:
BENCHMARKs: VC: 59.714s VO: 0.070s A: 0.000s Sys: 2.524s = 62.308s
BENCHMARKs: VC: 61.109s VO: 0.074s A: 0.000s Sys: 1.822s = 63.005s
BENCHMARKs: VC: 60.556s VO: 0.071s A: 0.000s Sys: 0.879s = 61.506s
BENCHMARKs: VC: 60.216s VO: 0.070s A: 0.000s Sys: 0.847s = 61.133s
BENCHMARKs: VC: 60.157s VO: 0.070s A: 0.000s Sys: 0.898s = 61.125s

For the next four sets of benchmarks I overclocked to 624MHz, quit out of X11 and ran the command from the console for maximum performance:

idct7:
BENCHMARKs: VC: 37.560s VO: 0.072s A: 0.000s Sys: 2.349s = 39.981s
BENCHMARKs: VC: 38.063s VO: 0.049s A: 0.000s Sys: 0.561s = 38.673s
BENCHMARKs: VC: 38.066s VO: 0.050s A: 0.000s Sys: 0.563s = 38.679s
BENCHMARKs: VC: 38.078s VO: 0.050s A: 0.000s Sys: 0.560s = 38.688s
BENCHMARKs: VC: 38.081s VO: 0.050s A: 0.000s Sys: 0.559s = 38.690s

idct10:
BENCHMARKs: VC: 36.988s VO: 0.050s A: 0.000s Sys: 0.562s = 37.600s
BENCHMARKs: VC: 38.759s VO: 0.049s A: 0.000s Sys: 0.559s = 39.368s
BENCHMARKs: VC: 38.770s VO: 0.050s A: 0.000s Sys: 0.563s = 39.382s
BENCHMARKs: VC: 38.718s VO: 0.050s A: 0.000s Sys: 0.560s = 39.328s
BENCHMARKs: VC: 38.736s VO: 0.049s A: 0.000s Sys: 0.559s = 39.344s

idct16:
BENCHMARKs: VC: 33.716s VO: 0.050s A: 0.000s Sys: 0.567s = 34.333s
BENCHMARKs: VC: 35.310s VO: 0.049s A: 0.000s Sys: 0.559s = 35.919s
BENCHMARKs: VC: 35.401s VO: 0.050s A: 0.000s Sys: 0.563s = 36.014s
BENCHMARKs: VC: 35.281s VO: 0.050s A: 0.000s Sys: 0.560s = 35.891s
BENCHMARKs: VC: 35.354s VO: 0.049s A: 0.000s Sys: 0.559s = 35.962s

idct2:
BENCHMARKs: VC: 37.474s VO: 0.050s A: 0.000s Sys: 0.565s = 38.088s
BENCHMARKs: VC: 39.184s VO: 0.049s A: 0.000s Sys: 0.560s = 39.793s
BENCHMARKs: VC: 39.344s VO: 0.050s A: 0.000s Sys: 0.564s = 39.957s
BENCHMARKs: VC: 39.183s VO: 0.050s A: 0.000s Sys: 0.560s = 39.793s
BENCHMARKs: VC: 39.253s VO: 0.049s A: 0.000s Sys: 0.560s = 39.863s

So, just as on the 770, it would seem idct16 is clearly the fastest.
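For anyone who wants to compare runs without eyeballing the numbers, here is a rough sketch of how the BENCHMARKs lines could be averaged per idct setting. It is only an illustration: the helper name `mean_times` and the list names are made up for this example, and the regular expression assumes the line layout quoted in this post. The sample data is the 624MHz idct16 and idct7 runs copied from above.

```python
import re
from statistics import mean

# Assumed layout of the summary line, as quoted in this thread:
# "BENCHMARKs: VC: 33.716s VO: 0.050s A: 0.000s Sys: 0.567s = 34.333s"
LINE_RE = re.compile(
    r"BENCHMARKs:\s*VC:\s*([\d.]+)s\s*VO:\s*([\d.]+)s\s*"
    r"A:\s*([\d.]+)s\s*Sys:\s*([\d.]+)s\s*=\s*([\d.]+)s"
)


def mean_times(lines):
    """Return (mean VC seconds, mean total seconds) over all matching lines.

    Hypothetical helper for this example, not part of MPlayer.
    """
    matches = [m for m in (LINE_RE.search(line) for line in lines) if m]
    if not matches:
        raise ValueError("no BENCHMARKs lines found")
    vc = mean(float(m.group(1)) for m in matches)
    total = mean(float(m.group(5)) for m in matches)
    return vc, total


# 624MHz console runs, copied from the post.
IDCT16_624 = [
    "BENCHMARKs: VC: 33.716s VO: 0.050s A: 0.000s Sys: 0.567s = 34.333s",
    "BENCHMARKs: VC: 35.310s VO: 0.049s A: 0.000s Sys: 0.559s = 35.919s",
    "BENCHMARKs: VC: 35.401s VO: 0.050s A: 0.000s Sys: 0.563s = 36.014s",
    "BENCHMARKs: VC: 35.281s VO: 0.050s A: 0.000s Sys: 0.560s = 35.891s",
    "BENCHMARKs: VC: 35.354s VO: 0.049s A: 0.000s Sys: 0.559s = 35.962s",
]
IDCT7_624 = [
    "BENCHMARKs: VC: 37.560s VO: 0.072s A: 0.000s Sys: 2.349s = 39.981s",
    "BENCHMARKs: VC: 38.063s VO: 0.049s A: 0.000s Sys: 0.561s = 38.673s",
    "BENCHMARKs: VC: 38.066s VO: 0.050s A: 0.000s Sys: 0.563s = 38.679s",
    "BENCHMARKs: VC: 38.078s VO: 0.050s A: 0.000s Sys: 0.560s = 38.688s",
    "BENCHMARKs: VC: 38.081s VO: 0.050s A: 0.000s Sys: 0.559s = 38.690s",
]

if __name__ == "__main__":
    for name, runs in (("idct16", IDCT16_624), ("idct7", IDCT7_624)):
        vc, total = mean_times(runs)
        print(f"{name}: mean VC {vc:.3f}s, mean total {total:.3f}s")
```

The first run in several sets has a noticeably higher Sys time than the later runs, so it may be worth dropping it or looking at the median instead of the mean before drawing conclusions.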

