Messages - tjchick

1
Linux Applications / Mplayer Development And Optimization For Arm
« on: November 13, 2007, 05:38:08 pm »
Just a quick update from me, mostly of interest to the angstrom people...

You may remember I hacked mplayer/ffmpeg to actually use iwmmxt rather than just compiling them.

I got VC times of approximately 43 seconds for the doom clip running on angstrom.

Now I am using *the same binary* and get VC times of 37 seconds on the latest Angstrom test images, so something has changed, maybe cache support or iwmmxt support in the kernel. Anyhow, my results are now about the same as my tests on cacko with atty's mplayer.

If I use the default mplayer included in the angstrom iwmmxt feeds, I see VC of 52 seconds. I'm going to take a look, and try the svn version.

Cheers,
Tim

2
Linux Applications / Mplayer Development And Optimization For Arm
« on: March 23, 2007, 06:00:21 pm »
Quote
You can try to override idct by using '-lavdopts idct=<some_number>'  in atty's build and test it. After getting the numbers we can see if it is really IPP that matters, or maybe atty's build has some other optimizations.

I did try it, and using the non-IPP IDCT produces results which are roughly comparable. atty's mplayer is still faster by 10% or so, so there are still a few more tweaks I need to sort out, but it was 40% faster when using IPP.

Cheers,
Tim

3
Linux Applications / Mplayer Development And Optimization For Arm
« on: March 21, 2007, 11:10:00 am »
On cacko on c1000, I see:
VC: 36.186
VC: 36.927
VC: 37.662
VC: 36.932
VC: 37.016

And similar figures for sys. Cacko uses atty's mplayer, which still seems to be the best by quite a margin!

At a guess this is due to IPP for IDCT.

Thanks,
Tim

4
Linux Applications / Mplayer Development And Optimization For Arm
« on: March 15, 2007, 04:05:03 pm »
Quote
Quote
Yes, IWMMX needs OS support, as well as having the right processor. Unfortunately I (and others) cannot find a simple, portable method for detecting this. So the only option is to try to use iwmmxt if it is compiled in - you need to turn on compile switches to get it.
That's probably fine. By the way, you can also try to compile MPlayer with the use of Intel IPP (Integrated Performance Primitives) library and check if it helps to improve performance.
I think it does, as I know the cacko mplayer-atty is faster again than "mine", and that uses the IPP stuff for idct. I was not really interested in trying it though, due to the license restrictions of IPP.

Quote
Quote
I also noted one more thing - the iwmmxt code does not provide the h263_inter function, so I changed ffmpeg to use the armv5 version. This provided a small speed increase.
This should not be a problem as dct_unquantize_h263_inter is not a performance critical function. But it is pretty much similar to dct_unquantize_h263_intra (which consumes a noticeable amount of decoding time, something like ~7%), so implementing it was quite easy. You can see some gprof output with the statistics about decoding this Doom video clip on Nokia 770:


One thing I'm going to do is compare the iwmmxt code against your armv5te code, performance-wise.

Cheers,
Tim

5
General Discussion / Cacko.oesf.org Broken?
« on: March 15, 2007, 11:05:37 am »
Quote
this is the latest i have...

Thanks! At some point I might try to get this running on OZ.

Thanks,
Tim

6
Linux Applications / Mplayer Development And Optimization For Arm
« on: March 15, 2007, 05:51:04 am »
Quote from: Serge,Mar 14 2007, 06:32 PM
Thanks for the detailed explanation, it clarifies the current situation a lot. When I submitted ARMv5TE instructions support for MPlayer configure, I could not verify that IWMMXT works as well (for an obvious reason, I don't have any device that supports IWMMXT): http://lists.mplayerhq.hu/pipermail/mplaye...ber/046537.html

Please check the latest MPlayer SVN just as Meanie suggested, and if it still has problems with enabling iwmmxt, please try to make a clean fix and submit this patch upstream.

I already did this stuff yesterday, before I saw your messages. Yes Meanie, even the latest SVN does not fix matters. I posted a patch to the ffmpeg dev mailing list, got some feedback and posted another patch. I am awaiting the response.

Quote
If you check the first post in this thread, you will see that upstream developers are not very familiar with ARM platform. Only atty did some improvements for MPlayer at some time in the past, but he is unwilling to help upstream to integrate his fixes for whatever reason. So it is up to us (and you as well) to work on improving ARM support in MPlayer (including IWMMXT support). Nobody else can do this job. And upstream developers are not obliged to fix our problems.

PS. I'm sorry if it was me who created a false impression of IWMMXT being fully supported in MPlayer 1.0.rc1

edit: IWMMX has some additional registers, so their save/restore on context switches should be probably supported by the kernel? Maybe these extra checks in mplayer are there to ensure that it is safe to use iwmmxt even though cpu itself may support them? Anyway that was just a wild guess, I'm not familiar with XScale at all.

And thanks for actually digging into the code and checking if iwmmxt really works, the results posted in this thread were suspicious from the very start
Yes, IWMMX needs OS support, as well as having the right processor. Unfortunately I (and others) cannot find a simple, portable method for detecting this. So the only option is to try to use iwmmxt if it is compiled in - you need to turn on compile switches to get it.

I also noted one more thing - the iwmmxt code does not provide the h263_inter function, so I changed ffmpeg to use the armv5 version. This provided a small speed increase. So either the version which was in use was pretty good (be warned - it is easy to spend a lot of time writing ARM assembler which is *worse* than the compiler output), or the system is memory bound as others have suggested. It might be worth looking at joining together more of the reads and writes if possible (the system uses SDRAM, so the performance for single words sucks compared to 2 words etc., in the case of an overstretched cache).

Here are the new results:
BENCHMARKs: VC:  43.497s
BENCHMARKs: VC:  42.813s
BENCHMARKs: VC:  43.040s
BENCHMARKs: VC:  43.269s
BENCHMARKs: VC:  43.090s

Thanks,
Tim

7
Linux Applications / Mplayer Development And Optimization For Arm
« on: March 14, 2007, 12:29:06 pm »
Quote
Quote
Hmm. It looks like the mplayer 1.0rc1 code includes iwmmxt stuff, but does not actually use it unless you change the code.
Do you really need to change the code to use iwmmx? Isn't it a simple matter of properly running configure?

Did you try using something similar to what I suggested in this thread before?
CFLAGS="-O4 -mcpu=iwmmxt -fomit-frame-pointer -ffast-math" ./configure
make

Yes, you really do - the code gets compiled, but not used, as the code is only installed following a test like this:
if( mm_flags & MM_IWMMXT ) -> install dsp code.

It fills in mm_flags with 0! There is some code to override this using avctx->dsp_mask & FF_MM_FORCE, but I did not look too hard at getting this going. I wonder if this is related to the lavdopts somehow?

That's why the others only saw a 2% improvement (compiling with the better tune options), and I see a 30% or so improvement.

Tim

8
Linux Applications / Mplayer Development And Optimization For Arm
« on: March 14, 2007, 11:39:55 am »
Hmm. It looks like the mplayer 1.0rc1 code includes iwmmxt stuff, but does not actually use it unless you change the code. I have done this for the results below.

Here are my benchmark results on a standard SL-C3200, not overclocked, running OpenZaurus:

BENCHMARKs: VC:  44.056s VO:   0.078s A:   0.000s Sys:   0.831s =   44.965s
BENCHMARK%: VC: 97.9787% VO:  0.1734% A:  0.0000% Sys:  1.8479% = 100.0000%
BENCHMARKs: VC:  43.234s VO:   0.079s A:   0.000s Sys:   0.816s =   44.128s
BENCHMARK%: VC: 97.9734% VO:  0.1785% A:  0.0000% Sys:  1.8481% = 100.0000%
BENCHMARKs: VC:  43.487s VO:   0.076s A:   0.000s Sys:   0.813s =   44.376s
BENCHMARK%: VC: 97.9957% VO:  0.1715% A:  0.0000% Sys:  1.8328% = 100.0000%
BENCHMARKs: VC:  43.669s VO:   0.076s A:   0.000s Sys:   0.820s =   44.565s
BENCHMARK%: VC: 97.9891% VO:  0.1712% A:  0.0000% Sys:  1.8398% = 100.0000%
BENCHMARKs: VC:  43.497s VO:   0.078s A:   0.000s Sys:   0.810s =   44.386s
BENCHMARK%: VC: 97.9976% VO:  0.1764% A:  0.0000% Sys:  1.8260% = 100.0000%

Tim

9
General Discussion / Cacko.oesf.org Broken?
« on: March 13, 2007, 12:53:53 pm »
I'm trying to download:
http://cacko.oesf.org/downloads/kino2/kino2-0.4.3c.tar.gz
(Source to kino2), and I get name or service not known.

I've read some other posts, and they have links to the ROMs, but I can't find the kino2 source anywhere.

Anyone able to help?

Thanks,
Tim

10
Sharp ROMs / Kphone Performance/'sticking'
« on: July 04, 2006, 04:13:33 am »
Did you use the phone from the cacko feed or from the pi website?

I'm using the one from http://www.pi-sync.net/html/kp_pi.html
and that seems to make calls as well as receive, without freezing, but I was making local calls, with no STUN stuff.

I have c1000 with cacko 1.23

Thanks,
Tim

11
Sharp ROMs / Serial Console On Cacko Rom With Cl1000
« on: July 03, 2006, 06:01:34 am »
Hi List,

Just thought I'd share my experience of setting up a sl-c1000 for development work.

I am doing kernel driver hacking, and I was losing the kernel panic and printk output, so I needed a serial console.

Unfortunately this was not as easy as it seems!

To get this to work, you need to recompile the kernel with the console-on-serial-device option enabled, and you also need to hack printk.c to add the ttyS0 device in register_console(), as the kernel command line options seem to get ignored.

You should then be able to get serial output at 9600,8,n,1 with no flow control.
(Note I only updated the kernel, not the modules)

To actually get the serial output from the device, you need either a Sharp serial cable, which I believe works, or a cable you make up yourself. I made up my own.

To get the connector I bought a cheap USB sync and power cable for the 5500, 5600, 6000 and C860.

This connector is documented for the c700 on the Sharp website, but does not appear to be exactly the same for the c1000. The UART pins, GND and VCC are the same though.

TXD - 3
RXD - 4
GND - 8
VCC - 11 (3.3V)

(You could make up a complete RS232 interface if you wanted, but this is not
needed for just simple input and output. You would want the pins:
RTS - 5
CTS - 6
DSR - 7
DTR - 14)

You then need a RS232 level shifter chip to change this to +-12V.

Also these signals are the opposite sense to most designs - at idle the lines are at 0 V, not 3.3 V.

So if you have a level shifter which has an inverter built in (as most do), you need to invert the signals to and from the Zaurus.

Hope this helps someone! I can now read my whole panic, and copy and paste the
backtrace, so I'm happy!

Thanks,
Tim

12
General Discussion / Does the C860 have SDIO?
« on: May 11, 2006, 07:27:32 am »
Quote
That's very interesting and useful to know. Are you able to say who you work for - I am wondering if you work for embwise?

So, would it be possible to buy the driver off your company as a binary module in order to satisfy the license fees? I would cheerfully pay US$10 for a driver for my Zaurus in order to be able to use SDIO devices (I presume your driver would allow me to use most sdio devices such as wifi, gps etc?)

thanks for taking the time to enlighten us!

I don't want to say who I work for, but it is a silicon vendor and not embwise, so we do not produce and market cards or drivers anyway - a third party would have to do this based on our silicon/reference design. But yes, a company could sell a closed source module and provide it with the card.

Thanks,
Tim

13
General Discussion / Does the C860 have SDIO?
« on: May 09, 2006, 05:35:39 pm »
Hi,

I'm afraid I don't bring any open source software or information, I just wanted to answer the question as to the level of support of SDIO on a cl1000 (hardware-wise), which is very complete.

I also wanted to clarify the license situation, which means that an open source driver is not possible without reverse engineering, and I believe even that is not permitted in some countries?

Anyone selling/giving away SDIO software or hardware must pay royalties, and they are bound not to divulge details of the specs.

So c-guys already ship a driver for Zaurus with a WLAN card? Sounds good. Does it sit on top of Sharp's own SD module (I am using Cacko, and I believe Cacko uses the Sharp kernel module)?

How is it used to access memory cards? Or is it a replacement for Sharp's SD module?

Thanks,
Tim

14
General Discussion / Does the C860 have SDIO?
« on: May 09, 2006, 11:11:40 am »
I can confirm my c1000 has working SDIO which supports 4-bit mode and DMA/IRQs.

I have it working here (at work) with a WiFi card. Even a debug driver is achieving 4.5Mbps, and it should go faster than that.

Now the bad news:

I am not using Sharp's SD module, but our own driver. SDIO drivers will not be open source - you have to pay royalties and licenses, and may not distribute the protocol.

So if some 3rd party (commercial) were to make Zaurus WiFi SDIO cards, and ship pre-built modules, this would be entirely possible, at least on a c1000.

Thanks,
Tim
