Learn from here:
http://www.heyrick.co.uk/assembler/
It's not that hard to pick up at all if you already know assembler.
There are a couple of nice things, such as optional (conditional) execution of almost every instruction. There is no "test and branch" or "test and jump" instruction; instead there is only a plain branch instruction (b, what other architectures call jmp). However, as I said, instructions can be executed conditionally, so you can combine a compare-with-zero instruction with a "not equal" condition on the branch to jump to a specified piece of code if the previous value was not zero.
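As a rough sketch of that compare-then-branch idea (the label not_zero and the choice of r0 and r1 are made up for illustration, and the comments use the GNU assembler's @ syntax):

```asm
        cmp     r0, #0          @ compare r0 with zero, this sets the flags
        bne     not_zero        @ branch only if r0 was not zero
        mov     r1, #0          @ reached only when r0 == 0
not_zero:
        add     r1, r1, #1      @ both paths continue here
```

The cmp only sets the flags; it is the ne suffix on the branch that turns b into "jump if not zero".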
Another nice thing is the free shift you can apply to the last operand of an instruction. There is a barrel shifter in the ALU that gives you the ability to shift right or left as part of an instruction without affecting execution time. I can't remember if there is a separate shift instruction, but I believe there is not.
To perform a shift you would execute a mov instruction to move the contents from one register to another, but incorporate a shift in the instruction.
Shift r1 left arithmetically 3 places and place the result in r2:
mov r2, r1, asl #3
Also note that the destination comes first and the source second (i.e. r1 will end up in r2). Not sure why they did this, but it's not hard to get used to.
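The shift is not limited to mov; the last operand of the other data-processing instructions can be shifted the same way. A small sketch, with arbitrary register choices and GNU assembler @ comments (lsl is the usual spelling of a left shift; asl is an older synonym that many assemblers still accept):

```asm
        add     r0, r1, r1, lsl #2      @ r0 = r1 + (r1 << 2) = r1 * 5
        rsb     r0, r1, r1, lsl #4      @ r0 = (r1 << 4) - r1 = r1 * 15
        mov     r3, r1, asr #1          @ r3 = r1 shifted right 1, sign bit preserved
```

This is why multiplying by small constants rarely needs a multiply instruction on ARM.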
The program counter is also a register (r15), so a jump can be performed as a mov instruction:
mov PC, r1
PC is an alias for r15 in this case. However, the important thing to note is that you can calculate a value to do a computed jump. I don't know if x86 programmers do this, but I do know PIC micro and Atmel programmers do, for implementing jump tables and pulling values out of a table.
To do that you would have a list of instructions that do something like:
mov r1,
mov PC, r14
To use this table you would have some code that calculates the index, jumps to it, and pulls the returned value out of r1. The jump would be a branch with link (bl), which is the analog of call on x86. The standard branch doesn't save the PC, and so executes what is commonly referred to as a jmp.
Anyway, the code:
Note this is only the jump part. bl itself only takes a label, so to branch with link to an address held in a register you use blx (ARMv5 and later):
blx r1
This jumps to the code pointed to by the register (r1). This is not a useful thing in most cases; it's mainly used where computation is to be avoided. It can eat huge amounts of memory (if I remember correctly it's 8x the table size), but for some things it can give huge speed benefits, as your answers are precalculated.
The main advantage is that it executes in a set amount of time, hence why it's used in real-time and embedded work (and why I know of it), and that it doesn't require a powerful CPU to calculate the answer.
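Putting the pieces together, here is a minimal sketch of such a table. The labels lookup and table, the values 10, 20 and 30, and the register choices are all invented for illustration; it assumes plain ARM state (no Thumb), an index in r0 that has already been range-checked, and GNU assembler @ comments. Each entry is two 4-byte instructions, so the index is scaled by 8:

```asm
@ caller
        mov     r0, #1                  @ index of the entry we want
        bl      lookup                  @ branch with link, return address in lr (r14)
        @ execution resumes here with r1 = 20

@ r0 = index in, r1 = value out
lookup:
        adr     r2, table               @ r2 = address of entry 0
        add     r2, r2, r0, lsl #3      @ r2 = table + index * 8
        mov     pc, r2                  @ computed jump into the table, lr untouched

table:
        mov     r1, #10                 @ entry 0
        mov     pc, lr                  @ return to the caller of lookup
        mov     r1, #20                 @ entry 1
        mov     pc, lr
        mov     r1, #30                 @ entry 2
        mov     pc, lr
```

Because the jump into the table is a plain mov to pc, lr still holds the caller's return address, so every entry can return directly and each lookup takes the same number of instructions whichever entry is picked.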
Another nice thing is that you get very short select-case statements because of the conditional execution part, e.g.
Code as implemented in Windows Mobile (assuming the character to test is in r0):
cmp r0, #'a'
beq die
cmp r0, #'b'
beq kill_window
cmp r0, #'c'
bne error_msg
cmp r0, #'d'
b lock_up
Here we can see the branch instruction mixed with conditionals: eq and ne, for example, are branch if equal and branch if not equal. The same conditions can be mixed with most instructions to do stuff like add if not equal, or if greater than x then make zero (to cap values).
e.g.
cmp r1, #70
moveq r1, #0
which is: if r1 = 70 then r1 = 0
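The other two cases mentioned above look much the same. A short sketch, where the registers and the cap of 70 are just carried over or chosen for illustration (GNU assembler @ comments):

```asm
        cmp     r1, #70         @ compare r1 with the cap
        movgt   r1, #70         @ executed only if r1 > 70 (signed), so r1 never exceeds 70

        cmp     r0, r2          @ compare two values
        addne   r3, r3, #1      @ r3 = r3 + 1 only when r0 != r2
```

No branch is needed in either case; an instruction whose condition fails is simply skipped.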
Anyway, the site is much better than I am at explaining it and I am a little rusty. If you need more info, contact me; the iWMMXt stuff, however, is on the Intel website. Be careful, as iWMMXt is implemented as coprocessor instructions and conflicts with the floating point unit on other platforms, so I would check to make sure your code is actually running on an iWMMXt processor (just read the processor details from /proc).
Hope it helps, still here if you need me.
And finally, from the site:
Any sufficiently advanced technology is indistinguishable from a rigged demo.