![]() |
If you want to learn ARM, get the Cheatsheet! It has all the ARM7 commands, and covers options like bitshifts and conditions as well as the bytecode structure of the commands! | ![]() |
![]() |
We'll be using the excellent VASM for our assembly in these tutorials... VASM is an assembler which supports Z80, 6502, 68000, ARM and many more, and also supports multiple syntax schemes... You can get the source and documentation for VASM from the official website. |
CPU | Instruction set | System |
ARM2 | ARMv2 | Acorn Archimedes |
ARM60 | ARMv3 | 3DO (12 MHz) |
ARM7TDMI | ARMv4T | GBA (16.78 MHz) |
Lesson M1 - Random Numbers and Ranges |
Lesson M2 - BCD, Binary Coded Decimal! |
Moving a sprite on RiscOS - Simple ARM Assembly Lesson S1 |
Sprite moving on the GameBoy Advance - Arm Assembly Lesson S2 |
Lesson S4 - Sprite moving on the GameBoy Advance (Thumb) [GBA] |
Lesson SuckShoot1 - SuckShoot General Code [GBA] [NDS] [ROS] |
Lesson SuckShoot2 - SuckShoot GBA Graphics code [GBA] [NDS] [ROS] |
Lesson SuckShoot3 - SuckShoot NDS Graphics code [NDS] |
Lesson SuckShoot4 - SuckShoot RiscOS Graphics code [ROS] |
Lesson 1 - Getting started with ARM Thumb |
Lesson 2 - Addressing modes and rotation |
Lesson 3 - Conditions, Branches, CMP |
Lesson 4 - The Stack and SWI |
Lesson 5 - More Maths! |
![]() |
With the ARM we actually have some serious memory resources available to us, both in RAM and ROM! If you're looking to develop serious games or software, you probably want to use C++, but looking at assembly lets us see how the hardware really works, and that's the point of these tutorials! |
Assemblers use a symbol to denote a hexadecimal number: some use $FF, #FF or &FF, but this guide uses 0x (for example 0xFF), as this is the form used with VASM for the ARM. All the code in this tutorial is designed for assembling with VASM - if you're using something else you may need to change a few things! But remember, whatever assembler you use, while the text based source code may need to be slightly different, the assembled 'BYTES' will be the same! |
![]() |
Decimal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | ... | 255 |
Binary | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 | ... | 11111111 |
Hexadecimal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | ... | FF |
Bit position | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Digit Value (D) | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
Our number (N) | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
D x N | 128 | 64 | 0 | 0 | 8 | 4 | 0 | 0 |
128+64+8+4= 204 So %11001100 = 204 ! |
If you ever get confused, look at Windows Calculator: switch to 'Programmer Mode' and it has binary and hexadecimal views, so you can change numbers from one form to another! If you're an Excel fan, look up the functions DEC2BIN and DEC2HEX... Excel has all the commands you need to convert one thing to the other! |
![]() |
Negative number | -1 | -2 | -3 | -5 | -10 | -20 | -50 | -254 | -255 |
Equivalent Byte value | 255 | 254 | 253 | 251 | 246 | 236 | 206 | 2 | 1 |
Equivalent Hex Byte Value | FF | FE | FD | FB | F6 | EC | CE | 2 | 1 |
![]() |
All these number types can be confusing, but don't worry! Your assembler will do the work for you! You can type 0b11111111, 0xFF, 255 or -1 ... when stored as a single byte, the assembler knows these are all the same thing! Type whatever you prefer in your code and the assembler will work out what that means and put the right data in the assembled code! |
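As a rough sketch of that idea, the four lines below should all assemble to the same byte (0xFF). This assumes VASM's standard syntax, with 0x for hexadecimal, 0b for binary and `.byte` to define a byte; the label name `SameByte` is made up for this example.

```asm
SameByte:                   ; hypothetical label - each line should emit the byte 0xFF
    .byte 0b11111111        ; binary
    .byte 0xFF              ; hexadecimal
    .byte 255               ; decimal
    .byte -1                ; negative number, stored as two's complement
```

If you build with the -L option, the listing file is a handy way to check that all four lines really produced the same byte.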
Main Registers:
Added in ARM3:
Special registers for protected modes: R13/R14 have alternative versions, and there is an SPSR, for each of the IRQ, SVC, UNDEF and ABORT modes. FIQ mode has alternate R8-R14 and an SPSR.
A Frame Pointer points to data areas in the stack used by the function, allowing for relative offsets... it's entirely optional whether you really use R11 for this or not.
The Intra Procedural Call register can be used as a backup of LR for subroutines. |
PC Flags: NZCVIF------------------------MM
CPSR Flags: NZCV--------------------IFTMMMMM
Getting and Setting Flags:
|
Decimal | #1234 |
Hexadecimal | #0x12EF |
Binary | #0b11110000 |
Z80 command | Description | ARM command |
CALL (no nesting) | Jump to subroutine | BL label |
JP | Jump to label | B label |
RET (no nesting) | Return from linked branch | MOV pc,lr |
CALL - start Sub (allows nesting) | After BL, at the start of the sub: push the return address | STMFD sp!,{lr} |
RET - end Sub (allows nesting) | End of sub (RET): pop the return address into PC | LDMFD sp!,{pc} |
DEC r1 | Decrement r1 and set flags | SUBS r1,r1,#1 |
Push r0 | Put r0 onto the stack | str r0, [sp, #-4]! |
Pop r0 | take r0 off the stack | ldr r0, [sp], #4 |
Push all | Push all+ return address | STMFD sp!,{r0-r12, lr} |
Pop all.. RET | Pop all + return | LDMFD sp!,{r0-r12, pc} |
LDIR | Block copy: r12=src, r13=dest, r14=src+bytecount (48 bytes are copied per pass, so bytecount must be a multiple of 48) | loop: LDMIA r12!, {r0-r11} / STMIA r13!, {r0-r11} / CMP r12, r14 / BNE loop |
Result | Bitshift |
. . . . . . . . . . . . . . . . . . . . . . . . 76543210 | 0 |
10. . . . . . . . . . . . . . . . . . . . . . . . 765432 | 1 |
3210. . . . . . . . . . . . . . . . . . . . . . . . 7654 | 2 |
543210. . . . . . . . . . . . . . . . . . . . . . . . 76 | 3 |
76543210. . . . . . . . . . . . . . . . . . . . . . . . | 4 |
. . 76543210. . . . . . . . . . . . . . . . . . . . . . | 5 |
. . . . 76543210. . . . . . . . . . . . . . . . . . . . | 6 |
. . . . . . 76543210. . . . . . . . . . . . . . . . . . | 7 |
. . . . . . . . 76543210. . . . . . . . . . . . . . . . | 8 |
. . . . . . . . . . 76543210. . . . . . . . . . . . . . | 9 |
. . . . . . . . . . . . 76543210. . . . . . . . . . . . | 10 |
. . . . . . . . . . . . . . 76543210. . . . . . . . . . | 11 |
. . . . . . . . . . . . . . . . 76543210. . . . . . . . | 12 |
. . . . . . . . . . . . . . . . . . 76543210. . . . . . | 13 |
. . . . . . . . . . . . . . . . . . . . 76543210. . . . | 14 |
. . . . . . . . . . . . . . . . . . . . . . 76543210. . | 15 |
LSL | Logical Shift Left |
LSR | Logical Shift Right |
ASR | Arithmetic Shift Right |
ROR | Rotate Right |
RRX | Rotate Right with eXtend (1 bit only - opcode is ROR #0) |
Bytes | Z80 | 68000 | 8086 | ARM |
1 | DB | DC.B | DB | .BYTE |
2 | DW | DC.W | DW | .WORD |
4 | | DC.L | DD | .LONG |
n | DS n,x | DS n,x | n DUP (x) | .SPACE n,xx |
Param | Mode | Format | Details | Example |
Op2 | Immediate | #n | Fixed value of n. Can be any value made by an 8 bit immediate rotated right by an even number of bits, e.g. 0xFF or 0xFF000000 are OK. | ADD R0,R0,#1 |
Op2 | Register | Rn | Value in register Rn | ADD R0,R0,R1 |
Op2 | Register Shifted by Immediate | Rn, shft #n | Shift register Rn by #n using shifter shft. Options: LSL #n, LSR #n, ASR #n, ROR #n, RRX. Note: RRX can only shift 1 bit | MOV R0,R1,ROR #2 |
Op2 | Register Shifted by Register | Rn, shft Rm | Shift register Rn by Rm using shifter shft. Options: LSL Rm, LSR Rm, ASR Rm, ROR Rm | MOV R0,R1,ROR R2 |
Flex | Immediate offset / Immediate pre-indexed | [Rn,#n] / [Rn,#n]! | Value from address Rn+n. ! means pre-indexed: also set Rn=Rn+n | LDR R0,[R1] (#n=0) / LDR R0,[R1,#4] / LDR R0,[R1,#-4]! |
Flex | Register offset / Register pre-indexed | [Rn,{-}Rm] / [Rn,{-}Rm]! | Value from address Rn+Rm (or Rn-Rm). ! means pre-indexed: also set Rn=Rn+Rm | LDR R0,[R1,-R2] / LDR R0,[R1,R2]! |
Flex | Scaled register offset / Scaled register pre-indexed | [Rn, Rm,shft #n] / [Rn, Rm,shft #n]! | Value from address Rn+(Rm shifted by #n); with LSL #n this is Rn+(Rm*2^n). ! means pre-indexed: also set Rn to that address | LDR R0,[R1,R2, LSL #2] / LDR R0,[R1,R2, LSL #2]! |
Flex | Immediate post-indexed | [Rn],#n | Value from address in register Rn... then set Rn=Rn+n (No need for ! - as it's the only purpose of the command!) | LDR R0,[R1],#4 |
Flex | Register post-indexed | [Rn], {-}Rm | Value from address in register Rn... then set Rn=Rn+Rm (No need for ! - as it's the only purpose of the command!) | LDR R0,[R1],R2 / LDR R0,[R1],-R2 |
Flex | Scaled register post-indexed | [Rn], {-}Rm, shft #n | Value from address in register Rn... then set Rn=Rn+(Rm shifted by #n using shifter shft). Options: LSL #n, LSR #n, ASR #n, ROR #n, RRX | LDR R0,[R1],R2,LSL #2 / LDR R0,[R1],-R2,RRX |
We're going to be using VASM as an assembler. It's free, and works on Windows, OSX and Linux. My Devtools provide a batch file which will build the programs for you, but if you don't want to use them, the format of the build script is shown below: ![]()
-Fbin ... Specifies to create a binary file
-Dxxx=Y ... Specifies to define a symbol xxx=Y (we'll learn about symbols later)
-L ... Specifies a listing file - this shows source code and resulting bytes... it's used for debugging if we have problems
-o ... Specifies the output file
%BuildFile% ... This would be the source file you want to assemble... e.g. Lesson1.asm
-m7tdmi ... (or equivalent) Specifies the ARM architecture we're building for
-chklabels -nocase ... Check for lines where we've forgotten a tab on a command (it will be mistaken for a label), and disable case sensitivity |
|
Once we've successfully assembled our program, we can run it with VisualBoyAdvance. We'll also use RiscOS, but setting this up is more complex if you're doing it yourself. |
![]() |
There's a lot of
complex scary stuff in the include files - don't panic about
it for now, you'll be able to understand it more later once
you've covered all the lessons. |
![]() |
Let's look at another subroutine. This one starts with a label 'GetNextLine'... we know it's a label because it's not indented and ends in a colon... this is the name of the subroutine - we'll see the name with BL (Branch and Link) statements (calls on the ARM). Then there is an ADD command... it adds 160 to R10 (R10=R10+160)... it is indented, so it is clearly a command... Finally there is a MOV PC,LR command - this ends a subroutine... BL copies the return address (taken from PC, the program counter) into LR - transferring LR back to PC returns to the command after the BL command. If our code has a MOV PC,LR at the end, it's a subroutine and should be called with BL... if we jump to it with B, something bad will probably happen! |
![]() |
ARM calls are very weird... CALL is called BL - and rather than push the PC (reg R15 - Program Counter) onto the stack, it moves it to LR (reg R14). We can also return by popping a previously pushed LR back into PC. Don't worry if you don't understand this yet - this info is just for those familiar with other CPUs - we'll cover it in more detail soon! |
![]() |
![]() |
It may seem weird we can't set all 8 hex digits (4 bytes) of a register in one go, but there are ways around this! It all comes down to the way the instructions turn into bytes - each instruction is 4 bytes - and there's only enough 'space' in the MOV command for an 8 bit value (2 hex digits) plus a rotation that positions it within the register |
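Two common ways around the limit can be sketched like this. It's only an illustration: the labels `LoadBig` and `FullValue` are invented for the example, and the second method assumes the data sits close enough to the code for a PC-relative load.

```asm
LoadBig:                        ; hypothetical subroutine name
    ; Method 1: build the value one byte at a time
    MOV R0,#0x12000000          ; 0x12 rotated into the top byte
    ORR R0,R0,#0x00340000       ; OR in the next byte
    ORR R0,R0,#0x00005600
    ORR R0,R0,#0x00000078       ; R0 = 0x12345678

    ; Method 2: keep the full 32 bit value in memory and load it
    LDR R1,FullValue            ; R1 = 0x12345678 (label must be nearby)
    MOV PC,LR                   ; return before we reach the data

FullValue:                      ; hypothetical label for the stored value
    .long 0x12345678
```

The first method costs four instructions but needs no memory access, the second costs one instruction plus 4 bytes of data.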
The example here shows data is stored by the
ARM in 'Little Endian' format... meaning the lowest value byte
in a 32 bit register is stored first... and the highest is
stored last. This is basically always the case with the ARM - however the ARM CPU can actually also work in Big Endian mode. |
![]() |
The previous LDR and STR worked with 32 bit registers... but we'll often want to work with bytes. The ARM allows this with the LDRB and STRB commands - they work the same as the other commands, but just load or store a single byte |
![]() |
We loaded in a byte from TestVal with LDRB... Note that the 24 unused bits of the register changed to 0. We then added 255 - causing R1 to expand out of a single byte... We then saved back with STRB - because we used a byte command, only the low byte was saved |
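A minimal sketch of that sequence is shown here. It assumes R2 already holds the address of a writable byte containing 0x05; the starting value is picked purely for illustration and isn't taken from the original screenshot.

```asm
    ; Assumption: R2 = address of a writable byte that contains 0x05
    LDRB R1,[R2]        ; R1 = 0x00000005 (upper 24 bits are cleared)
    ADD  R1,R1,#255     ; R1 = 0x00000104 (no longer fits in one byte)
    STRB R1,[R2]        ; only the low byte is stored: memory now holds 0x04
```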
![]() |
LDR and STR work with 32 bit values... LDRB and STRB work at 8 bit... But what about 16 bit? Well, LDRH and STRH (H=Half) will load and save 16 bit... but these commands only exist on later processors: the Gameboy Advance uses them fine - but RiscOS can't use them! |
![]() |
![]() |
Because the ARM is 32 bit, a WORD is 32 bits on ARM, rather than 16 bit like on the Z80 or 68000. VASM uses the statement '.long' to define a 32 bit value - but a LONG on the ARM would typically be 64 bit. To avoid confusion the terms WORD and LONG won't be used in these tutorials - the length will be referred to in bits instead. |
Rather than a fixed offset from the address, we can use the value in a register... effectively the resulting address is the sum of the two registers | ![]() |
The registers will be loaded from their respective offsets. | ![]() |
Just like before, the 'Increment' can actually
be negative - so you can read backwards sequentially as well as
forwards! Isn't the ARM great!?? |
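Here's a small sketch of sequential reading with a post-indexed LDR. It assumes R1 points at the first of four 32 bit values in memory; the label `SumLoop` and the choice of registers are made up for the example.

```asm
    ; Assumption: R1 = address of the first of four 32 bit values
    MOV  R0,#0          ; running total
    MOV  R2,#4          ; number of values left to read
SumLoop:                ; hypothetical label
    LDR  R3,[R1],#4     ; read the value at R1, then R1=R1+4
    ADD  R0,R0,R3       ; add it to the total
    SUBS R2,R2,#1       ; count down and set the flags
    BNE  SumLoop        ; repeat until the counter reaches zero
    ; To walk backwards instead, use: LDR R3,[R1],#-4
```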
![]() |
![]() |
We're going to look at some examples of these flags and condition codes - but really you should try them yourself! You'll notice commented out code (starting ;) - these are alternative tests you can do to see the conditions in action. They'll all be shown on the video! |
These are pretty useless, but they do technically exist... one that always happens, and one that never does! AL - jump ALways. NV - jump NeVer. |
![]() |
Some commands work with these condition codes, and others don't! Check out the cheatsheet for the full details! |
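As a quick illustration, condition codes can go on ordinary commands as well as branches. This sketch assumes ARM (not Thumb) mode, and the registers and values are chosen arbitrarily.

```asm
    CMP   R0,#10        ; set the flags from R0-10
    MOVEQ R1,#1         ; runs only when R0 was equal to 10
    MOVNE R1,#0         ; runs only when R0 was not equal to 10
    ADDGT R2,R2,#1      ; runs only when R0 was greater than 10 (signed)
```

None of the conditional commands here have an S on the end, so they don't disturb the flags set by the CMP.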
![]() |
'Stacks' in assembly are like an 'In tray' for temporary storage... Imagine we have an In-Tray... we can put items in it, but only ever take the top item off... we can store lots of paper - but have to take it off in the reverse order we put it on!... this is what a stack does! If we want to temporarily store a register - we can put its value on the top of the stack... but we have to take values off in reverse order (last on, first off)... The stack will appear in memory, and the stack pointer goes DOWN with each push on the stack... As the ARM is 32 bit, we'll push onto the stack 32 bits at a time, so if it starts at 0x2000 and we push one register (4 bytes), it will point to 0x1FFC. |
![]() ![]() |
As we learned, Branch and Link moves the Program Counter (PC) into the Link Register (LR). When we perform a RETurn, what actually runs is a MOV PC,LR command... Because we need the LR to be intact to return, we need to back it up somehow if we're nesting subroutines... The easiest solution is to push it onto the stack, and pop it back into the PC... Alternatively, we could transfer it into another register |
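The push and pop approach could look something like this. It's a sketch only: `Outer` and `Inner` are invented names, and the stack pointer is assumed to be already set up.

```asm
Outer:                      ; hypothetical subroutine that calls another one
    STMFD sp!,{lr}          ; push our return address onto the stack
    BL    Inner             ; this BL overwrites LR
    LDMFD sp!,{pc}          ; pop the saved return address straight into PC

Inner:                      ; hypothetical 'leaf' subroutine - it calls nothing
    ADD   R0,R0,#1          ; so LR is still intact here
    MOV   pc,lr             ; and a plain return is enough
```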
![]() |
Here are the changes to the stack and Link Register | ![]() |
SWI stands for SoftWare Interrupt... Like the RSTs of the Z80 and the TRAPs of the 68000, these are often used for OS calls... On RiscOS there are a variety of SWIs... To use a SWI we use the SWI command followed by a number... What the SWI does and what parameters need to be passed will depend on the system; you'll need to consult the documentation of that system for details. |
![]() |
We called the show string function, then the end program function | ![]() |
![]() |
If you're programming the Gameboy Advance then you'll probably never need SWI... these tutorials use the firmware as little as possible, so you won't see it much in those either... If you're using the firmware though, you'll have to check the manual for RiscOS, and beware! There are different versions for later RiscOS versions! |
Lesson 5 - More Maths!
We're nearly done... but we need to look at operations that work at the bit level, and a few other important commands... let's take a look! |
![]() |
![]() |
![]() |
We have four kinds of logical operations we can perform on bits.
AND = Return 1 where both parameters are 1 - else 0
ORR = (OR) Return 1 where either parameter is 1 - else 0
EOR = (Exclusive OR, XOR) Flip bits in first parameter where second parameter is 1
BIC = (BIt Clear) Zero bits in first parameter where second parameter is 1 |
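A small sketch of all four operations using the same test value and mask. The values are arbitrary, chosen so each result is easy to read in binary.

```asm
    MOV R0,#0b10101010
    AND R1,R0,#0b11110000   ; R1 = 0b10100000 (keep only the top 4 bits)
    ORR R2,R0,#0b11110000   ; R2 = 0b11111010 (set the top 4 bits)
    EOR R3,R0,#0b11110000   ; R3 = 0b01011010 (flip the top 4 bits)
    BIC R4,R0,#0b11110000   ; R4 = 0b00001010 (clear the top 4 bits)
```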
![]() |
The results are shown here | ![]() |
Test Operations TST / TEQ
We have two commands which work like logical operations - but they do not change the contents of the registers - they just change the flags.
TST = effectively ANDs the two parameters, setting the flags accordingly
TEQ = effectively EORs the two parameters, setting the flags accordingly
There are two special commands, MSR and MRS - we'll look at those next! |
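As a rough example of how these might be used, TST is handy for checking a single bit and TEQ for checking whether two registers match. The registers used for the results are arbitrary.

```asm
    TST   R0,#1         ; AND R0 with 1 - the result is thrown away
    MOVNE R2,#1         ; Z clear: bit 0 of R0 was set (R0 is odd)
    MOVEQ R2,#0         ; Z set: bit 0 of R0 was clear (R0 is even)

    TEQ   R0,R1         ; EOR R0 with R1 - the result is thrown away
    MOVEQ R3,#1         ; Z set: R0 and R1 hold the same value
```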
![]() |
Here is the result! |
![]() |
Backing up flags with MRS / MSR
*These commands only exist on later ARM versions*
If we want to back up the flags, we can do so with these two commands... the flags are in register 'CPSR'... we can transfer this to or from another register!
MRS will move the flags to a register, backing them up
MSR will move the flags from a register, restoring them |
![]() |
Using Carry for 64 bits!
There may be times when even 32 bit isn't enough - when we do ADDition or SUBtraction that goes over the limit of a 32 bit register, we can use special commands to apply that carry to a second register - the two registers together will give us 64 bits!
ADC adds a parameter plus any carry to the top register.
SBC subtracts a parameter and any borrow from the top register.
In either case, we need to do an ADDS or SUBS to the low register first - the S means the flags are set; if we don't do this, the carry flag will not be updated |
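Here's a minimal sketch of a 64 bit add and subtract, assuming R1 holds the top 32 bits and R0 the bottom 32 bits. The register choice and starting value are just for illustration.

```asm
    ; 64 bit value in R1:R0 starts as 0x00000000FFFFFFFF
    MVN  R0,#0          ; R0 = 0xFFFFFFFF (low half)
    MOV  R1,#0          ; R1 = 0          (high half)

    ADDS R0,R0,#1       ; low half wraps round to 0 and the carry is set
    ADC  R1,R1,#0       ; high half = 0+0+carry = 1, giving 0x0000000100000000

    SUBS R0,R0,#1       ; low half = 0xFFFFFFFF, a borrow occurs (carry clear)
    SBC  R1,R1,#0       ; high half = 1-0-borrow = 0, back to 0x00000000FFFFFFFF
```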
![]() |
Here are the results: when the bottom register over/under flowed, the top register was altered to compensate for the carry/borrow | ![]() |
Multiplication
The ARM has two multiply commands:
MUL - MULtiplies two parameters together.
MLA - MuLtiplies two parameters and Adds a third |
![]() |
The results of the two operations are shown here: 3*2=6... (3*2)+1=7 |
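The same two sums can be written out as code like this. The registers are picked arbitrarily, with the destination kept different from the first source register, which older ARMs require.

```asm
    MOV R1,#3
    MOV R2,#2
    MOV R3,#1
    MUL R0,R1,R2        ; R0 = 3*2 = 6
    MLA R4,R1,R2,R3     ; R4 = (3*2)+1 = 7
```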
![]() |
Negative and reversed commands
We have some special commands which reverse the order of the parameters. RSB (Reverse SuB) is like SUB - except whereas SUB R0,R1,R2 will set R0=R1-R2, RSB R0,R1,R2 will set R0=R2-R1... there is a carrying version called RSC.
If we want to transfer a value with all its bits flipped, we can use MVN R0,R1 (MoVe Not) - this will set R0 = R1 EOR 0xFFFFFFFF.
If we want to compare a register to a negated value we can use CMN R0,R1... this sets the flags like ADDS, but does not change any registers. |
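A short sketch of the three commands with small values, so the results can be checked by hand. The registers and values are arbitrary.

```asm
    MOV   R1,#5
    RSB   R0,R1,#0      ; R0 = 0-5 = -5 (0xFFFFFFFB) - a handy way to negate
    MVN   R2,R1         ; R2 = NOT 5 = 0xFFFFFFFA (all bits flipped)
    CMN   R0,#5         ; flags as if adding: -5+5 = 0, so Z is set
    MOVEQ R3,#1         ; runs, because R0 is equal to -5
```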
![]() |
We performed a 64bit reversed subtract, Moved a negated value, and compared a negative | ![]() |
ARM4+ only... 16 bit Move (HalfWord), Swap Ram<->Register
This tutorial primarily covers ARM2, but there are a few later commands that are really good to know...
The first are LDRH and STRH - these (like LDR/STR) are load and store commands - however these work at the HalfWord (16 bit) level... they're handy for the Gameboy Advance screen!
Another interesting command is SWP - this loads the value at a RAM address into a register, and stores a register to the same RAM address... The source and destination registers can be the same or different. |
![]() |
We loaded in a Half (16 bit)... then stored the modified Half back to RAM. We swapped the RAM into R0 and R1 into RAM. |
![]() |
We've covered all the basic
ARM2 commands - there are many more in the later revisions, but we
won't be covering them at
this time. We've looked at enough to get started with RiscOS or the Nintendo Gameboy Advance! |
![]() |
Mnemonic | Description | Example |
ADCccS Rn, Rm, Op2 | Add With Carry. | ADC R0,R0,#4 |
ADDccS Rn, Rm, Op2 | Add Op2 to Rm and store the result in Rn. | ADD R0,R0,#4 |
ANDccS Rn, Rm, Op2 | Logically AND Op2 with Rm and store the result in Rn. | AND R0,R0,#4 |
Bcc Label | Branch to a relative Label. | BEQ ConditionalJump |
BICccS Rn, Rm, Op2 | Logically Bit Clear Op2 with Rm and store the result in Rn. | BIC R0,R0,#4 |
BLcc Label | Branch and Link to a relative subroutine Label. | BL TestSub |
CMNcc Rn, Op2 | Compare Negative Rn to Op2. Set the flags like "ADDS Rn,Op2" | CMN R0,#4 |
CMPcc Rn, Op2 | Compare Rn to Op2. set the flags, the same as "SUBS Rn,Op2" | CMP R0,#4 |
EORccS Rn, Rm, Op2 | Logically Exclusive OR Op2 with Rm and store the result in Rn. | EOR R0,R0,#4 |
LDMccadm Rn!, {Regs} | Load range of registers {Regs} from the address in Rn. Like POP | LDMFD sp!,{r0,r1,r2} |
LDRcc Rn, Flex / LDRccB Rn, Flex | Load register Rn (32 bit, or a single byte with B) from address Flex | LDR R0,NearLabel |
LDRccH Rn, Off / LDRccSH Rn, Off / LDRccSB Rn, Off | HalfWord (16 bit), Signed HalfWord (16 bit) and Signed Byte (8 bit) load | LDRSB R0,[R1,#-255] |
MLAccS Rn, Rm, Ro, Rp | 32 bit Multiplication and Add. Rn=(Rm*Ro)+ Rp | MLA R0,R1,R2,R3 |
MOVccS Rn, Op2 | Move value in Op2 into Rn. | MOV R0,#0xFF |
MRScc Rn,sr | Move sr (either CPSR or SPSR) to register Rn. | MRS R0,SPSR |
MSRcc sr_f,# / MSRcc sr_f,Rn | Move immediate # or register into flags f of sr (either CPSR or SPSR). | MSR CPSR_F,#0 |
MULccS Rn, Rm, Ro | 32 bit Multiplication. Rn=Rm*Ro. | MUL R0,R1,R2 |
MVNccS Rn, Op2 | Move Not. Flip all the bits of Op2 and move result into Rn. | MVN R0,#0xFF |
ORRccS Rn, Rm, Op2 | Logically OR Op2 with Rm and store the result in Rn. | ORR R0,R0,#4 |
RSBccS Rn, Rm, Op2 | Reverse Subtract. This performs the calculation Rn=Op2-Rm. | RSB R0,R0,#6 |
RSCccS Rn, Rm, Op2 | Reverse Subtract with Carry. Rn=(Op2-Rm)-NOT(C). | RSC R0,R0,#6 |
SBCccS Rn, Rm, Op2 | Subtract with Carry. Rn=(Rm-Op2)-NOT(C). | SBC R0,R0,#6 |
STMccadm Rn!, {Regs} | Transfer range of registers {Regs} to the address in Rn. Like PUSH | STMFD sp!,{r0,r1,r2} |
STRcc Rn, Flex / STRccB Rn, Flex | Store register Rn (32 bit, or a single byte with B) to address Flex. | STR r0,[r1,r2,asl #2] |
STRccH Rn, Off | HalfWord (16 bit) store. There are no signed store commands: the low 16 bits are stored as they are. | STRH R0,[R1,#-255] |
SUBccS Rn, Rm, Op2 | Subtract. This performs the calculation Rn=Rm-Op2. | SUB R0,R0,#6 |
SWIcc # | Software Interrupt. | SWI 3 |
SWPccB Rn, Rm, [Ro] | Swap a register and memory. Rn=[Ro], [Ro]=Rm. | SWPB R0,R1,[R2] |
TEQcc Rn, Op2 | Test for bitwise Equality. Set the flags like "EORS Rn,Op2" but discard the result | TEQ R0,#6 |
TSTcc Rn, Op2 | Test bits. Set the flags like "ANDS Rn,Op2" but discard the result | TST R0,#6 |