|The 8086 was the successor to the 8080. From beginnings similar to the Z80, the 8086 was designed to set a foot into the 16 bit world!
In a 40 pin package, and with segments that let it break out of the limits of 16 bit addresses, the 8086 was the competitor to the 68000... and while inferior in some ways, it was set to dominate the computing industry, blasting all its rivals away, killing mighty giants like the PowerPC, the Sony CELL and even the ITANIUM! Only the highly efficient ARM processor today has managed to stand up to its power!
Let's take a look at the beginnings of the 8086, and we'll also look a little at what was added to this chip in the modern systems we use today!
In these tutorials we'll be looking at MS-DOS based IBM PCs and the WonderSwan
|If you want to learn 8086, get the Cheatsheet! It has all the 8086 commands, it will help you get started with ASM programming, and it lets you quickly look up commands when you get confused!|
We'll be using UASM as our assembler for these tutorials
You can get the source and documentation for UASM from the official website
8086 is pretty old now, but it's the basis of all the
computers we have today...
With the 8086, We can learn about the fundamentals of computing and we can have some fun along the way!
The symbols used to denote numbers vary between assemblers. In these 8086 tutorials we use UASM, where a trailing h marks hexadecimal (eg 0FFh) and a trailing b marks binary (eg 1010b). A hexadecimal number that starts with a letter needs a leading 0.
If you're not using UASM, you may need something different!
|Digit Value (D)||128||64||32||16||8||4||2||1|
|Our number (N)||1||1||0||0||1||1||0||0|
|D x N||128||64||0||0||8||4||0||0|
|128+64+8+4 = 204, so 11001100b = 204!|
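As a small sketch of the notation, here is the same value 204 written four ways in UASM syntax. This is a fragment meant to sit in the body of a program (an assumption, it is not a complete source file), and each line assembles to the same byte value.

```asm
    mov al,11001100b    ;Binary: 128+64+8+4
    mov al,0CCh         ;Hexadecimal (leading 0 because it starts with a letter)
    mov al,204          ;Decimal
    mov al,-52          ;Signed: -52 is stored as 256-52 = 204 (0CCh)
```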
|If your maths sucks and you can't figure it out, look at Windows Calculator. Switch to 'Programmer Mode' and it has binary and hexadecimal views, so you can change numbers from one form to another!
If you're an Excel fan, look up the functions DEC2BIN and DEC2HEX... Excel has all the functions you need to convert one form to the other!
|Negative number||-1||-2||-3||-5||-10||-20||-50||-254||-255|
|Equivalent Byte value||255||254||253||251||246||236||206||2||1|
|Equivalent Hex Byte Value||FF||FE||FD||FB||F6||EC||CE||2||1|
|All these number types can be confusing, but don't worry! Your assembler will do the work for you.
You can type 11111111b, 0FFh, 255 or -1... for a byte, the assembler knows these are all the same thing! Type whatever you prefer in your code and the assembler will work out what that means and put the right data in the compiled code!
String commands Copy DS:SI to ES:DI
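The registers are 16 bit, but the address bus is 20 bit...
To make up the difference, a 'Segment register' is shifted left by 4 bits and added to the 16 bit offset, so it supplies the top bits of the 20 bit address, eg:
----DDDDDDDDDDDDDDDD ... D=DI (offset)
EEEEEEEEEEEEEEEE---- ... E=ES (segment)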
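To make the calculation concrete, here is a rough worked example. The values 1234h and 0010h are picked purely for illustration, and the fragment assumes it is placed in the body of a program.

```asm
    mov ax,1234h
    mov es,ax           ;ES=1234h (segment registers can't take an immediate)
    mov di,0010h        ;DI=0010h (offset)
    mov al,es:[di]      ;Segment shifted left 4 bits = 12340h
                        ;12340h + 0010h = 12350h - the 20 bit address we read
```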
|There are some special addressing ranges you'll want to know about...||
|0 - Div0||0000/1||IP - Offset|
|0002/3||CS - Segment|
|1 - Step Trap||0004/5||IP - Offset|
|0006/7||CS - Segment|
|2 - NMI||0008/9||IP - Offset|
|000A/B||CS - Segment|
|3 - 1 byte INT||000C/D||IP - Offset|
|000E/F||CS - Segment|
|4 - Sign Overflow||0010/1||IP - Offset|
|0012/3||CS - Segment|
|5 - reserved||0014/5||IP - Offset|
|0016/7||CS - Segment|
|31 -reserved||007C/D||IP - Offset|
|007E/F||CS - Segment|
|Mode||Description||Sample Command||Valid Registers|
|Register Addressing||An 8 or 16 bit register||mov ax,bx
|Immediate Addressing||A constant value||mov ax,100|
|Direct Memory Addressing||A fixed location in memory||mov ax,[1000h]
INC BYTE PTR [MyData]
|We may need to specify
WORD PTR [label]
|Register indirect Addressing||Contents of register used as an address||mov ax,[bx]||[BX], [BP], [DI], [SI]|
|Based or Indexed Addressing||Contents of register (Base or Index) plus displacement||mov ax,[bx+4]||d+[BX], d+[BP], d+[DI], d+[SI]|
|Based, Indexed Addressing||Contents of base register plus contents of index register||mov ax, [bx+di]
mov ax, [bx+si]
mov ax, [bp+di]
mov ax, [bp+si]
|Based, Indexed Addressing with displacement||Sum of base register, index register, and displacement||mov ax, table[bx][di]
mov ax, table[di][bx]
mov ax, table[bx+di]
mov ax, [table+bx+di]
mov ax, [bx][di]+table
|String Addressing||String commands.||MOVSB
|Src = DS:SI
Dest = ES:DI
|If we want to read from an address specified by a label, we must specify the size of the data we want to read from the address...
There are 3 size specifiers we can use:
BYTE PTR - Load a byte
WORD PTR - Load a Word
DWORD PTR - Load a DoubleWord (386+)
somedata:
    dw 1234h ;2 bytes of data
    mov ax,WORD PTR [cs:somedata] ;This will work
    mov ax,[cs:008Ch] ;If somedata=008Ch this will work
    mov ax,[cs:somedata] ;This will NOT work (no size given)
|Set Trap Flag||Command to set trap flag (preserves other flags)|| pushf
 mov bp,sp
 or word ptr [bp],0100h ; Set Trap flag (T)
 popf
|Set Trap Flag Quick||Command to set trap flag (clears other flags)|| mov ax,0100h
 push ax
 popf ; Set the trap flag (T)
|Clear Trap Flag||Command to clear trap flag (preserves other flags)|| pushf
 mov bp,sp
 and word ptr [bp],0FEFFh ; Clear Trap flag (T)
 popf
|Clear Trap Flag Quick||Command to clear trap flag (clears other flags)|| mov ax,0000h
 push ax
 popf ; Clear the trap flag
1 - Getting started with x86
Let's start looking at some simple commands, and get the hang of the 8086 registers!
These tutorials will use UASM to build... DosBox to run compiled code, and we'll use a simple monitor... you can download all the tools in the links to the right
|We're going to be using UASM as an assembler. It's a free, MASM-compatible assembler which runs on Windows, macOS and Linux
My Devtools provide a batch file which will build the programs for you, but if you don't want to use them, the format of the build script is shown below:
-mz ... Specifies to create DOS EXE File
-Dxxx=y ... Specifies to define a symbol xxx=y (we'll learn about symbols later).
-Fl ... Specifies a Listing file - this shows source code and resulting bytes... it's used for debugging if we have problems
-Fo ... Specifies the output file.
%BuildFile%... this would be the sourcefile you want to compile... Eg: Lesson1.asm
|Once we've successfully compiled our program, we'll run it with DosBox.
If we want the program to start automatically we'll need to add a few extra lines to the dosbox.conf
|To allow us to get started programming quickly and see the
results, we'll be using a 'template program'...
This consists of 3 parts:
A Generic Header - this will set up the screen and a few parameters we'll need to start.
The Program - this is the body of our program where we do our work.
A Generic Footer - this gives us some support tools, and includes a common bitmap font.
This template program will compile on any of the systems in these tutorials (DOS and the WonderSwan!)
|There's a lot of
complex scary stuff in the include files - don't panic about it
for now, you'll be able to understand it more later once you've
covered all the lessons.
|Let's take a look at a simple program!...
The first line is a command 'CALL'... this runs the subroutine labeled 'DoMonitor'. When that subroutine finishes (with a RET command) the program will carry on with the line after the call... notice the command starts indented, which is how commands are written throughout these tutorials.
The next line is not indented and ends with a colon : - that makes it a label called 'infloop'... labels tell the assembler to 'name' this position in the program. The assembler will convert the label to an address in the executable... thanks to the assembler we don't need to worry what number that ends up being...
Finally we have the command 'JMP'... a jump! Unlike a call, it never returns... notice we're jumping to the label we just defined on the line before... this makes the program run infinitely... a crude way to end our program so we can see the result!
You'll also notice text in green starting with a semicolon ; - this is a comment (REMark). Comments have no effect on the code
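A program matching the description above might look roughly like this. It is a sketch that assumes the template's header and footer surround it, and that 'DoMonitor' is supplied by the tutorial's include files.

```asm
    call DoMonitor      ;Run the DoMonitor subroutine, then carry on below

infloop:                ;A label - it names this position in the program
    jmp infloop         ;Jump back to the label forever (a crude way to stop)
```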
|Let's look at another subroutine.
This one starts with a label 'DoMonitorAXBX'... we know it's a label because it's not indented and ends in a colon... this is the name of the subroutine - we'll see the name used in call statements.
Then there are 3 Calls... these are indented, so they are clearly commands... because Calls return after running, first subroutine 'DoMonitorAX' will occur, then 'DoMonitorBX', then 'Newline'
Finally there is a RET command - this ends a subroutine... RETurning back to the CALL statement that started it.
if our code has a RET at the end - it's a subroutine and should probably be started with a CALL... if we start it with a JMP something bad will probably happen!
|There may be times you see code do weird things with CALL, JMP and RET statements that aren't as simple as this... Don't worry about it for now - it's too complex for you right now... but soon you too will be able to wield awesome ASM power!
|It's time to start loading
data into 'registers'...
Registers are the small bits of memory in the processor we use to store values we want to perform calculations on...
AX is a 16 bit register... we can load it with two bytes (one word) '1234h' with the MOVe command...
The AX on the left is the destination of the command.
the 1234h on the right is the source
1234h is MOVed into the register AX
finally we call DoMonitorAX - it will show the result to the screen.
|Here's the result|
|16 bit register AX is made up of two bytes... a High byte and a Low byte... we can use AX as two 8 bit registers called AH and AL... we can MOVe single byte values into these in the same way.|
|First we changed the low part (AL) then the high part (AH)... the two parts of AX are changed accordingly.|
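Here is a minimal sketch of loading the whole register and then its two halves. The byte values 56h and 78h are just examples, and the comments show what AX should hold after each command.

```asm
    mov ax,1234h        ;AX=1234h (AH=12h, AL=34h)
    mov al,56h          ;Change the low byte only:  AX=1256h
    mov ah,78h          ;Change the high byte only: AX=7856h
```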
AX isn't the only register we can do this with... we also have BX, CX and DX...
There are other registers like DI and SI - but these can't be split into 8 bit parts - they are for 'special purposes'
|We've been using Hexadecimal up until now... Hexadecimal is pretty much the 'standard' for ASM programming on the x86 - but it's not too friendly for humans!...
The H at the end of 1234h told the assembler it was a HEX number... If we take the H off it would be treated as a Decimal number...
There are times we'll want to use Binary - by putting a B on the end - or ASCII (letters) by putting things in single quotes ' '
In this example we'll load AX with 128 in decimal...
We'll then use some new commands SUB and ADD... these will SUBtract and ADD a value to AX... we'll see the result after each command.
Finally we'll load an ASCII character into AL... the assembler will convert it to the number code for that character.
|Here is the result|
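Something along these lines would exercise each number format. The amounts subtracted and added (10 and 20) and the letter 'A' are assumptions chosen for the sketch, not necessarily the values used in the lesson's own source.

```asm
    mov ax,128          ;Decimal 128            AX=0080h
    sub ax,10           ;128-10=118             AX=0076h
    add ax,20           ;118+20=138             AX=008Ah
    mov al,'A'          ;ASCII code for 'A'     AL=41h, so AX=0041h
    mov al,10000001b    ;Binary                 AL=81h, so AX=0081h
```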
|Of course... we don't just have to do these commands with
'fixed' values (immediate), we can use the value of one register
on the value of another...
For example we can add 8 bit part AL to AH
|Here's the result... AH has gone from 20h to 92h (+72h)|
|We can do the same commands with BX, CX and DX... they also have H and L parts (BH, CL etc)
Let's give 16 bit register BX a value (666h)... then let's MOVe that value to AX
|The results can be seen here...|
|There may be times we want to swap the values in two registers over... we could do three MOV commands, using a temporary register like CX to hold one value... but that would be wasteful...
We have a special XCHG command... this will swap two 16 bit registers over... or even two 8 bit parts!
Let's use it to do some swaps!
|We can see the results of the swaps...
first we swapped AX and BX.
Next we swapped AH and AL
Finally we swapped BH and AL
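The three swaps could be sketched like this. The starting values 1234h and 5678h are assumptions for illustration, and the comments track both registers after each XCHG.

```asm
    mov ax,1234h
    mov bx,5678h
    xchg ax,bx          ;Swap two 16 bit registers   AX=5678h BX=1234h
    xchg ah,al          ;Swap the two halves of AX   AX=7856h BX=1234h
    xchg bh,al          ;Swap BH (12h) with AL (56h) AX=7812h BX=5634h
```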
|We've looked at the ADD and SUB commands for increasing and
decreasing values - but there is another way!
Many times we'll want to change a register - increasing it, or decreasing it by 1 - we can use INC and DEC for this.
|Here's the result
These commands use fewer bytes so are faster and smaller... they'll be handy for loop counters and things like that.
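A short sketch of INC and DEC on 16 bit and 8 bit registers, with example values. Note how incrementing AL past 0FFh wraps it round to zero without touching AH.

```asm
    mov ax,0100h
    inc ax              ;AX=0101h (1 byte command - ADD AX,1 takes 3 bytes)
    dec ax              ;AX=0100h
    dec ax              ;AX=00FFh
    inc al              ;AL wraps from 0FFh to 00h, AH is unchanged: AX=0000h
```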
|Lesson 1 is over!... What we've covered here may
seem confusing - and it may be a little disappointing!
If you don't understand what you've seen, then try changing some of the values and writing your own commands, it should be more clear...
If you don't see how this can help you write games... don't worry, you need to understand a lot of commands - but they'll soon build up and it will be more clear.
2 - Addressing Modes
Our commands have only used fixed immediate values or other registers until now... but of course the x86 can do so much more... let's check out all the modes the x86 supports!
|This is the mode we used in the first lesson - we're just loading in fixed values specified in the ASM code...
This works the way we saw before with AX,BX,CX and DX - and their 8 bit counterparts
|Here is the result|
|As well as the 'main' registers - we have 'Segment registers' -
these have special purposes for addressing.... we'll learn more
about them soon!
The Segment registers DS, ES and SS cannot be loaded with an immediate value - we have to load another register, then transfer the value. (CS can't be loaded with MOV at all - it only changes with far jumps, calls and returns.)
|ES has changed as requested|
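A minimal sketch of setting a segment register by way of AX. The value 0B800h is only an example (it happens to be the text-mode screen segment on an IBM PC).

```asm
    mov ax,0B800h       ;Load the value into a general register first
    mov es,ax           ;Then transfer it to the segment register
                        ;There is no 'mov es,0B800h' on the 8086
```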
|We looked at this before as well -
'Register Addressing' is where both source and destination parameters of a command are registers
|Here are the results|
We can read from an address by specifying it with square brackets [addr]
This is where our 'Segments' come into play (CS/DS/ES/SS registers)
The 8086 uses a 20 bit address bus - but the registers are only 16 bit - how do we specify the remaining 4 bits?
Well, these come from a 'Segment' register! The 16 bit offset is taken from a register or fixed immediate value
The segment register is shifted left 4 bits and added to the offset, so it supplies the top 16 bits of the address
For example if we use register BX as the offset, by default the DS segment register will be used... the resulting address will be (DS x 16) + BX
|Let's set our data segment up to point to our code... we can set the AX register to our code segment by specifying @Code, then copy AX into DS
Next let's read AX from offset 000Ch...
(Don't worry about the PUSH and POP statements - we'll cover them soon)
|We've loaded in a word from RAM... note the bytes are reversed - because the system is Little Endian|
|Lets define some test data that we can read back later...|
|We can load a register from an address label... for example the
'SomeData' label in our code segment.
When we want to load from a label in this way, we need to specify the size with an extra specifier:
BYTE PTR tells the assembler we want to load an 8 bit BYTE from the label
WORD PTR tells the assembler we want to load a 16 bit WORD from the label
If we put a segment register name and colon before an address, we will override the segment, which otherwise defaults to DS... for example, let's load from the code segment CS
|We've loaded two words from our test data.|
|We can save data to ram addresses in the same way.|
|The values are saved back to ram|
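Loading from and saving to a label might be sketched as below. The data values are assumptions, and the segment override is written before the brackets here (cs:[...]), where the tutorial writes it inside them ([cs:...]). The JMP keeps the CPU from running the data as if it were code.

```asm
    mov ax,word ptr cs:[SomeData]       ;Load a 16 bit word: AX=1234h
    mov bl,byte ptr cs:[SomeData+2]     ;Load an 8 bit byte: BL=78h (low byte of 5678h)
    mov word ptr cs:[SomeData+2],ax     ;Save AX back - the second word is now 1234h
    jmp SkipData                        ;Jump over the data

SomeData:
    dw 1234h,5678h                      ;Two words of test data (example values)
SkipData:
```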
|Before we use this mode, we need to learn how to calculate the
Segments and offsets within those segments in our code..
We want to get the segment and address of the 'somedata' label. We can get the Segment with the SEG statement, and the offset with the OFFSET statement
|Now we have the registers set up... let's use Register Indirect Addressing
|Let's load some data from ES:BX (Extra Segment ES - offset BX)
We can use special registers DI and SI (Destination Index and Source Index) in the same way.
In the first examples we specified the Extra Segment - if we don't, it will default to DS
|we were able to load AX and BX from our test data...
When we re-read in AX without specifying ES: it read in from the Data Segment DS - and read in 0000
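Putting SEG, OFFSET and Register Indirect addressing together, a sketch might look like this. It assumes a label 'SomeData' holding test data has been defined elsewhere in the program.

```asm
    mov ax,seg SomeData     ;Segment of the label...
    mov es,ax               ;...goes into ES via AX
    mov bx,offset SomeData  ;Offset of the label within that segment

    mov ax,es:[bx]          ;Read a word from ES:BX
    mov si,bx
    mov cx,es:[si]          ;SI works in the same way (also DI)
    mov dx,[bx]             ;No override, so this reads from DS:BX instead
```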
|We looked before at reading in from a label with a numeric offset...
BUT we can do the same with a value in a register!... we can specify a register and an immediate numeric offset!
We can read in from a register with offset... and write back in the same way...
|BX was 000Ch - so when we read in from BX+1 we got address 000Dh
We added 2 to BX making it 000Eh, so when we wrote back to BX+1, we wrote to 000Fh
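The same idea can be sketched using a label rather than a fixed address. This assumes 'SomeData' is defined elsewhere with at least 6 bytes of data, and that DS already points to its segment.

```asm
    mov bx,offset SomeData  ;BX points to the start of the test data
    mov ax,[bx+2]           ;Read the word at BX+2
    add bx,2                ;Move the pointer on by 2
    mov [bx+2],ax           ;Write to (new BX)+2, which is 4 past the original start
```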
|We can define a symbol with an offset position, and use it with a register as a pointer to a bank of settings.
For example let's define 'SecondVariable' at position +2, and a third called 'ThirdVariable' at position +4
We can now read its value with this, or write it back.
We could change the BX pointer to change the bank - for example to use a common player routine for 2 different players
|We read in AX from BX+2 (as Secondvariable=2)
we wrote back to BX+4 (as ThirdVariable=4) the value 6655h
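A sketch of the 'bank of settings' idea is shown below. 'Player1Data' and 'Player2Data' are hypothetical names invented for this example, and DS is assumed to point at the segment holding them.

```asm
SecondVariable equ 2            ;Symbols for positions within a bank
ThirdVariable  equ 4

    mov bx,offset Player1Data   ;Point BX at the first bank
    mov ax,[bx+SecondVariable]  ;Read from BX+2: AX=1111h
    mov word ptr [bx+ThirdVariable],6655h   ;Write 6655h to BX+4

    mov bx,offset Player2Data   ;Same code, different bank
    mov ax,[bx+SecondVariable]  ;AX=2222h
    jmp BankDone                ;Jump over the data

Player1Data dw 0,1111h,0        ;Hypothetical banks of 3 words each
Player2Data dw 0,2222h,0
BankDone:
```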
|We've used positive base
pointers here, but we can use negative ones too!
Unlike the Z80, On the 8086 the displacement can be an 8 or 16 bit number!
|We can use registers DI/SI as an offset with BX/BP - where the
two registers are added together...
There's a limit to the registers we can use for this... we can use BX+DI, BX+SI, BP+DI, BP+SI
|We've read in from the addresses in the test data resulting from the BP+DI calculation|
These kinds of commands only work with BX+DI, BX+SI, BP+DI and BP+SI
What's worse, the assembler won't warn you if you're being stupid and trying to use them - the code won't work... so be careful, or you'll waste hours trying to work out why your code isn't working!
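Here is a rough sketch of base plus index addressing, assuming a 'SomeData' label in the segment DS points to. One thing worth knowing: addresses that use BP default to the stack segment SS rather than DS, so the sketch adds an override.

```asm
    mov bx,offset SomeData
    mov di,2
    mov ax,[bx+di]          ;Reads the word at DS:(BX+DI)

    mov bp,offset SomeData
    mov si,4
    mov ax,ds:[bp+si]       ;BP would default to SS, so we override to DS
```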
|Combining the previous two ideas, We can use a Base Register -
an index register, and an immediate displacement
We can also use BX+DI+n, BX+SI+n, BP+DI+n, BP+SI+n
|The calculation will be applied, and the correct addresses loaded.|
|Different assemblers and source code may use different syntax...
Depending on the syntax, you may see the displacement outside the brackets - the result is the same.
We can also use symbols to make things clearer!
|We looked before at using the statement SEG to get the segment, and the offset with the OFFSET statement, but there is a shortcut!|
|We have two special commands for loading 'Far pointers'... these load a full segment and offset pair that is stored in memory at a label,
setting both the DS/ES segment register and AX/BX/etc with a single command.
|LDS has loaded in DS and BX,
LES has loaded in ES and AX
We can use AX, BX, CX, DX, SI, DI as the parameter, but only DS and ES as the segments - there's no LSS or LCS on the 8086!
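A sketch of how a far pointer could be stored and loaded. 'FarPtr' is a name made up for this example, and 'SomeData' is assumed to be defined elsewhere. The pointer is stored offset first, then segment.

```asm
    push ds                         ;LDS changes DS, so keep a copy
    lds bx,dword ptr cs:[FarPtr]    ;BX=offset, DS=segment in one command
    mov ax,[bx]                     ;Read the first word of SomeData via DS:BX
    pop ds                          ;Put DS back

    les di,dword ptr cs:[FarPtr]    ;Same again, but into ES and DI
    jmp FarDone                     ;Jump over the data

FarPtr:
    dw offset SomeData              ;Low word: the offset
    dw seg SomeData                 ;High word: the segment
FarDone:
```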
We've covered a lot of different addressing modes here very quickly... you may be confused which registers can be used in which modes - and when you can use them...
The best thing is to give it a go! Try changing the examples and see what works, and what doesn't!
3 - Loops, Jumps and Conditions
We've looked at simple maths, and Addressing... now we need to learn how to cause our program to make decisions...
Let's learn how to jump around our code - and add conditions so the code can act in different ways depending on the values of the registers
|Let's use LOOP to effect a repeat...
This command uses CX as a loop counter... CX will be reduced by 1, and a jump to the label specified will occur if CX is not zero...
We'll use a Monitor function to show the values of registers and flags.
|The loop will run until CX=0|
|What was that JCXZ for?... well, it will jump out of the loop if CX is zero
Without this, if CX=0 when the loop starts, the first LOOP command would decrease CX to -1 (65535)... this would cause the loop to occur 65536 times!
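A minimal sketch of a LOOP guarded by JCXZ. Instead of calling a monitor routine, this version just adds CX to AX on each pass so the effect can be followed in the comments. The label names are invented for the example.

```asm
    mov ax,0
    mov cx,5            ;Loop counter
    jcxz LoopDone       ;Skip the loop entirely if CX is already zero
LoopAgain:
    add ax,cx           ;Adds 5, then 4, 3, 2, 1
    loop LoopAgain      ;CX=CX-1, jump back if CX is not zero
LoopDone:
                        ;Here CX=0 and AX=000Fh (5+4+3+2+1=15)
```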
|There are some alternative loop commands... LOOPZ and LOOPNZ
Like LOOP these use CX as a loop counter and will repeat until CX=0, but there's another thing that can end the loop...
LOOPZ will also end if the Zero flag is not set
LOOPNZ will also end if the Zero flag is set
It's important to understand that LOOP itself does not alter the Zero flag (Z) - so these commands allow the loop to end depending on the loop count CX reaching 0 OR the status of the Zero flag (Z)...
|The loop ended before CX reached Zero!
Why? because when AX reached zero, the Z flag was set, and this caused the loop to end due to the loopNZ command.
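The early exit can be sketched like this, with starting values chosen for illustration. DEC sets the Zero flag when AX hits zero, and LOOPNZ then stops even though CX has not run out.

```asm
    mov cx,10           ;Up to 10 passes
    mov ax,3
Again:
    dec ax              ;Sets Z when AX reaches zero
    loopnz Again        ;CX=CX-1, repeat only if CX<>0 AND Z is not set
                        ;Pass 1: AX=2 CX=9 / Pass 2: AX=1 CX=8
                        ;Pass 3: AX=0, Z set - the loop ends with CX=7
```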
|Flag||Name||Meaning||Command to set Flag||Command to clear Flag|
|T||Trap||1=Cause INT1 after every instruction|| || |
|D||Direction||Used for 'string' functions||STD||CLD|
|I||Interrupt enable||Allow maskable hardware interrupts||STI||CLI|
|O||Overflow||1=Overflow (sign changed)|| || |
|S||Sign||1=Negative 0=positive (top bit of reg)|| || |
|Z||Zero||1=Result was zero|| || |
|A||Aux carry||Used as Carry in BCD|| || |
|P||Parity||1=Even no of 1 bits (8 bit)|| || |
|C||Carry||1=Carry/Borrow caused by last ADD/SUB/ROT instruction||STC (see CMC)||CLC|
Don't worry about all the flags at this stage - ones like Trap, Direction and Interrupt are not relevant to conditions
You won't need them in this example, and depending on the commands you use, you may never need them!
|JA / JNBE short-label||(Unsigned) above/not below nor equal||(CF OR ZF)=0|
|JBE / JNA short-label||(Unsigned) below or equal/ not above||(CF OR ZF)=1|
|JC / JB / JNAE short-label||(Unsigned) carry/below / not above nor equal||CF=1|
|JCXZ short-label||Jump If CX Zero (see loop)|
|JE / JZ short-label||equal/zero||ZF=1|
|JG / JNLE short-label||(Signed) greater/ not less nor equal||((SF XOR OF) OR ZF)=0|
|JGE / JNL short-label||(Signed) greater or equal/not less||(SF XOR OF)=0|
|JLE / JNG short-label||(Signed) less or equal/ not greater||((SF XOR OF) OR ZF)=1|
|JL / JNGE short-label||(Signed) less/not greater nor equal||(SF XOR OF)=1|
|JMP target||Jump to label (byte or word target)|
|JNC / JAE / JNB short-label||(Unsigned) no carry/above or equal/ not below||CF=0 (AX>=cmp)|
|JNE / JNZ short-label||not equal/ not zero||ZF=0|
|JNO short-label||not overflow||OF=0|
|JNP / JPO short-label||not parity / parity odd||PF=0|
|JNS short-label||not sign||SF=0|
|JP / JPE short-label||parity/ parity even (bits 0-7 only)||PF=1|
|The Zero flag is set when the result of a command is zero...
This can happen for two reasons... if a subtraction/DEC command results in zero, or if the two compared values are equal
This is because CMP sets the flags the same as if a subtraction occurred, but only the flags are changed
if a register overflows back to zero, Z will also be set - so adding 1 to 0FFFFh will also result in a Z flag
|A Z will be shown to the screen by the jump if the Z flag was set|
|The compare commands we use will be different depending on if
our numbers are signed (-32768 to 32767) or unsigned (0-65535)
We have conditions for > < >= and <=
|You'll need to try different values to see the result|
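To show why signed and unsigned jumps differ, this sketch compares 0FFFFh with 1 twice. Unsigned it is 65535 (above 1), signed it is -1 (less than 1). The result is collected in BL purely so there is something to inspect, and the label names are invented for the example.

```asm
    mov bl,0
    mov ax,0FFFFh       ;65535 unsigned, or -1 signed

    cmp ax,1
    jbe NotAbove        ;Unsigned test: 65535 is above 1, so no jump
    or bl,1             ;Bit 0 set = 'unsigned above'
NotAbove:
    cmp ax,1            ;Compare again (OR changed the flags)
    jle NotGreater      ;Signed test: -1 is less than 1, so we jump
    or bl,2             ;Bit 1 set = 'signed greater' (skipped)
NotGreater:
                        ;BL=1 - above when unsigned, not greater when signed
```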
|Negative number||-1||-2||-3||-5||-10||-20||-50||-254||-255|
|Equivalent Byte value||255||254||253||251||246||236||206||2||1|
|Equivalent Hex Byte Value||FF||FE||FD||FB||F6||EC||CE||2||1|
|Because whether a register's value is positive or negative is not recorded in the register itself, we have to use different commands when working with signed numbers than unsigned ones...
We can specify -15 in our ASM code - the assembler will work out the equivalent byte value.
|We can also test the sign bit - if the top bit of the last
mathematical operation is 0 then the value is positive - if it's 1
then the value is negative.
the S bit will be set if the result is <0
|The Parity bit is a bit odd... The 1 bits in the low byte of the result are counted... if there's an odd number P=0 (odd)... if there's an even number P=1 (even)|
|The 'Overflow' flag is used
to check if a signed register has become invalid...
an Unsigned register can store +32768 just fine - but in a signed register, this will effectively be -32768
This will cause a problem! so we have an overflow flag to check if this has happened.
|We've looked at a
large variety of commands here - but you REALLY need to try them
yourself before they'll make sense.
The example code above had many alternate tests REMmed out - try unremming them!
4 - The Stack
We've looked at maths and logic, but there's a very important thing about the 8086 we've not covered yet... the Stack!
The stack is fundamental to most CPUs (not just x86) and is the way we temporarily store data that we can't keep in our registers... let's learn more.
|'Stacks' in assembly are like an
'In tray' for temporary storage...
Imagine we have an In-Tray... we can put items in it, but only ever take the top item off... we can store lots of paper - but have to take it off in the reverse order to how we put it on!... this is what a stack does!
If we want to temporarily store a register - we can put its value on the top of the stack... but if we store several, we have to take them off in the reverse order...
The stack lives in memory, and the stack pointer goes DOWN with each push onto the stack... so if it starts at 2000h and we push 2 bytes, it will point to 1FFEh
On the 8086 we always push bytes onto the stack in pairs
This example has a lot of 'tricks' we won't cover today that allow the stack to be shown to the screen - normally a call would use the stack
But we're using a fake stack so that only the push and pop commands affect the shown stack - this means the shown stack only reflects the example commands, not the stack and register dump routines
|Let's look at an example of the stack!
We'll load AX with 1234h - and push it onto the stack
We'll then load AX with 5678h
Finally we'll pop AX off the stack - we'll show the state of AX at each stage.
|We loaded 1234h into AX
We then pushed it onto the stack - the bytes appear reversed because the 8086 is little endian
We then loaded AX with 5678h
Finally we popped AX off the stack - getting back the value 1234h
|We can push multiple items onto the stack, and restore them back
in the same way.
The important thing is we take them off in the opposite order to how we pushed them onto the stack
In this case we push AX then BX - and pop off BX then AX
|We can see each item pushed on the stack was restored
Note we pushed AX (1234h) onto the stack first, then BX (ABCDh) - but BX comes before AX in memory... this is because the stack pointer goes DOWN with each push
|We can reverse the order we pop them off the stack...
In this case we reversed the pop order of BX and AX
|AX and BX were reversed after the POPs
You don't want to do this by accident - but there will be times you will want to do it on purpose!
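The push and pop examples above can be sketched as follows, leaving out the monitor calls. The values match those mentioned in the text, and the comments follow the registers.

```asm
    mov ax,1234h
    push ax             ;Store 1234h on the stack (SP goes down by 2)
    mov ax,5678h        ;AX=5678h
    pop ax              ;AX=1234h again (SP goes back up by 2)

    mov bx,0ABCDh
    push ax             ;Push AX then BX...
    push bx
    pop bx              ;...and pop in the opposite order: BX first
    pop ax              ;AX=1234h BX=ABCDh - both restored

    push ax
    push bx
    pop ax              ;Popping in the SAME order swaps them:
    pop bx              ;AX=ABCDh BX=1234h
```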
|Because of the way the
stack works, we're effectively nesting the pushes onto the
stack... lets make a clear example to really show this...
First we'll push 1234h, then 5678h then 9ABCh
we'll then pop them all off the stack
|The three values are pushed onto the stack and popped back
Again, because the stack moves backwards, the values on the stack are reversed
|We can push all the registers in this way, but we will sometimes
need to push the flags...
We have special commands PUSHF and POPF for this purpose - they work in the same way as pushing and popping any other register.
If we want control over the main flags, we can transfer AH to the flags with SAHF, or transfer the flags to AH with LAHF
This only gives us access to the main flags in the low byte: SZ-A-P-C
The flags are 16 bit, and both bytes are pushed onto the stack - in fact, this is a good way to set flags like the Trap flag (Flags:----ODIT), which cannot be directly set... we can push the flags onto the stack, and pop them back into AX and vice versa
In the example we push all the flags with PUSHF and pop them back into AX
|We were able to Push and Pop the flags
onto the stack..
We used SAHF to store AH to the flags... setting them all to one - and also to zero
We were also able to push the flags onto the stack and transfer them all into AX
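A sketch of the flag transfers described above. The value 0FFh is an example that sets all five of the main flags at once.

```asm
    mov ah,0FFh
    sahf                ;AH -> flags: S, Z, A, P and C are all set
    lahf                ;Flags -> AH: AH=0D7h (11010111b - the unused bits
                        ;read back as fixed values, not as we wrote them)

    pushf               ;Push all 16 bits of the flags...
    pop ax              ;...and pop them into AX to look at them
    push ax             ;We could alter AX here, then push it...
    popf                ;...and pop it back into the flags
```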
|It's not just our code that uses the stack... in fact CALL
statements use the stack too... every time we run a call, we're
effectively pushing the RETurn address onto the stack...
When we use a RET statement, we're effectively popping the program counter (IP) off the stack...
Lets try it!... We're going to call a subroutine...
That subroutine will push BX onto the stack,
It will then run another subroutine twice, before restoring BX
Finally it will return... lets see the results!
|Due to the way the test code works, the return addresses aren't
quite the same on the stack as the Monitor dumps
The return address of the first TestNestedCall1 is pushed onto the stack,
Next BX will be pushed onto the stack...
TestNestedCall2 will be executed, and its return address will be pushed... the second call of TestNestedCall2 will cause its return address to overwrite the first (as that one was popped off by the previous RETurn)
the end result? BX is unaffected by TestNestedCall2 due to the POP of BX in TestNestedCall1
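The nested calls could be sketched like this. The body of TestNestedCall2 is a guess for illustration (it simply overwrites BX), and the 'NestedDone' label is invented so the main code can jump over the subroutines.

```asm
    mov bx,1234h
    call TestNestedCall1    ;Pushes the return address, runs the subroutine
                            ;Back here BX is still 1234h
    jmp NestedDone          ;Don't fall through into the subroutines!

TestNestedCall1:
    push bx                 ;Back up BX on the stack
    call TestNestedCall2    ;Return address pushed, then popped by its RET
    call TestNestedCall2    ;Second call reuses the same stack space
    pop bx                  ;Restore BX - must happen BEFORE the RET
    ret                     ;Pops the return address into IP

TestNestedCall2:
    mov bx,0FFFFh           ;Hypothetical body: changes BX
    ret
NestedDone:
```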
You have to be careful to remove everything your subroutine put on the stack before the return... otherwise the RET command will mistake one of your pushed values for the return address and run something crazy!
DON'T SAY I DIDN'T WARN YOU!... but if you're super clever you can take advantage of things like this to do clever stuff!
|The RET command on the 8086 has a special trick... after the return it can pop a number of bytes off the stack...
The reason for this is it's common to push parameters onto the stack before calling a function - the function will use those parameters, and this is a way of removing them.
In this example we'll push 4 bytes (2 words) onto the stack, and the function will load them to CX and DX, then return...
The RET statement will remove the 4 bytes
|CX and DX will receive the values pushed on to the stack|
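One way to sketch parameters passed on the stack. The subroutine name 'GetParams', the example values and the use of BP to reach the parameters are all assumptions. With a near call, [BP+0] holds the return address, so the parameters start at [BP+2].

```asm
    mov ax,1111h
    push ax             ;First parameter (the 8086 can't push an immediate)
    mov ax,2222h
    push ax             ;Second parameter
    call GetParams      ;On return: CX=2222h DX=1111h, parameters removed
    jmp ParamsDone      ;Jump over the subroutine

GetParams:
    mov bp,sp           ;BP points to the return address (BP is overwritten)
    mov cx,[bp+2]       ;Last value pushed (2222h)
    mov dx,[bp+4]       ;First value pushed (1111h)
    ret 4               ;Return, then remove 4 bytes of parameters
ParamsDone:
```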
|The monitor tools for this example used a lot of Macros - we're not going to cover those here... but let's look at a simple macro
Macros contain multiple commands... then we can use the macro name in our code like a command!
A macro is different to a call... the Assembler REPLACES any reference to the macro with its contents.
This makes the code faster as there's no call, but bigger, as the macro contents are duplicated each time it's used.
Macros allow us to do things a call cannot (in my tutorial code I swapped out the stack pointer so the call to the monitor would not affect the test stack)
Macros can use parameters - these will be swapped out by the assembler - they're great for defining blocks of code we may want to use many times
|Here's the result.|
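A simple macro with parameters might look like this in MASM-style syntax. 'PushPair' and 'PopPair' are names made up for this sketch, not macros from the tutorial's own source.

```asm
PushPair macro reg1,reg2    ;Define a macro taking two parameters
    push reg1
    push reg2
endm

PopPair macro reg1,reg2     ;Pops in the opposite order
    pop reg2
    pop reg1
endm

    PushPair ax,bx          ;The assembler replaces this with: push ax / push bx
    PopPair ax,bx           ;...and this with: pop bx / pop ax
```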
5 - Logical Operations,Bit Ops and Flags
We looked at basic maths before, but there are some more slightly complex commands that are fundamental to assembly.
We'll also look at some more Flag functions... Let's learn about them.
|Command||AND||OR||XOR|
|Meaning||Return 1 where both bits are 1||Return 1 when either bit is 1||Invert the bits where the mask bits are 1|
|Effect||Keep bits that are 1 in the mask||Set bits that are 1 in the mask||Invert bits that are 1 in the mask|
|We'll try each of the commands on AX with some test values, showing the results at each stage|
|Our starting value is 1234h
AND removed the bits that were zero in its parameter (FF10h), changing 1234h into 1210h
OR set some of the bits (those that were 1 in its parameter, 7081h), changing 1210h into 7291h
XOR flips the bits that were 1 in its parameter (0FF3h), changing 7291h into 7D62h
NOT flips all the bits, changing 7D62h into 829Dh
NEG flips all the bits and adds one, changing 829Dh into 7D63h (it was 7D62h before the NOT)
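The sequence described above, written out as a sketch without the monitor calls. The values are the ones quoted in the results.

```asm
    mov ax,1234h
    and ax,0FF10h       ;Keep only bits that are 1 in the mask   AX=1210h
    or  ax,7081h        ;Set bits that are 1 in the mask         AX=7291h
    xor ax,0FF3h        ;Flip bits that are 1 in the mask        AX=7D62h
    not ax              ;Flip every bit                          AX=829Dh
    neg ax              ;Flip every bit and add 1                AX=7D63h
```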
AND doesn't just alter the register - it also sets the flags accordingly - so the Z flag will be set if the result is zero
If you want to set the flags in this way, but leave the register unchanged, use TEST - it has the same effect on the flags as AND, but leaves the registers unchanged!
|There will be many times in assembly when we may want to shift
bits around within registers...
We may wish to process a byte one bit at a time, move a top nibble to a low nibble, and generally shift data around for the format we need.
Bit shifting also allows simple multiplication: shifting to the left effectively doubles a value, shifting to the right effectively halves it.
We have a wide variety of bit shifting commands... on the 8086 we can shift by 1 bit, or by the number of bits held in CL. On the 80186+ we can also give the number of bits as an immediate value!
ROR (ROtate Right) shifts the bits around a register to the right - no data is lost, any bits that leave the right hand side come back on the left.
RCR (Rotate through Carry Right) is similar, but the 'Carry' flag acts as an extra bit... when a bit is pushed off the right it goes into the carry, and the previous carry value becomes the leftmost bit... This is handy for processing data one bit at a time.
SHR (SHift logical Right) - this shifts bits to the right, new top bits are zero - any bits pushed off the right are lost.
SAR (Shift Arithmetic Right) - this also shifts bits to the right, again any bits pushed off the right are lost, but the new top bit is the same as the old top bit - this means it can be used with negative numbers, and the sign won't change, so we can use it to halve negative values!
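To compare the four right shifts, this sketch applies each one to the same example value, 8003h (1000000000000011b), reloading AX each time.

```asm
    mov ax,8003h
    ror ax,1            ;Bit 0 wraps round to the top        AX=C001h
    mov ax,8003h
    clc                 ;Start with Carry=0
    rcr ax,1            ;Old carry (0) comes in at the top   AX=4001h Carry=1
    mov ax,8003h
    shr ax,1            ;Zero comes in at the top            AX=4001h
    mov ax,8003h
    sar ax,1            ;Top bit is copied, sign is kept     AX=C001h

    mov ax,8003h
    mov cl,4
    shr ax,cl           ;8086 multi-bit shift uses CL        AX=0800h
```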
|The result of each command can be seen here|
|Each command has an equivalent Left shifting function
ROL (ROtate Left)
RCL (Rotate through Carry Left)
SHL (SHift logical Left)
SAL (Shift Arithmetic Left) - with negative example
|The results of each command can be seen here.|
|While SAR and SHR are
different... SHL and SAL do the same thing! it doesn't matter
which you use - so don't worry!
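Here's the same kind of Python sketch for the left-hand commands, again one bit at a time on a 16 bit value. The names are hypothetical. SAL is simply the same function as SHL, because a left shift always fills the bottom bit with zero whether the number is signed or not.

```python
def shl16(value):
    """SHL: returns (result, carry). The bit pushed off the left goes to the carry."""
    return (value << 1) & 0xFFFF, (value >> 15) & 1


sal16 = shl16  # SAL and SHL do exactly the same thing


def rol16(value):
    """ROL: the bit leaving the left comes back in on the right."""
    return ((value << 1) | (value >> 15)) & 0xFFFF


def rcl16(value, carry):
    """RCL: old carry becomes the bottom bit, the old top bit becomes the new carry."""
    return ((value << 1) | carry) & 0xFFFF, (value >> 15) & 1


print(shl16(0x0003))            # (6, 0): 3 doubled
print(sal16(0xFFFE))            # (65532, 1): FFFEh (-2) doubled to FFFCh (-4)
print(f"{rol16(0x8001):04X}")   # 0003
print(rcl16(0x8001, 0))         # (2, 1)
```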
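|We looked at various tricks with flags in previous lessons, but
we didn't cover the range of commands to set and clear flags
arbitrarily... we have several options: STC, CLC and CMC set, clear and flip the Carry flag, STD and CLD set and clear the Direction flag, and STI and CLI set and clear the Interrupt enable flag.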
|Here are the results!|
Wondering how you're supposed to deal with the other flags? Well, there are
no commands to directly set or clear them!
You'll have to do something else - the best thing to do is push the flags onto the stack with PUSHF, alter them on the stack (or pop them into AX, change them there and push them again) and then pop them back with POPF. SAHF probably won't help, as that only affects the bottom half of the flags register
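As a sketch of what 'altering them' means, here's a small Python model of the 16 bit flags word. The bit positions are the standard 8086 FLAGS layout (they aren't listed in this lesson, so treat them as an assumption to check against your own reference), and the helper names are invented.

```python
# Standard 8086 flag bit positions (assumed, see note above)
FLAG_BITS = {"C": 0, "P": 2, "A": 4, "Z": 6, "S": 7,
             "T": 8, "I": 9, "D": 10, "O": 11}

SAHF_MASK = 0b11010101  # SAHF only writes S Z A P C, all in the low byte


def set_flag(flags, name):
    """Like: PUSHF / POP AX / OR AX,mask / PUSH AX / POPF"""
    return flags | (1 << FLAG_BITS[name])


def clear_flag(flags, name):
    """Like: PUSHF / POP AX / AND AX,NOT mask / PUSH AX / POPF"""
    return flags & ~(1 << FLAG_BITS[name]) & 0xFFFF


def sahf(flags, ah):
    """SAHF: copy S Z A P C from AH, leaving the top half of the flags alone."""
    return (flags & ~SAHF_MASK & 0xFFFF) | (ah & SAHF_MASK)


print(f"{set_flag(0x0000, 'O'):04X}")    # 0800: the Overflow flag is bit 11
print(f"{clear_flag(0x0801, 'O'):04X}")  # 0001
print(f"{sahf(0x0000, 0xFF):04X}")       # 00D5: O, D, I and T are out of reach
```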
6 - Carry for 32 bit, Multiplication, Division, Ports and Interrupts
We've covered a lot of commands, but there are a few more complex ones we need to go over to do the 8086 justice.
Let's go over them now!
|Although our registers are only 16 bit, we can use a pair of
them together to make a 32 bit pair... one register will be the
Low part of the 32 bit pair, the other will be the High part
When we do addition or subtraction, the Carry flag can be used as a Carry or Borrow to add or remove from the High register... (the Carry flag functions as a borrow for subtraction)
We first do addition or subtraction from the Low Register (BX in this example) using the regular ADD or SUB... then we perform the addition or subtraction with the carry on the high part using ADC (add with Carry) or SBB (Subtract with Borrow)
If we don't want to add or subtract anything from the High register, we would still do ADC ax,0 or SBB ax,0 - because we still need to apply any carry or borrow to the high register
|when the Carry flag is set.. the ADC or SBB adds or subtracts an extra 1 from the top register|
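The two-step process can be modelled in a few lines of Python. This is only a sketch of the arithmetic, with AX as the High word and BX as the Low word as in the text above; the function names are made up.

```python
def add32(high, low, add_high, add_low):
    """ADD BX,add_low then ADC AX,add_high on the pair AX.BX"""
    total = low + add_low
    carry = total >> 16                          # 1 if the low word overflowed
    low = total & 0xFFFF
    high = (high + add_high + carry) & 0xFFFF    # ADC adds the carry as well
    return high, low


def sub32(high, low, sub_high, sub_low):
    """SUB BX,sub_low then SBB AX,sub_high on the pair AX.BX"""
    diff = low - sub_low
    borrow = 1 if diff < 0 else 0                # the carry flag acts as a borrow
    low = diff & 0xFFFF
    high = (high - sub_high - borrow) & 0xFFFF   # SBB removes the borrow as well
    return high, low


# 0001FFFFh + 1 = 00020000h: the carry ripples into the high word
print(tuple(f"{w:04X}" for w in add32(0x0001, 0xFFFF, 0, 1)))
# 00020000h - 1 = 0001FFFFh: the borrow comes out of the high word
print(tuple(f"{w:04X}" for w in sub32(0x0002, 0x0000, 0, 1)))
```

Both examples pass 0 for the high part, which is the ADC ax,0 / SBB ax,0 case: nothing is added to the high word except the carry or borrow.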
|Unlike many 8 bit processors, the 8086 has multiplication
commands... they can work with bytes or words, but the result is
always twice the size... there are two commands, IMUL works with
signed numbers, MUL works with unsigned numbers...
if you use an 8 bit parameter (eg BL) then the command:
IMUL BL will perform AL*BL - returning the result in AX (MUL BL uses the same registers)
if you use a 16 bit parameter (eg BX) then the command:
IMUL BX will perform AX*BX - returning the result in DX.AX - a 32 bit pair where DX is the High word, and AX is the low word (MUL BX uses the same registers)
|The results of each command are shown here.|
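The registers are the same for MUL and IMUL, but the answers are not! Here's a minimal Python sketch of the difference, assuming the usual two's complement reading of signed bytes. The helper names are invented for the example.

```python
def to_signed8(value):
    """Read a byte as a signed number: 80h-FFh mean -128 to -1."""
    return value - 0x100 if value & 0x80 else value


def mul8(al, bl):
    """MUL BL: unsigned AL*BL, 16 bit result in AX."""
    return (al * bl) & 0xFFFF


def imul8(al, bl):
    """IMUL BL: signed AL*BL, 16 bit result in AX."""
    return (to_signed8(al) * to_signed8(bl)) & 0xFFFF


def mul16(ax, bx):
    """MUL BX: unsigned AX*BX, 32 bit result as (DX, AX)."""
    product = ax * bx
    return (product >> 16) & 0xFFFF, product & 0xFFFF


print(f"{mul8(0xFF, 2):04X}")    # 01FE: 255 * 2 = 510
print(f"{imul8(0xFF, 2):04X}")   # FFFE: -1 * 2 = -2
print(mul16(0x1000, 0x0010))     # (1, 0): the result spills into DX
```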
|When performing Division there are a couple of 'gotchas' we have
to be ready for!
The first is the classic 'Division by zero' (if it takes 1 person 10 minutes to eat a cake, and 2 people 5 minutes - how long will the cake last if 0 people eat it?)... Division by zero causes Interrupt 0 and will lock the machine.
The other is 'Overflow'... if we Divide 1000 by 1, and the result is to be stored in a byte, it won't fit! This is called overflow, and on the 8086 it causes the same Interrupt 0 as division by zero (Interrupt 4 is the separate Overflow interrupt, which the INTO command causes when the O flag is set)
We should range check our parameters first!
Just like before IDIV works with signed parameters, and DIV works with unsigned ones
if you use an 8 bit parameter (eg BL) then the command:
IDIV BL will perform AX / BL - returning the integer result in AL and the remainder in AH (DIV BL uses the same registers)
if you use a 16 bit parameter (eg BX) then the command:
IDIV BX will perform DX.AX / BX (where DX.AX is a 32 bit pair) - returning the integer result in AX and the remainder in DX (DIV BX would be the same)
|The results of each command are shown here.|
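Here's a small Python sketch of the unsigned 8 bit case (DIV BL), including one possible range check. Python exceptions stand in for the interrupt the CPU would raise, and the function names are hypothetical. The check relies on a handy fact for unsigned division: the result fits in AL exactly when AH is smaller than BL.

```python
def div8(ax, bl):
    """DIV BL: AX / BL, returning (AL, AH) = (quotient, remainder)."""
    if bl == 0:
        raise ZeroDivisionError("division by zero")
    quotient, remainder = divmod(ax, bl)
    if quotient > 0xFF:
        raise OverflowError("quotient does not fit in AL")
    return quotient, remainder


def div8_is_safe(ax, bl):
    """Range check before DIV BL: safe only if AH < BL (also rejects BL = 0)."""
    return (ax >> 8) < bl


print(div8(1000, 7))           # (142, 6): AL = 142, AH = 6
print(div8_is_safe(1000, 7))   # True
print(div8_is_safe(1000, 1))   # False: 1000 won't fit in a byte
print(div8_is_safe(5, 0))      # False: division by zero
```

On the real chip the same check is just a compare of AH against BL before the DIV.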
|'PORTS' are the connections from the main CPU to peripherals -
this is how we transfer data to and from these devices.
This example sends data OUT to port 42h (the speaker)... and reads it IN from port 61h (so we can enable the speaker bits, but leave the others alone)
We'll look more at the speaker example in a later lesson...
Another command we show here is NOP... this command does literally nothing, here we use it as a very crude delay... but it can also be used in self modifying code (code that alters its own code)
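The 'enable some bits but leave the others alone' trick is a read, an OR, and a write. Here's a Python sketch of just that logic, with a dictionary standing in for the hardware ports. The starting value of port 61h is made up, and treating bits 0 and 1 as the speaker bits is an assumption based on the standard IBM PC layout, so check your machine's documentation.

```python
# Fake port space: the value 'currently' in port 61h is invented for the demo
ports = {0x61: 0b10110000}

SPEAKER_BITS = 0b00000011  # assumed: bits 0 and 1 of port 61h on an IBM PC


def port_in(port):
    """Stand-in for IN AL,port"""
    return ports.get(port, 0)


def port_out(port, value):
    """Stand-in for OUT port,AL"""
    ports[port] = value & 0xFF


# IN AL,61h / OR AL,3 / OUT 61h,AL : turn the speaker bits on
port_out(0x61, port_in(0x61) | SPEAKER_BITS)
speaker_on = ports[0x61]

# IN AL,61h / AND AL,0FCh / OUT 61h,AL : turn them off again
port_out(0x61, port_in(0x61) & ~SPEAKER_BITS)
speaker_off = ports[0x61]

print(f"{speaker_on:08b} {speaker_off:08b}")
```

The top six bits are identical before and after, which is the whole point of reading the port first.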
|Interrupts are tasks which override our program and run in its place until they finish.
Hardware interrupts are where a device is taking control.. software interrupts occur for different circumstances (Like Division by zero), and we can even cause them ourselves with an INT command... (like a RST on Z80 / SWI on ARM or TRAP on 68000)
INTerrupts are used by DOS - and we can use them as OS calls to start DOS functions such as printing a string and returning to the OS
|Interrupts call their handlers through a table of addresses starting at 0000:0000... each interrupt uses
2 words - the first is the address (offset) of the interrupt handler,
the second is the code segment of the handler...
We can program a custom interrupt handler for INT4 (we need to use IRET to end the interrupt handler)
Another interesting one is INT1, the 'Debugging Step Trap' - if we turn on the Trap flag this interrupt will occur after every command - it's intended for trace debugging.
|Our INT4 ran twice... and the Step trap shows the changes of AX while the Trap flag was on|
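To find where a handler's address lives, multiply the interrupt number by 4. Here's a minimal Python sketch using a bytearray as the first 1K of memory. It assumes the usual 8086 layout (offset word first, then segment word, both little endian); the handler address 1234:5678 is invented for the demo.

```python
def vector_address(number):
    """Each interrupt uses 2 words (4 bytes) starting from 0000:0000."""
    return number * 4


def install_vector(memory, number, segment, offset):
    """Write offset then segment, low byte first."""
    base = vector_address(number)
    memory[base:base + 4] = bytes([offset & 0xFF, offset >> 8,
                                   segment & 0xFF, segment >> 8])


def read_vector(memory, number):
    """Return (segment, offset) of the handler for an interrupt."""
    base = vector_address(number)
    offset = memory[base] | (memory[base + 1] << 8)
    segment = memory[base + 2] | (memory[base + 3] << 8)
    return segment, offset


memory = bytearray(1024)                      # 256 vectors * 4 bytes
install_vector(memory, 4, 0x1234, 0x5678)     # hypothetical INT4 handler
print(f"{vector_address(4):04X}")             # 0010
print(tuple(f"{w:04X}" for w in read_vector(memory, 4)))
```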
|What ports and
interrupt numbers do is a mystery - it all depends on the
machine setup - you'll need to check the documentation of the
machine to understand using the OS Interrupt numbers, and what
the ports do with the attached hardware.
7 - Strings and stuff!
We've covered lots of commands now, but we've been overlooking some of the most powerful... remember the weird SI and DI registers that we saw at the start, that don't quite work like the general ones?
Well, these are for something called 'Strings' - nothing to do with text (although they could be), these commands perform fast sequential operations!... Let's put them to work!
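|Command||Source||Destination||Purpose|
|CMPSB / CMPSW||DS:SI||ES:DI||Compare bytes or words between source and destination (Use REPZ / REPNZ)|
|LODSB / LODSW||DS:SI||AL / AX||Load a byte or word from the source|
|MOVSB / MOVSW||DS:SI||ES:DI||Move Data from Source to Destination in Words or Bytes|
|SCASB / SCASW||AL / AX||ES:DI||Scan Destination for AL/AX (Use REPZ / REPNZ)|
|STOSB / STOSW||AL / AX||ES:DI||Set bytes or words to AL/AX|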
On their own, these functions are like the LDI command on the Z80 - they do a job
then stop, so we can do some extra processing
Adding REP is like LDIR - and processes the string repeatedly until CX reaches zero
|MOVSB/W will move a sequence of bytes
or words from DS:SI to ES:DI...
we can use REP to repeat - this will copy CX bytes or CX words.
we can use STD to reverse the direction of the copy
If we want to store a sequence of the same byte or word, we can use STOSB/W instead.
|We copied 3 bytes from the source to the destination in the first example|
|In the second screenshot I enabled STD, reversing the procedure, and used REP STOSW to copy AX to a range of 3 words|
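Here's a rough Python model of REP MOVSB and REP STOSW. To keep it short it uses one flat bytearray and ignores the DS and ES segments (a simplification, not how the real chip addresses memory), and the function names are invented.

```python
def rep_movsb(memory, si, di, cx, direction_flag=False):
    """REP MOVSB: copy CX bytes from [SI] to [DI]. STD makes it run backwards."""
    step = -1 if direction_flag else 1
    while cx > 0:
        memory[di] = memory[si]
        si += step
        di += step
        cx -= 1
    return si, di, cx


def rep_stosw(memory, di, ax, cx, direction_flag=False):
    """REP STOSW: store AX (low byte first) into CX words starting at [DI]."""
    step = -2 if direction_flag else 2
    while cx > 0:
        memory[di] = ax & 0xFF
        memory[di + 1] = ax >> 8
        di += step
        cx -= 1
    return di, cx


memory = bytearray(b"ABCDEF" + bytes(10))
print(rep_movsb(memory, 0, 8, 3))        # (3, 11, 0): SI and DI moved on, CX is 0
print(bytes(memory[8:11]))               # b'ABC'
print(rep_stosw(memory, 12, 0x1234, 2))  # (16, 0)
print(memory[12:16].hex())               # 34123412
```

Without the REP, each command would do a single step and leave SI, DI ready for the next one.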
|CMPSB/W compares two sequences, one at
DS:SI and one at ES:DI
We need to use REPZ or REPNZ to repeat it - and set CX.
REPZ will continue until the strings no longer match (or CX reaches zero)
REPNZ will continue until the strings start to match (or CX reaches zero).
|The routine scanned the string until
they stopped matching (and the Zero flag stopped being set)...
DI points to the following byte after the routine ends... also note CX did not reach zero
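The way SI, DI and CX are left after a mismatch is easy to get wrong, so here's a minimal Python sketch of REPZ CMPSB. It uses a single flat bytearray with no segments, and the two strings are made up for the demo.

```python
def repz_cmpsb(memory, si, di, cx):
    """REPZ CMPSB: compare bytes while they match and CX > 0.

    Returns (SI, DI, CX, zero_flag). The zero flag is None if CX started
    at zero, because then no compare happens at all.
    """
    zero_flag = None
    while cx > 0:
        zero_flag = memory[si] == memory[di]
        si += 1          # the pointers and CX move on even for the
        di += 1          # compare that failed...
        cx -= 1
        if not zero_flag:
            break        # ...and only then does REPZ stop
    return si, di, cx, zero_flag


memory = bytearray(b"HELLO" + b"HELP!")   # source at 0, destination at 5
print(repz_cmpsb(memory, 0, 5, 5))        # (4, 9, 1, False)
```

The strings differ at the 4th byte ('L' against 'P' at address 8), so DI ends on 9, the byte after the mismatch, and CX stops at 1 rather than 0.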
|SCAS will scan a string and compare it
to AX (or AL)
We can use REPZ to scan until the bytes don't match AX/AL
or REPNZ to scan until the bytes do match AX/AL
|We set the Direction flag with STD so we went backwards... scanning words until we found one that didn't match AX - DI then points to the word before|
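Here's the forward, byte-sized version as a Python sketch: REPNZ SCASB hunting for one byte in a string. It uses flat memory with no ES segment, and the test string is invented.

```python
def repnz_scasb(memory, di, al, cx):
    """REPNZ SCASB: scan up to CX bytes from [DI] until one equals AL.

    Returns (DI, CX, found). DI ends up one past the byte that matched.
    """
    found = False
    while cx > 0:
        found = memory[di] == al
        di += 1
        cx -= 1
        if found:
            break
    return di, cx, found


memory = bytearray(b"8086 CPU")
print(repnz_scasb(memory, 0, ord(" "), 8))   # (5, 3, True): the space is at 4
print(repnz_scasb(memory, 0, ord("Z"), 8))   # (8, 0, False): CX ran out
```

Because DI has already moved on when the scan stops, the matching byte is at DI-1 going forwards (or DI+1 when scanning bytes backwards with STD).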
|If we're looking to process bytes or words from a sequence we
can use LODS... like the Z80 LDI command, this can be used as a
quick way of reading in from a range and performing commands on
those read in bytes or words.
In this example we'll read in bytes with LODSB, then words with LODSW from the string.
This command can be used with REP, but I'm not sure what the purpose would be - as SCAS can be used for scanning, and the repeat will not do anything with the read in data.
|We loaded in 3 Bytes and then 3 words|
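As a sketch of what each LODS step does, here's a tiny Python model. It uses flat memory, assumes the direction flag is clear, and the data bytes are made up.

```python
def lodsb(memory, si):
    """LODSB: AL = byte at [SI], then SI moves on by 1. Returns (AL, SI)."""
    return memory[si], si + 1


def lodsw(memory, si):
    """LODSW: AX = word at [SI] (low byte first), then SI moves on by 2."""
    return memory[si] | (memory[si + 1] << 8), si + 2


memory = bytes([0x01, 0x02, 0x03, 0x34, 0x12])
al, si = lodsb(memory, 0)
print(al, si)              # 1 1
ax, si = lodsw(memory, 3)
print(f"{ax:04X}", si)     # 1234 5
```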
|XLAT is a translation command - it uses a lookup table in
[DS:BX] and loads AL with the value at offset AL (AL = [DS:BX+AL])
In this example we use XLAT to convert a number to a pair of nibbles with that number...
Not particularly useful, but it will show what the command does.
|the XLAT command has converted AL according to the look up table|
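XLAT boils down to one array lookup. Here's a Python sketch; the table contents are a guess at a 'pair of nibbles' table (0 gives 00h, 1 gives 11h and so on), not the actual table from the example.

```python
# Hypothetical lookup table: entry n holds the nibble n twice
TABLE = bytes(n * 0x11 for n in range(16))


def xlat(table, al):
    """XLAT: AL = [DS:BX+AL], where 'table' plays the part of DS:BX."""
    return table[al & 0xFF]


print(f"{xlat(TABLE, 0x3):02X}")   # 33
print(f"{xlat(TABLE, 0xF):02X}")   # FF
```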
We've covered all the major 8086 commands!
We should be able to make a decent effort at some programming now... Join in on the platform specific series next, in which we'll learn about the DOS and WonderSwan hardware!
|Mnemonic||Description||Example||Valid Regs||Flags affected|
|AAA||ASCII Adjust for Addition. Treats AL as an unpacked binary coded decimal number||AAA||o s z A p C|
|AAD||ASCII Adjust for Division. AL=AL+(AH*10), AH=0.||AAD||o S Z a P c|
|AAM||ASCII Adjust for Multiplication. We can use the normal MUL command then use AAM||AAM||o S Z a P c|
|AAS||ASCII Adjust for Subtraction. This treats AL as an unpacked binary coded decimal number||AAS||o s z A p C|
|ADC dest,src||Add src and the carry flag to dest.||ADC CX,1000h||O S Z A P C|
|ADD dest,src||Add src to dest.||ADD CX,1000h||O S Z A P C|
|AND dest,src||Logical AND of bits in dest with src.||AND AX,1100h||O S Z a P C|
|CALL dest||Call Subroutine at address dest.||CALL 1000h||- - - - - -|
|CBW||Convert the 8 bit byte in AL into a 16 bit word in AX.||CBW||- - - - - -|
|CLC||Clear the Carry Flag. C flag will be set to Zero.||CLC||- - - - - C|
|CLD||Clear the Direction Flag. D flag will be set to Zero. This is used for 'String functions'.||CLD||D - - - - - -|
|CLI||Clear the Interrupt enable flag. I flag will be set to 0. This disables maskable interrupts.||CLI||I - - - - - -|
|CMC||Complement the Carry flag. If C=1 it will now be 0. If it was 0 it will now be 1.||CMC||- - - - - C|
|CMP dest,src||Compare the Byte or Word dest to src. This sets the flags the same as "SUB dest,src" would.||CMP AL,32||O S Z A P C|
|CMPSB / CMPSW||Compare DS:SI to ES:DI. This command can work in bytes or words. Sets flags like CMP||REPZ CMPSB||O S Z A P C|
|CWD||Convert the 16 bit word in AX into a 32 bit doubleword in DX.AX. This 'Sign Extends' AX||CWD||- - - - - -|
|DAA||Decimal Adjust for Addition. This treats AL as a packed binary coded decimal number.||DAA||o S Z A P C|
|DAS||Decimal Adjust for Subtraction. This treats AL as a packed binary coded decimal number.||DAS||o S Z A P C|
|DEC Dest||Decrease Dest by one.||DEC AL||O S Z A P -|
|DIV src||Divide Unsigned number AX or DX.AX by src. AL=AX/src (8 bit) or AX=DX.AX/src (16 bit)||DIV CX||o s z a p c|
|ESC #,src||This command passes an instruction to a coprocessor (such as the 8087 maths chip) - it's not something you will need.||ESC 1,AH||- - - - - -|
|HLT||Stop the CPU until an interrupt occurs||HLT||- - - - - -|
|IDIV src||Divide Signed number AX or DX.AX by src. AL=AX / src (8 bit) or AX=DX.AX / src (16 bit)||IDIV CX||o s z a p c|
|IMUL src||Multiply Signed number AL or AX by src. AX=AL*src (8 bit) or DX.AX=AX*src (16 bit)||IMUL CX||O s z a p C|
|IN dest,port||Read in an 8 bit byte or 16 bit word into dest (either AX or AL). Use DX for 16 bit port||IN AX,0F0h||- - - - - -|
|INC Dest||Increase Dest by one. This is faster than using ADD with a value of 1.||INC AL||O S Z A P -|
|INT #||Causes software interrupt #. The flags are pushed onto the stack before call||INT 33h||- - - - - -|
|INTO||INTO will cause Interrupt 4 if the Overflow flag (O) is set, otherwise it will have no effect.||INTO||- - - - - -|
|IRET||Restore the flags from the stack and return from an Interrupt.||IRET||O S Z A P C|
|Jcc addr||Jump to 8 bit offset addr if condition cc is true.||JO ErrorHandle||- - - - - -|
|JCXZ addr||Jump to 8 bit offset addr if CX=0.||JCXZ NoLoop||- - - - - -|
|JMP addr||Jump to address addr.||JMP BX||- - - - - -|
|LAHF||Load AH from the Flags. This only transfers the main flags: SZ-A-P-C||LAHF||- - - - - -|
|LDS reg,addr||Load a full 32 bit pointer into DS segment register and register reg.||LDS BX,TestPointer||AX, BX, CX, DX, SI, DI||- - - - - -|
|LEA reg,src||Load the effective address src into reg.||LEA CX,[BX+DI]||AX, BX, CX, DX,SI, DI, BP, SP||- - - - - -|
|LES reg,addr||Load a full 32 bit pointer into ES segment register and register reg.||LES AX,MyLabel||AX,BX,CX,DX,SI,DI||- - - - - -|
|LOCK||Enable the LOCK signal. This is for multiprocessor systems.||LOCK||- - - - - -|
|LODSB / LODSW||Load from DS:SI into AX or AL. This command can work in bytes or words.||LODSB||- - - - - -|
|LOOP addr||Decrease CX and jump to label addr if CX is not zero.||LOOP LoopLabel||- - - - - -|
|LOOPNZ addr||Decrease CX and jump to label addr if CX is not zero and the Zero flag is not set.||LOOPNZ LoopLabel||- - - - - -|
|LOOPZ addr||Decrease CX and jump to label addr if CX is not zero and the Zero flag is set.||LOOPZ LoopLabel||- - - - - -|
|MOV dest,src||Move a value from source src to destination dest||MOV AX,BX||- - - - - -|
|MOVSB / MOVSW||Move a byte or word from DS:SI to ES:DI. (Like Z80 LDI, or LDIR when repeated) This command can be combined with repeat command REP, to repeat CX times.||REP MOVSB||- - - - - -|
|MUL src||Multiply unsigned number AL or AX by src. AX=AL*src (8 bit) or DX.AX=AX*src (16 bit)||MUL CX||O s z a p C|
|NEG dest||Negate dest (Twos Complement of the number).||NEG AL||O S Z A P C|
|NOP||No Operation. This command has no effect on any registers or memory.||NOP||- - - - - -|
|NOT dest||Invert/Flip all the bits of dest.||NOT AX||- - - - - -|
|OR dest,src||Logically ORs the src and dest parameter together.||OR AX,BX||O S Z a P C|
|OUT port,src||Send an 8 bit byte or 16 bit word from src (either AX or AL) to hardware port number port. Use DX for 16 bit port||OUT 100,AL||- - - - - -|
|POP reg||Pop a pair of bytes off the stack into 16 bit register reg.||POP ES||AX, BX, CX, DX, SI, DI, SP, BP, CS, DS, ES, SS||- - - - - -|
|POPF||Pop a pair of bytes off the stack into the 16 bit Flags register.||POPF||O D I T S Z A P C|
|PUSH reg||Push a pair of bytes from 16 bit register reg onto the top of the stack.||PUSH AX||- - - - - -|
|PUSHF||Push the 16 bit Flags register onto the top of the stack as a pair of bytes.||PUSHF||- - - - - -|
|RCL dest,count||Rotate bits in Destination dest to the Left by count bits, with the carry flag acting as an extra bit.||RCL AX,1||O - - - - C|
|RCR dest,count||Rotate bits in Destination dest to the Right by count bits, with the carry flag acting as an extra bit||RCR AX,1||O - - - - C|
|REP stringop||Repeat string operation stringop while CX>0. Decrease CX after each iteration||REP LODSW||- - - - - -|
|REPZ stringop||Repeat string operation stringop while the Z flag is set and CX>0. Decrease CX each time||REPZ CMPSB||- - - - - -|
|REPNZ stringop||Repeat string operation stringop while the Z flag is not set and CX>0. Decrease CX each time||REPNZ CMPSB||- - - - - -|
|RET||Return from a subroutine.||RET||- - - - - -|
|ROL dest,count||Rotate bits in Destination dest to the Left by count bits||ROL AX,1||O - - - - C|
|ROR dest,count||Rotate bits in Destination dest to the Right by count bits||ROR AL,1||O - - - - C|
|SAHF||Store AH to the Flags. This only transfers the main flags: SZ-A-P-C .||SAHF||- S Z A P C|
|SAL dest,count||Shift the bits for Arithmetic in Destination dest to the Left by count bits.||SAL AX,1||O S Z a P C|
|SAR dest,count||Shift the bits for Arithmetic in Destination dest to the Right by count bits.||SAR AX,1||O S Z a P C|
|SBB dest,src||Subtract src and the Borrow (carry flag) from dest.||SBB AL,BL||O S Z A P C|
|SCASB / SCASW||Scan ES:DI and compare to AX or AL. This command can work in bytes or words. (Like CMP)||REPZ SCASB||O S Z A P C|
|SHL dest,count||Shift the bits logically Left in destination dest by count bits.||SHL AX,1||O S Z a P C|
|SHR dest,count||Shift the bits logically Right in destination dest by count bits.||SHR AX,1||O S Z a P C|
|STC||Set the Carry Flag. C flag will be set to 1.||STC||- - - - - C|
|STD||Set the Direction Flag. D flag will be set to 1. This is used for 'String functions'.||STD||D - - - - - -|
|STI||Set the Interrupt enable flag. I flag will be set to 1. This enables maskable interrupts.||STI||I - - - - - -|
|STOSB / STOSW||Store AX or AL to ES:DI. This command can work in bytes or words.||REP STOSB||- - - - - -|
|SUB dest,src||Subtract src from dest.||SUB AX,BX||O S Z A P C|
|TEST dest,src||Test dest, setting the flags in the same way a logical "AND dest,src" would. Dest unchanged||TEST BX,64h||O S Z a P C|
|WAIT||Wait until the busy pin of the CPU is inactive.||WAIT||- - - - - -|
|XCHG reg1,reg2||Exchange the contents of registers reg1 and reg2.||XCHG BH,AL||- - - - - -|
|XLAT||Translate AL using lookup table DS:BX. AL is read from memory address [DS:BX+AL].||XLAT||- - - - - -|