casm is a simple, portable multi-pass assembler

Copyright (C) 2003-2015 Ian Cowburn

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/gpl-3.0.html

Index

CASM - General usage

Z80 - Z80 processor support.

6502 - 6502 processor support.

Gameboy Z80 - The Gameboy Z80 derivative processor.

CASM

Usage

casm file

Assembles file, and places the resulting code in a file called output by default.

Note that switches aren't used by casm. Instead options are controlled by commands in the source file.

If you type the command without an argument, usage, version and license info is displayed.

Memory Layout

There is 64K of RAM that the assembler will generate output into. Extra 64K banks of RAM can be added by using the bank or org directives. Banks are numbered from zero upwards.
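
As a minimal sketch of how banks might be selected, using only the bank and org directives described later in this document (the addresses and data values are arbitrary):

            org     $8000       ; No bank has been selected yet
            db      1

            bank    1           ; Switch to bank 1
            org     $8000
            db      2

            org     $8000,2     ; Set the PC and select bank 2 in one directive
            db      3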

Source Code Layout

The source files follow this basic format:
    ; Comments
    ;
    label1: equ     0xffff

            org     $4000;
            
            db      "Hello, World\n",0

    main    jp      local_label     ; Comments

    .local_label
            inc     a

    another:
            inc     b
            jp      local_label     ; Actually jumps to the following
                                    ; local_label.

    .local_label
            ret

The source files follow the following rules:

Recognised directives

The following directives are recognised, optionally with a period (.) in front of them, and are case-insensitive. Directives can also be used to control the output of a program listing, and the output of the assembly itself. These are documented in subsequent sections.

Directive - Description
`processor cpu` - Sets the processor type to cpu. If omitted then Z80 is the default. Note that this can appear multiple times in the same file. See the later sections on processors to see what values are supported.
`option setting, value` - Sets options. Options are defined later on, and each CPU and output driver can also have its own options. For options that support booleans (on/off/true/false), the setting can be prefixed with a plus or minus character to switch it on or off respectively.
`equ value` - Sets the top level label to value. Note this requires a label on the same line as this directive.
`org value[, bank]` - Sets the program counter (PC) to value. The PC defaults to zero on initialisation. If the optional second argument is passed, the current memory bank in use is set to bank. Note that it is also possible to set the bank by passing a 24-bit address. This was added for convenience when using the 65c816 processor.
`bank value` - The current memory bank in use is set to value.
`ds value[, fill]` - Advances the program counter by value bytes. If the optional fill is provided then the bytes are filled with fill, otherwise they are filled with zero.
`db value[, value]` - Writes bytes represented by value to the current PC. The values can be constants, expressions, labels or strings, which are expanded to a list of byte values for each character in the string.
`dw value[, value]` - Writes words (16-bit values) represented by value to the current PC. The values can be constants, expressions, labels or strings. Strings are written as 16-bit versions of their byte values, i.e. the high byte will be zero and the low byte the character code.
`align value[, fill]` - Aligns the PC so that (PC modulo value) is zero. Will error if value is less than 2 or greater than 32768. No values are written to the skipped bytes unless the optional fill is supplied.
`include filename` - Includes the source file filename as if it was text entered at the current location.
`incbin filename` - Includes the binary file filename at the current PC, as if it was a sequence of db directives with all the bytes from the file.
`alias command, replacement` - Creates an alias so that whenever the command command is found in the source it is replaced with replacement. The idea of this is to make it easier to import sources that use unknown directives, e.g.

    alias setaddr,org
    alias ldreg,ld

    cpu         z80

    setaddr     $8000   ; These two are
    org         $8000   ; equivalent.

    ld          a,(hl)  ; These two are
    ldreg       a,(hl)  ; equivalent.

`nullcmd` - Simply does nothing. Its only real use is as an alias if you wished to strip a directive from a foreign source file.
`end` - Terminates input processing. Anything past the directive will be ignored.
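
For instance, assuming the default Z80 processor, the data and alignment directives could be combined as in the sketch below. The label name and values are arbitrary, and the PC values in the comments simply follow from the descriptions above:

            org     $4001
            align   256             ; PC moves from $4001 to $4100
    buffer: ds      16,$ff          ; 16 bytes of $ff, PC is now $4110
            db      "Hi",0          ; The bytes for 'H' and 'i', then zero
            dw      buffer          ; The 16-bit value $4100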

Built-in Aliases

The following are built-in aliases for the above directives.
Command - Built-in Aliases
`processor` - `proc`, `arch`, `cpu`
`equ` - `eq`
`ds` - `defs`
`db` - `defb`, `byte`, `text`
`dw` - `defw`, `word`
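
So, for example, the lines within each group of the following sketch should assemble identically:

            db      1,2,3
            defb    1,2,3
            byte    1,2,3

            dw      $1234
            defw    $1234
            word    $1234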

Expressions

In any of the directives above, where a value is defined, an expression can be entered.

Assembly instructions will also permit these expressions to be used where applicable. As many opcodes use parentheses to indicate addressing modes, remember that {} brackets can be used to alter expression precedence.

    ld  a,{8+2}*2               ; On the Z80 loads A with the value 20
    ld  a,({8+2}*2)             ; On the Z80 loads A with the value stored at
                                ; address 20

Note that the expression is evaluated using a standard C int, and then cast to the appropriate size.

The following formats for constant numbers are supported:

Format (regular expression) - Description
`"."` or `'.'` - A single quoted character will be converted into the appropriate character code.
`[1-9][0-9]*` - A decimal number, e.g. 42.
`0[0-7]*` - An octal number, e.g. 052.
`0x[0-9a-fA-F]+` - A hex number, e.g. 0x2a.
`[0-9a-fA-F]+h` - A hex number, e.g. 2ah.
`$[0-9a-fA-F]+` - A hex number, e.g. $2a.
`[01]+b` - A binary number, e.g. 00101010b.
`%[01]+` - A binary number, e.g. %00101010.
`[a-zA-Z_0-9]+` - A label, e.g. `main_loop`.

The following operators are understood. The order here is the order of precedence.

Arithmetic Operators - Description
`{ }` - Brackets used to alter the order of precedence. Note normal parentheses aren't used as the assembly language may make use of them.
`~ + -` - Bitwise NOT/unary plus/unary minus.
`<< >>` - Shift left/shift right.
`/ *` - Division/multiplication.
`+ -` - Addition/subtraction.

All the following have the same precedence, and so will be done left to right.

Comparison Operators - Description
`==` - Equality. Returns 1 if the arguments are equal, otherwise zero.
`!=` - Inequality. Returns 1 if the arguments are unequal, otherwise zero.
`< <= > >=` - Less than/less than or equal/greater than/greater than or equal. Returns 1 if the expression is true, otherwise zero.

All the following have the same precedence, and so will be done left to right.

Boolean Operators - Description
`&& &` - Boolean/bitwise AND. For boolean operation arguments, zero is FALSE, otherwise TRUE.
`|| |` - Boolean/bitwise OR.
`^` - Bitwise XOR.
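
As an illustration of the number formats, the first line of the following sketch should write the value 42 eight times, assuming the default ascii character set for the quoted character. The remaining lines show a few expressions; the label names are arbitrary and the values in the comments follow from the operator descriptions above:

            db      42,052,0x2a,2ah,$2a,00101010b,%00101010,'*'

    size:   equ     {2+3}*4         ; 20
    bit4:   equ     1<<4            ; 16
    is20:   equ     size==20        ; 1, as the comparison is true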

Character Sets

The assembler has built-in support for a few different character sets. These can be set by using the options `charset` or `codepage`, i.e.

    option codepage, format
    option charset, format

The following values can be used for format.

Format - Description
`ascii` - 7-bit ASCII. This is the default.
`spectrum` - The character codes as used on the Sinclair ZX Spectrum.
`cbm` - PETSCII as used on the Commodore Business Machines range from the PET to the C128. See https://en.wikipedia.org/wiki/PETSCII for more details.
`zx81` - The character codes as used on the Sinclair ZX81. Lower case letters are encoded as normal upper case letters and upper case letters will be encoded as inverse upper case letters. In addition the following characters that have no corresponding ZX81 character are mapped as:
# - The British Pound sign.
' - Inverse double quotes.
\ - Inverse slash.
! - Inverse question mark.
| - Inverse space.
~ - Inverse minus sign.
{ } - Inverse round brackets.
` - The newline character. Note that the newline is actually a HALT opcode used to terminate the line.

e.g.

    option  +list
    option  +list-hex

    option  charset,ascii
    db      "Hello",'A'
    ; $48 $65 $6C $6C $6F $41

    option  charset,zx81
    db      "Hello",'A'
    ; $AD $2A $31 $31 $34 $A6

    option  codepage,cbm
    db      "Hello",'A'
    ; $48 $45 $4C $4C $4F $41

    option  codepage,spectrum
    db      "Hello",'A'
    ; $48 $65 $6C $6C $6F $41

Macros

Macros can be defined in one of two ways: either parameterless or with named parameters. Macro names are case-insensitive. In the parameterless mode arguments are referenced by position (\1, \2 and so on) and the special identifier '*' (written \* in the macro body) can be used to expand all arguments, which will be separated with commas.

When expanded the macro will have an internally generated top-level label assigned to it, so local labels will work inside the macro.

e.g.

macro1: macro

        ld a,\1
        ld b,\2
        ld hl,data
        call \3
        jr dataend
.data
        defb \*
.dataend

        endm

macro2: macro char,junk,interface

        ld a,@char
        ld b,@junk
        call @interface

        endm

Note that an unknown or missing argument expands to an empty string. Also the two argument reference styles can be mixed, though obviously the @ form only makes sense in a parameterised macro, e.g.

mac:    macro char,junk,interface

        ld a,@char
        ld b,\2
        call @interface

        endm

The at symbol (@) used for parameter expansion in named argument macros can be replaced by using the following option, e.g.

        option  macro-arg-char,&

Note that this is enforced when the macro is used, not when it is defined. Also the character must not be quoted, as that will be parsed as a string holding the character code of the character.
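
A minimal sketch of the option in use, assuming it is set before both the definition and any use of the macro (the macro and parameter names are arbitrary):

            option  macro-arg-char,&

    load:   macro value,routine

            ld a,&value
            call &routine

            endm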

Output Format

By default the assembled code is written to a file called output as raw binary. The generated output can be controlled with the following options.
Output Option - Description
`option output-file, file` - Sends the output to file. Defaults to output.
`option output-bank, printf formatted filename` - If multiple banks are in use, sends the output to printf formatted filename. It defaults to output.%u and accepts just one argument in the formatting string, an unsigned integer. If more arguments or a different format specifier are used, the behaviour of the assembler is undefined. How this is used depends on the output driver.
`option output-type, format` - Controls the format of the output file. The following are the supported output formats:
`raw` - A raw binary image.
`spectrum` - A Spectrum emulator TAP file.
`zx81` - A ZX81 emulator P file.
`t64` - A Commodore 64 T64 tape file.
`gameboy` - A Nintendo Gameboy ROM file.
The output formats are described in detail in the following sections.

RAW Output Format

In this mode the file is created covering the block of memory that the assembly touched. If memory banks have been used then the output-bank setting is used to generate the output filename.
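
As a sketch, the output settings for a raw binary might be set as follows. The filenames are hypothetical, and the bank filename uses a single %u as described above, so bank 1 would presumably be written to game.1.bin:

            option  output-type,raw
            option  output-file,game.bin
            option  output-bank,game.%u.bin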

Spectrum TAP Output Format

Generates a Spectrum TAP file for an emulator. A TAP file is a simple binary file holding the bytes that the real Spectrum would have written to a tape.

The TAP file will be given the same name as the output filename, and the internal code block will also be given the same name, unless memory banks have been used, in which case each code file in the TAP file will use the output-bank setting to generate the filename for each block.

Remember that TAP files can be concatenated, so the output could be appended to another TAP file containing a BASIC loader, for example.

ZX81 .P Output Format

Generates a P-file for an emulator. A ZX81 .P file is simply a dump of memory from the system variables onwards.

This format does not support memory banks (the .P file is not a container format), so only the first bank used is output, using the output-file setting for the filename.

The output file will be created as a BASIC program, containing a REM statement holding the machine code, and a command to execute the code. As such the output will fail if the code in Bank 0 does not start at address 16514. Your code must also support being executed from this address.

Another important thing to note is about the display file. In the ZX81 memory map the DFILE can move around depending on the size of the program. The output driver will create a display file for you. The easiest way to reference this is to read the DFILE system variable when your program starts.

Alternatively you can just as easily set up your own display file once your program starts if you have special requirements, e.g. a display file for pseudo hi-res.

ZX81 .P Output Format options

The ZX81 output driver supports the following settings that can be set via an option command.

Option - Description
`option zx81-margin, <pal|ntsc>` - Sets the MARGIN system variable appropriately either for the PAL or NTSC TV system. Defaults to PAL.
`option zx81-autorun, <on|off>` - Whether the ZX81 should auto run the machine code. Defaults to on.
`option zx81-collapse-dfile, <on|off>` - Whether the display file should be generated collapsed (e.g. for 1K mode). Defaults to off.
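
Putting the ZX81 settings together, a source file might begin like the sketch below. The filename and label names are hypothetical, and the address 16396 for the DFILE system variable is an assumption taken from the standard ZX81 system variable layout rather than from this document:

            option  output-type,zx81
            option  output-file,prog.p
            option  zx81-margin,pal
            option  +zx81-autorun

    DFILE:  equ     16396           ; Assumed address of the DFILE variable

            org     16514           ; The code in bank 0 must start here

    start:  ld      hl,(DFILE)      ; Read the display file address
            ret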

C64 T64 Tape Output Format

Generates a T64 tape file for an emulator.

The tape file will be given the same name as the output filename, and the internal code block will also be given the same name, unless memory banks have been used, in which case each entry in the tape file will use the output-bank setting to generate the filename for each entry.

The first (or only) bank will have a small BASIC program inserted as part of the generated file. For this reason the first bank should start near the BASIC area (0x820 should be a safe place to start) unless you have a great desire for a tape full of zero bytes. This BASIC will simply hold a SYS command to start the machine code, e.g.

10 SYS 2080

Any remaining blocks will be stored as-is without any BASIC loader.
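
A minimal sketch of a source file targeting this format, with a hypothetical filename and label. Starting at $820 (2080 decimal) presumably matches the SYS command shown above:

            cpu     6502
            option  output-type,t64
            option  output-file,demo.t64

            org     $820

    start:  rts                     ; Return straight to BASIC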

Nintendo Gameboy ROM File Output Format

Generates a ROM file for a Gameboy emulator or hardware. Note that large ROM sizes have not been extensively checked and verified.

If a single bank was used during the assembly then a simple 32K ROM is assumed, and an error will be shown if the addresses used fall outside the range 0x150 to 0x7fff.

Similarly if multiple banks are used then it is assumed that the first bank is only used in the range 0x150 to 0x3fff, and subsequent banks in the range 0x4000 to 0x7fff. This is to fit in with the method the Gameboy uses to page memory banks into the upper 16K portion of the normal 32K ROM address space.

By default the output driver will try and fill in the ROM size and type in the header properly, but these can be overridden using settings.

Gameboy ROM Output Format options

The Gameboy output driver supports the following settings that can be set via an option command.

Option - Description
`option gameboy-colour, <on|off>` - Whether this is a Gameboy Colour cartridge. Defaults to off. Note that gameboy-color can be used as a different spelling for this setting.
`option gameboy-super, <on|off>` - Whether this is a Gameboy Super extended cartridge. Defaults to off.
`option gameboy-cart-type, number` - Specifies the cartridge type. Defaults to -1, which means the output driver will try and pick the appropriate type.
`option gameboy-irq, irq, address` - Specifies an address where an IRQ routine is stored, and requests that the IRQ be transferred to that address when it happens. The IRQ routine must end with a reti opcode.

irq can be either vbl, lcd, timer, serial or joypad. If left at the default value of -1 then an IRQ handler is installed with just a reti instruction.
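
As a sketch of how an IRQ handler might be registered, assuming that the address argument accepts a label. The filename and label names are hypothetical:

            ; The Gameboy CPU must first be selected with the processor
            ; directive; the value to use is not shown in this section.

            option  output-type,gameboy
            option  output-file,game.gb
            option  gameboy-irq,vbl,vbl_handler

            org     $150

    main:   jp      main            ; Idle loop

    vbl_handler:
            reti                    ; The IRQ routine must end with reti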

Listing

By default no output listing is generated. This can be controlled by the following options.

Listing Option - Description
`option list, <on|off>` - Enables or disables listing. The listing will go to stdout by default. Defaults to off.
`option list-file, filename` - Sends the listing to filename. Note this should appear before enabling the listing.
`option list-pc, <on|off>` - Controls the output of the current PC as a comment preceding the line (so that a listing could be reassembled with no editing). Defaults to off.
`option list-hex, <on|off>` - Controls the output of the bytes generated by the source line in hex. Defaults to off. If on then the hex is output in a comment preceding the line (possibly with the PC above), so that a listing is still valid as input to the assembler.
`option list-labels, <on|off|all>` - Controls the listing of labels, either:
`off` - The default; don't list anything.
`on` - List labels at the end of the listing. The labels are output commented so that the list could be used as input.
`all` - List all labels, including internally generated private labels for macros.
`option list-macros, <off|exec|dump|all>` - Controls the listing of macro invocations, either:
`off` - The default; don't list anything.
`exec` - List invocations of macros.
`dump` - Produce a list of macro definitions at the end of the listing.
`all` - Combine exec and dump.
`option list-rm-blanks, <on|off>` - Defaults to on. This option causes multiple blank lines to be collapsed down to a single blank line in the listing.
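
For example, a listing with the PC, hex bytes and all labels could be requested as in the sketch below. The filename is hypothetical, and list-file is set before the listing is enabled, as noted above:

            option  list-file,listing.txt
            option  list-pc,on
            option  +list-hex
            option  list-labels,all
            option  +list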

Z80 CPU

Opcodes

The Z80 assembler uses the standard Zilog opcodes, and supports undocumented instructions.

For instructions where the Accumulator can be assumed it can be omitted, and EOR can be used the same as XOR:

        xor     a,a         ; These are equivalent
        xor     a
        eor     a,a

        and     a,b         ; These are equivalent
        and     b

For exchange opcodes with parameters the parameters can be reversed from their official form:

        ; The official forms
        ;
        ex      de,hl
        ex      af,af'
        ex      (sp),hl
        ex      (sp),ix
        ex      (sp),iy

        ; Also supported
        ;
        ex      hl,de
        ex      af',af
        ex      hl,(sp)
        ex      ix,(sp)
        ex      iy,(sp)

Where the high/low register parts of the IX and IY registers are to be used, simply use ixl, iyl, ixh and iyh. Note that the assembler will accept illegal pairings involving H and L, but these will be warned about:

        ld  ixh,$e5
        ld  iyl,iyl

        ld  ixh,l           ; This will be turned into "ld ixh,ixl" and a
                            ; warning will be issued.

        ld  iyh,ixl         ; This will generate an error as the index registers
                            ; have been mixed.

For the hidden bit manipulations that can also be copied to a register, these can be represented by adding the destination register as an extra parameter, e.g.

        srl (iy-1),d
        set 3,(iy-1),a
        res 4,(iy-1),b

For the hidden IN instruction using the flag register the following are all equivalent:

        in  (c)
        in  f,(c)

For the hidden OUT instruction, which outputs the flag register, $00 or $ff depending on which reference you read, the following are all equivalent, where value can be any value at all:

        out (c)
        out (c),f
        out (c),value

Options

The Z80 assembler has no options.

6502 CPU

Opcodes

The 6502 assembler uses the standard MOS Technology opcodes.

Options

The 6502 assembler has the following options.
6502 Option - Description
`option zero-page, <on|off|auto>` - Controls the assumptions made regarding Zero Page addresses. Defaults to off, and can be the following values:
`off` - The default; all addresses are assumed to be not on the Zero Page, regardless of the value used.
`on` - Assumes all addresses are in the Zero Page, raising an error if any address is not in the Zero Page.
`auto` - Treats addresses less than 256 as being in the Zero Page automatically. This mode also makes the assembler perform an extra pass to guard against the possibility of the calculation being fooled.
e.g.
        cpu     6502
        org     $8000

        lda     $0000,x     ; Produces $bd $00 $00
        option  +zero-page
        lda     $0000,x     ; Produces $b5 $00
        lda     $1234,x     ; Produces an error

        option  zero-page,auto
        lda     $00,x       ; Produces $b5 $00
        lda     $8000,x     ; Produces $bd $00 $80

Gameboy Z80 derivative CPU

Opcodes

The Gameboy assembler uses the standard Z80 opcodes where applicable. Note that the Gameboy processor has a reduced number of opcodes, flags and no index registers, though it has some additional instructions and addressing modes.

For instructions where the Accumulator can be assumed it can be omitted, and EOR can be used the same as XOR:

        xor     a,a         ; These are equivalent
        xor     a
        eor     a,a

        and     a,b         ; These are equivalent
        and     b

The Gameboy CPU has a special addressing mode used for one opcode, where the referenced address is stored as a single byte, and used as an offset into the top page (0xff00). This can either be triggered by using the special opcode, or will automatically be used whenever an address is accessed in the range 0xff00 to 0xffff:

        ; These all will use the special addressing mode opcode, accessing
        ; memory location $ff34
        ;
label   equ     $ff34

        ldh     a,($34)
        ldh     a,($ff34)
        ld      a,($ff34)
        ld      a,(label)

        ld      (label),a
        ld      ($ff34),a
        ldh     ($34),a
        ldh     ($ff34),a

The Gameboy CPU also supports incrementing or decrementing the HL register when it is used as an address:

        ; All these decrement HL after the value has been used.
        ;
        ld      a,(hl-)
        ld      a,(hld)
        ldd     a,(hl)
        ld      (hl-),a
        ld      (hld),a
        ldd     (hl),a

        ; All these increment HL after the value has been used.
        ;
        ld      a,(hl+)
        ld      a,(hli)
        ldi     a,(hl)
        ld      (hl+),a
        ld      (hli),a
        ldi     (hl),a

In addition the Gameboy CPU supports these extra instructions over the Z80:

        ; Actually loads using the address $ff00 + C
        ;
        ld      a,(c)
        ld      (c),a

        ; Put the Gameboy into a low-power mode till a control is pressed.
        ; Note it is accepted practice to put a NOP afterwards.  This may be
        ; due to the stop replacing DJNZ, which may still be wired to expect
        ; an argument.  That is just a wild guess though.
        ;
        stop
        nop

        ; Swaps the low/high nibbles of the register
        ;
        swap    a
        swap    b
        swap    c
        swap    d
        swap    e
        swap    h
        swap    l
        swap    (hl)

Options

The Gameboy CPU assembler has no options.