
CASM
====

A simple, portable multi-pass assembler

Copyright (C) 2003-2015  Ian Cowburn <ianc@noddybox.co.uk>

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see http://www.gnu.org/licenses/gpl-3.0.html

Usage
-----


----
casm file
----

Assembles file, and places the output in _output_ by default.

Memory Layout
-------------

There is 64K of RAM that the assembler will generate output into.  Extra 64K
banks of RAM can be added by using the 'bank' or 'org' directives.  Banks are
numbered from zero upwards.
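
As a minimal sketch of how banks might be selected (the addresses, data and
bank numbers below are purely illustrative), either directive can be used to
switch bank before emitting data:

----
        org     $8000           ; PC = $8000 in the initial bank
        db      1,2,3

        org     $c000,1         ; PC = $c000 and switch to bank 1
        db      4,5,6

        bank    2               ; Switch to bank 2
        org     $c000           ; Set the PC explicitly for the new bank
        db      7,8,9
----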


Source Format Example
---------------------

The source files follow this basic format:

----
; Comments
;
label1: equ     0xffff

        org     $4000;
        
        db      "Hello, World\n",0

main    jp      local_label     ; Comments

.local_label
        inc     a

another:
        inc     b
        jp      local_label     ; Actually jumps to the following local_label.

.local_label
        ret
----


Source files obey the following rules:

* Any text past a semicolon (;) is discarded as a comment (except when part
  of a string constant).

* Labels must start in column zero (the left-most column).

  ** If the label ends with a colon (:) then the colon is removed.

  ** If the label doesn't start with a period (.) then it is assumed to be a
     global label.

  ** If the label starts with a period (.) then it is assumed to be a local
     label.  Local labels are associated with the preceding global label.  If a
     global label and related local label have the same name, the local label
     will be used on expansion.

  ** Any label can be followed by an 'equ' directive, in which case the label
     is set to that value rather than the current program counter.

  ** Labels are case-insensitive.

* Directives and opcodes must not start in column zero, which is reserved for
  labels; they can appear anywhere else on the line.

* Strings can be quoted with either single or double quotes; this allows you to
  put the other quote type inside the string.


Recognised directives
~~~~~~~~~~~~~~~~~~~~~

All directives are also recognised with an optional period (.) in front of
them, and are case insensitive.  Directives can also be used to control the
output of a program listing, and the output of the assembly itself.  These are
documented in further sections.


processor _CPU_::
    Sets the processor type to _CPU_.  If omitted then Z80 is the default. 
    Note that this can appear multiple times in the same file.  Currently
    supported _CPU_ values are +Z80+ and +6502+.

option _setting_, _value_::
    Set options.  Options are defined later on, and each CPU can also have its
    own options.  For options that support booleans (on/off/true/false),
    the _setting_ can be prefixed with a plus or minus character to switch it
    on or off respectively.

equ _value_::
    Sets the top level label to _value_.  Note this requires a label on the
    same line.

org _value_[,_bank_]::
    Sets the program counter (PC) to _value_.  The PC defaults to zero.  If the
    optional second argument is passed the current memory bank in use is set
    to _bank_.

bank _value_::
    The current memory bank in use is set to _value_.

ds _value_[, _fill_]::
    Skips on the program counter _value_ bytes.  If the optional _fill_ is
    provided then the bytes are filled with _fill_, otherwise they are filled
    with zero.

db _value_[, _value_]::
    Writes bytes to the current PC.  The values can be constants, expressions,
    labels or strings.  Built-in aliases are +byte+ and +text+.

dw _value_[, _value_]::
    Writes words (16-bit values) to the current PC.  The values can be
    constants, expressions or labels.  Note that +word+ is a built-in alias for
    this directive.

align _value_[, _fill_]::
    Align the PC so that (PC modulo _value_) is zero.  Will error if _value_
    is less than 2 or greater than 32768.  No values are written to the skipped
    bytes unless the optional _fill_ is supplied.

include _filename_::
    Includes the source file _filename_ as if it was text entered at the
    current location.

incbin _filename_::
    Includes the binary data in _filename_ at the current PC, as if it was a
    sequence of +db+ directives with all the bytes from the file.

alias _command_, _replacement_::
    Creates an alias so that whenever the command _command_ is found in the
    source it is replaced with _replacement_.  The idea of this is to make it
    easier to import sources that use unknown directives, e.g.

    alias setaddr,org
    alias ldreg,ld

    cpu         z80

    setaddr     $8000   ; These two are
    org         $8000   ; equivalent.

    ld          a,(hl)  ; These two are
    ldreg       a,(hl)  ; equivalent.

nullcmd::
    Simply does nothing.  Its only real use is as an alias if you wish to
    strip a directive from a foreign source file.

end::
    Terminates the input processing.  Anything past the directive will be
    ignored.
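
The data directives above could be combined along the following lines.  This is
only a sketch; the label names and values are illustrative rather than taken
from a real program:

----
        processor Z80
        org     $8000

message:
        db      "Hello",0       ; A string followed by a zero byte
table:
        dw      $1234,message   ; Two 16-bit words, one taken from a label
buffer:
        ds      16,$ff          ; Skip 16 bytes, filling them with $ff

        align   256,0           ; Advance the PC to a multiple of 256,
                                ; writing zero to the skipped bytes
page_data:
        ds      256             ; Skip 256 bytes, filled with zero
----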


Expressions
~~~~~~~~~~~

In any of the directives above, where a value is defined, an expression can be
entered.

The following formats for constant numbers are supported (note these are
illustrated as regular expressions):

"x" or 'x'::
    A single quoted character will be converted into the appropriate character
    code.

[1-9][0-9]*::
    A decimal number, e.g. 42.

0[0-7]*::
    An octal number, e.g. 052.

0x[0-9a-fA-F]+::
    A hex number, e.g. 0x2a.

[0-9a-fA-F]+h::
    A hex number, e.g. 2ah.

$[0-9a-fA-F]+::
    A hex number, e.g. $2a.

[01]+b::
    A binary number, e.g. 00101010b.

[a-zA-Z_0-9]+::
    A label, e.g. +main_loop+.
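
For illustration, each constant in the following line uses a different one of
the formats above to write the same value, 42, assuming the default ASCII
character set for the quoted character:

----
        db      42,052,0x2a,2ah,$2a,00101010b,'*'
----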

The following operators are understood.  The order here is the order of
precedence.

{ }::
    Brackets used to alter the order of precedence.  Note normal parentheses
    aren't used as the assembly language may make use of them.

~ + -::
    Bitwise NOT/unary plus/unary minus.

<< >>::
    Shift left/shift right.

/ * %::
    Division/multiplication/modulus.

+ -::
    Addition/subtraction.

All the following have the same precedence, and so will be done left to right.

==::
    Equality.  Returns 1 if the arguments are equal, otherwise zero.

!=::
    Inequality.  Returns 1 if the arguments are unequal, otherwise zero.

< \<= > >=::
    Less than/less than or equal/greater than/greater than or equal.  Returns 1
    if the comparison is true, otherwise zero.


All the following have the same precedence, and so will be done left to right.

&& &::
    Boolean/bitwise AND.  For boolean operation arguments, zero is FALSE,
    otherwise TRUE.

|| |::
    Boolean/bitwise OR.

^::
    Bitwise XOR.


Assembly instructions will also permit these expressions to be used where
applicable.  As many opcodes use parentheses to indicate addressing modes,
remember that {} brackets can be used to alter expression precedence.

----
    ld  a,{8+2}*2               ; On the Z80 loads A with the value 20
    ld  a,({8+2}*2)             ; On the Z80 loads A with the value stored at
                                ; address 20
----

Note that the expression is evaluated using a standard C int, and then cast
to the appropriate size.
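
As a further sketch, and assuming the operators are applied strictly in the
order listed above, note that the shift operators bind more tightly than
multiplication, which differs from C.  The values in the comments follow from
that assumption:

----
        db      1<<2*3          ; Evaluated as {1<<2}*3, giving 12
        db      1<<{2*3}        ; Brackets force 1<<6, giving 64
        db      2+3==5          ; Addition first, then equality, giving 1
        db      {4>3}&&{2>3}    ; 1 AND 0, giving 0
----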


Character Sets
~~~~~~~~~~~~~~

The assembler has built-in support for a few different character sets.
These can be set by using the options _charset_ or _codepage_, i.e.

----
    option codepage, <format>
    option charset, <format>
----

The following values can be used for _format_.

ascii::
    7-bit ASCII.  This is the default.

spectrum::
    The character codes as used on the Sinclair ZX Spectrum.

zx81::
    The character codes as used on the Sinclair ZX-81.  Lower case
    letters are encoded as normal upper case letters and upper case
    letters will be encoded as inverse upper case letters.

cbm::
    PETSCII as used on the Commodore Business Machines range from the
    PET to the C128.  See https://en.wikipedia.org/wiki/PETSCII for
    more details.

e.g.

----
    option  +list
    option  +list-hex

    option  charset,ascii
    db      "Hello",'A'
; $48 $65 $6C $6C $6F $41

    option  charset,zx81
    db      "Hello",'A'
; $AD $2A $31 $31 $34 $A6

    option  codepage,cbm
    db      "Hello",'A'
; $48 $45 $4C $4C $4F $41

    option  codepage,spectrum
    db      "Hello",'A'
; $48 $65 $6C $6C $6F $41

----


Macros
~~~~~~

Macros can be defined in one of two ways; either parameterless or with named
parameters.  Macro names are case-insensitive.  In the parameterless mode the
special identifier '*' can be used to expand all arguments, which will be
separated with commas.

----
macro1: macro

        ld a,\1
        ld b,\2
        call \3
        defb \*

        endm

macro2: macro char,junk,interface

        ld a,@char
        ld b,@junk
        call @interface

        endm
----

Note that an unknown or missing argument expands to an empty string.  Also the
two argument reference styles can be mixed, though obviously the @ form only
makes sense in a parameterised macro, e.g.

----

mac:    macro char,junk,interface

        ld a,@char
        ld b,\2
        call @interface

        endm
----

The at symbol (@) used for parameter expansion in named argument macros can
be replaced by using the following option, e.g.

----
        option  macro-arg-char,&
----

Note that this is enforced when the macro is *used*, not when it is *defined*.
Also the character must not be quoted, as that will be parsed as a string
holding the character code of the character.
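
The examples above only show definitions.  As a sketch, assuming a macro is
invoked by writing its name where an opcode would go, followed by
comma-separated arguments (the usual assembler convention, but an assumption
here), the +macro2+ definition might be used as follows; +print_char+ is a
hypothetical routine:

----
print_char:
        ret

start:
        macro2  'A',10,print_char

        ; Under that assumption the invocation would expand to:
        ;
        ;       ld a,'A'
        ;       ld b,10
        ;       call print_char
----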


Output Format
-------------

By default the assembled code is written to a file called *output* as raw
binary covering the block of memory that the assembly touched.  If memory
banks have been used then *output* is appended with the memory bank number, so
that a separate output file is generated for each bank.

This can be controlled with the following options.

option output-file, _file_::
    Send the output to _file_.  If memory banks have been used then files are
    generated with the names _file_.0, _file_.1, and so on.

option output-type, _format_::
    Controls the output format with the following settings

        raw;;
            The default raw binary.

        spectrum;;
            Generates a Spectrum TAP file for an emulator.  The TAP file will
            be given the same name as the output filename, and its load address
            will be set to the start of the created memory.  Remember that TAP
            files can be concatenated, so the output could be appended to
            another TAP file containing a BASIC loader for example.  Note that
            if memory banks have been used then each bank is output to the TAP
            file as separate code blocks.
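
For example, the two options might be combined as below to produce a TAP file.
This is a sketch; +demo.tap+ is an arbitrary file name, written unquoted here
on the assumption that file names follow the same form as other option values:

----
        option  output-file,demo.tap
        option  output-type,spectrum

        org     $8000
        ret
----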


Listing
-------

By default no output listing is generated.  This can be controlled by the
following options.

option list, <on|off>::
    Enables/disables listing.  The listing will go to stdout.

option list-file, _file_::
    Sends the listing to _file_.  Note this should appear before enabling the
    listing.

option list-pc, <on|off>::
    Control the output of the current PC in the listing as a comment preceding
    the line (so that a listing could be reassembled with no editing).
    Defaults to *off*.

option list-hex, <on|off>::
    Control the output of the bytes generated by the source line in hex.
    Defaults to *off*.  If *on* then the hex is output in a comment preceding
    the line (possibly with the PC above), so that a listing is still valid to
    be assembled.

option list-labels, <on|off|all>::
    Controls the listing of labels, either *off* (the default), *on* to dump
    label values at the end of the listing and *all* to dump all labels,
    including internally generated private labels for macros.

option list-macros, <off|exec|dump|all>::
    Controls the listing of macro invocations, either

    off;;
        The default; don't list anything.
    exec;;
        List invocations of macros.
    dump;;
        Produce a list of macro definitions at the end of the listing.
    all;;
        Combine "exec" and "dump".

option list-rm-blanks, <on|off>::
    Defaults to *on*.  This option causes multiple blank lines to be collapsed
    down to a single line.
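
Putting these together, a source file might begin as follows to send a listing
with addresses, hex bytes and labels to a file.  This is a sketch; +demo.lst+
is an arbitrary file name:

----
        option  list-file,demo.lst      ; Should come before enabling the listing
        option  +list                   ; Same as "option list,on"
        option  +list-pc
        option  +list-hex
        option  list-labels,on
----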


Z80 CPU
-------

Opcodes
~~~~~~~

The Z80 assembler uses the standard Zilog opcodes, and supports
undocumented instructions.

For instructions where the accumulator can be assumed it can be omitted, and
EOR can be used the same as XOR:

----
    xor     a,a         ; These are equivalent
    xor     a
    eor     a,a

    and     a,b         ; These are equivalent
    and     b
----

For exchange opcodes with parameters the parameters can be reversed from their
official form:

----
    ; The official forms
    ;
    ex      de,hl
    ex      af,af'
    ex      (sp),hl
    ex      (sp),ix
    ex      (sp),iy

    ; Also supported
    ;
    ex      hl,de
    ex      af',af
    ex      hl,(sp)
    ex      ix,(sp)
    ex      iy,(sp)
----

Where the high/low register parts of the IX and IY registers are to be used,
simply use ixl, iyl, ixh and iyh.  Note that the assembler will accept
illegal pairings involving H and L, but these will be warned about:

----

    ld  ixh,$e5
    ld  iyl,iyl

    ld  ixh,l           ; This will be turned into "ld ixh,ixl" and a
                        ; warning will be issued.

    ld  iyh,ixl         ; This will generate an error as the index registers
                        ; have been mixed.

----

For bit manipulations whose result can also be copied to a register, this can
be represented by adding the destination register as an extra parameter, e.g.

----

    srl (iy-1),d
    set 3,(iy-1),a
    res 4,(iy-1),b

----

For the hidden IN instruction using the flag register the following are all
equivalent:

----
    in  (c)
    in  f,(c)
----

For the hidden OUT instruction (variously described as outputting the flag
register, $00 or $ff, depending on which reference you read) the following are
all equivalent, where _value_ can be any value at all:

----
    out (c)
    out (c),f
    out (c),<value>
----


Options
~~~~~~~

The Z80 assembler has no options.


6502 CPU
--------

Opcodes
~~~~~~~

The 6502 assembler uses the standard MOS Technology opcodes.


Options
~~~~~~~

The 6502 assembler has the following options.

option zero-page, <on|off|auto>::
        Use Zero-Page addressing for _absolute_ and _absolute_,X address modes.
        If set to *auto* then the assembler tries to choose the mode based on
        the value in the last pass.
        Defaults to *off*.  e.g.

            cpu     6502
            org     $8000

            lda     $0000,x     ; Produces $bd $00 $00
            option  +zero-page
            lda     $0000,x     ; Produces $b5 $00
            lda     $1234,x     ; Produces an error

            option  zero-page,auto
            lda     $00,x       ; Produces $b5 $00
            lda     $8000,x     ; Produces $bd $00 $80



// vim: ai sw=4 ts=8 expandtab spell