Note: This article was originally written by Jonathan Cauldwell and is reproduced here with permission.
Until now we have drawn all our graphics directly onto the screen, for reasons of speed and simplicity. However, there is one major disadvantage to this method: if the television scan line happens to be covering the particular screen area where we are deleting or redrawing our image then our graphics will appear to flicker. Unfortunately, on the Spectrum there is no easy way to tell where the scan line is at any given point so we have to find a way around this.
One method which works well is to delete and redraw all sprites immediately following a halt instruction, before the scan has a chance to catch up with the image being drawn. The disadvantage to this method is that our sprite code has to be pretty fast, and even then it is not advisable to delete and re-draw more than two sprites per frame because by then the scan will be over the top border and into the screen area. Of course, locating the status panel at the top of the screen might give a little more time to draw our graphics, and if the game is to run at 25 frames per second we could employ a second halt instruction and manoeuvre another couple of sprites immediately afterwards.
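As a rough sketch of the approach just described, a main loop might be arranged along these lines. The routine names gamelg, delspr and drwspr are hypothetical placeholders for your own code, and interrupts are assumed to be enabled so that halt returns at the start of each frame.

```asm
mloop  call gamelg         ; hypothetical: game logic, works out new positions.
       halt                ; wait for the frame interrupt.
       call delspr         ; hypothetical: delete sprites at their old positions.
       call drwspr         ; hypothetical: redraw sprites at their new positions.
       jp mloop            ; back for the next frame.
```

Doing the slow work before the halt leaves as much as possible of the time after the interrupt for the drawing itself.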
Ultimately, there comes a point where this breaks down. If our graphics are going to take a little longer to draw we need another way to hide the process from the player, and that means using a second buffer screen. All the work involved in drawing and undrawing graphics is then hidden from the player, and all that is visible is each finished frame once it has been drawn.
There are two ways of doing this on a Spectrum. One method will only work on a 128K machine, so we will put that to one side for the time being. The other method actually tends to be more complicated in practice but will work on any Spectrum.
Creating a Screen Buffer
The simplest way to implement double buffering on a 48K Spectrum is to set up a dummy screen elsewhere in RAM, and draw all our background graphics and sprites there. As soon as our screen is complete we copy this dummy screen to the physical screen at address 16384 thus:
; code to draw all our sprites etc.
       .
       .
       .
       .
; now screen is drawn copy it to physical screen.
       ld hl,49152
       ld de,16384
       ld bc,6912
       ldir
While in theory this is perfect, in practice copying 6912 bytes of RAM (or 6144 bytes if we ignore the colour attributes) to the screen display every frame is too slow for arcade games. LDIR takes 21 T-states for each byte it moves, so 6912 bytes cost roughly 145,000 T-states, which is more than two of the 48K Spectrum's 69,888 T-state frames. The secret is to reduce the amount of screen RAM we need to copy each frame, and to find a faster way than by transferring it with the LDIR instruction.
The first way is to decide how big our screen is going to be. Most games separate the screen into 2 areas: a status panel to display score, lives and other bits of information, and a window where all the action takes place. As we don’t need to update the status panel every frame our dummy screen only needs to be as big as the action window.
So if we were to have an 80×192 pixel status panel at the right edge of the screen, that would leave us a 176×192 pixel window, meaning our dummy screen would only need to be 22 chars wide by 192 pixels high, or 22×192=4224 bytes. Manually moving 4224 bytes from one part of RAM to another is far less painful than manipulating 6144 bytes. The trick is to find a size which is large enough not to restrict gameplay while being small enough to be manipulated quickly. Of course, we may also want to make our buffer a little larger around the edges. While these edges are not displayed on the screen they are useful if we wish to clip sprites as they move into the action window from the sides.
Once we have set our buffer size in stone we need to write a routine to transfer it to the physical display file one or two bytes at a time. While we are at it, we can also re-order our buffer screen to use a more logical display method than the one used by the physical screen. We can make allowances for the peculiar ordering of the Spectrum’s display file in our transfer routine, meaning any graphics routines which make use of our dummy screen buffer can be simplified.
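To illustrate the sort of allowance a transfer routine has to make, here is a commonly used way of stepping a display file address down by one pixel line. It is a sketch rather than part of any particular game, the label nextln is made up for this example, and it makes no check for running off the bottom of the screen.

```asm
; On entry hl = display file address.
; On exit  hl = address of the same column one pixel line below.
nextln inc h               ; next pixel line within the character cell.
       ld a,h
       and 7               ; have we crossed into a new character row?
       ret nz              ; no, so we are done.
       ld a,l
       add a,32            ; move to the next character row.
       ld l,a
       ret c               ; carry: moved into the next third, h is already right.
       ld a,h
       sub 8               ; same third, so undo the overflow in h.
       ld h,a
       ret
```

With a linear buffer, only the transfer routine needs logic like this. Everything that draws into the buffer can simply add the buffer width to move down a line.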
There are two really quick ways of moving a dummy screen to the display screen. The first, and most simple method, is to use lots of unrolled LDI instructions. The second, and more complicated method, makes use of PUSH and POP to transfer the data.
Let us start with LDI. If our buffer is 22 chars wide we might transfer a single line from the buffer to the screen display with 22 consecutive LDI instructions. It is much quicker to use lots of LDI instructions than to use a single LDIR, because each LDI takes 16 T-states against 21 for each byte moved by LDIR. We could write a routine to transfer our data across a single line at a time, pointing HL to the start of each line of the buffer, DE to the line on the screen where it is needed, and then 22 LDI instructions to move the data across. However, as each LDI instruction takes two bytes of code, it stands to reason that a fully unrolled routine would be at least twice the size of the buffer it moved. That is a considerable hit when dealing with a little over 40K of useful RAM. You may instead wish to move the LDI instructions to a subroutine which copies a pixel line, or perhaps a group of 8 pixel lines, at a time. This routine could then be called from within a loop, unrolled or not, which could take care of the HL and DE registers.
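A minimal sketch of such a subroutine follows, copying one character row (8 pixel lines) of a 22-byte-wide buffer. It assumes the buffer lines are stored one after another, and that DE points at the top pixel line of a character row on the screen. The label cprow is hypothetical. Note that the loop counter lives in A, because every LDI decrements BC.

```asm
; On entry hl = buffer address of the first of 8 consecutive lines,
;          de = display file address of the top line of a character row.
cprow  ld a,8              ; 8 pixel lines in a character row.
cprow0 push de             ; remember the start of this screen line.
       ldi                 ; 22 unrolled LDIs move one line across.
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       ldi
       pop de              ; back to the start of the screen line.
       inc d               ; next pixel line down within the same row.
       dec a               ; one line fewer to do.
       jr nz,cprow0        ; repeat for all 8 lines.
       ret
```

On return HL already points at the next buffer line, so the caller only has to load DE with the address of the next character row before calling again.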
The second method is to transfer the buffer to the screen using PUSH and POP instructions. While this does have the advantage of being the fastest way there is, there are drawbacks. You do need complete control of the stack pointer so you can’t have any interrupts occurring mid-way through the routine. The stack pointer must be stored away somewhere first, and restored immediately afterwards.
The Spectrum’s stack is usually located below your program code, but this method involves setting the stack to point to each part of the buffer in turn, and then using POP to copy the contents of the dummy screen buffer into each of the register pairs in turn. The stack pointer is then moved to the relevant point in the screen display RAM, before the registers are PUSHed into memory in the reverse order to that in which they were POPped. That is, values are POPped from the buffer going from the start of each line, and PUSHed to the screen in the reverse order, going from the end of the line to the beginning.
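Before looking at a full-sized routine, the principle can be seen in a small sketch which moves just 8 bytes. The labels copy8, src and dest are hypothetical, with src and dest standing for addresses you would define yourself. The di and ei pair assumes interrupts were enabled when the routine was called.

```asm
copy8  di                  ; no interrupts while sp is pointing elsewhere.
       ld (stptr),sp       ; store stack pointer.
       ld sp,src           ; hypothetical: first of the 8 source bytes.
       pop af              ; bytes 0 and 1.
       pop bc              ; bytes 2 and 3.
       pop de              ; bytes 4 and 5.
       pop hl              ; bytes 6 and 7.
       ld sp,dest+8        ; hypothetical: one past the last destination byte.
       push hl             ; written to dest+6 and dest+7.
       push de             ; dest+4 and dest+5.
       push bc             ; dest+2 and dest+3.
       push af             ; dest+0 and dest+1.
       ld sp,(stptr)       ; restore stack pointer.
       ei                  ; interrupts are safe again.
       ret
stptr  defw 0              ; storage for the real stack pointer.
```

Because PUSH works downwards through memory, pushing in the reverse order puts every byte back in the same position relative to its neighbours.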
Below is the gist of the screen transfer routine from Rallybug. This used a buffer 30 characters wide, with 28 characters visible on screen. The remaining 2 characters were not displayed so that sprites moved onto the screen slowly from the edge, rather than suddenly appearing from nowhere. As the visible screen width is 28 characters wide, this requires 14 16-bit registers per line. Obviously, the Z80A doesn’t have this many, even counting the alternate registers and IX and IY. As such, the Rallybug routine splits the display into two halves of 14 bytes each, requiring just 7 register pairs. The routine sets the stack pointer to the beginning of each buffer line in turn, then POPs the data into AF, BC, DE and HL. It then swaps these registers into the alternate register set with EXX, and POPs 6 more bytes into BC, DE and HL. These registers now need to be unloaded into the screen area, so the stack pointer is set to point to the end of the relevant screen line, and HL, DE and BC are PUSHed into position. The alternate registers are then restored with another EXX, and HL, DE, BC and AF are PUSHed into position in that order. This is repeated over and over again for each half of each screen line, before the stack pointer is restored to its original position.
Complicated, yes. But incredibly fast.
SEG1   equ 16514
SEG2   equ 18434
SEG3   equ 20482
P0     equ 0
P1     equ 256
P2     equ 512
P3     equ 768
P4     equ 1024
P5     equ 1280
P6     equ 1536
P7     equ 1792
C0     equ 0
C1     equ 32
C2     equ 64
C3     equ 96
C4     equ 128
C5     equ 160
C6     equ 192
C7     equ 224

xfer   ld (stptr),sp       ; store stack pointer.

; Character line 0.

       ld sp,WINDOW        ; start of buffer line.
       pop af
       pop bc
       pop de
       pop hl
       exx
       pop bc
       pop de
       pop hl
       ld sp,SEG1+C0+P0+14 ; end of screen line.
       push hl
       push de
       push bc
       exx
       push hl
       push de
       push bc
       push af
       .
       .
       ld sp,WINDOW+4784   ; start of buffer line.
       pop af
       pop bc
       pop de
       pop hl
       exx
       pop bc
       pop de
       pop hl
       ld sp,SEG3+C7+P7+28 ; end of screen line.
       push hl
       push de
       push bc
       exx
       push hl
       push de
       push bc
       push af

okay   ld sp,(stptr)       ; restore stack pointer.
       ret
Scrolling the Buffer
Now we have our dummy screen, we can do anything we like to it without the risk of flicker or other graphical anomalies, because we only transfer the buffer to the physical screen when we have finished building the picture. We can place sprites, masked or otherwise, anywhere we like and in any order we like. We can move the screen around, and animate the background graphics, and most importantly, we can now scroll in any direction.
Different techniques are required for different types of scrolling, although they all have one thing in common: as scrolling is a processor-intensive task, unrolled loops are the order of the day. The simplest type of scroll is a left/right single pixel scroll. A right single pixel scroll requires us to set the HL register pair to the start of the buffer, then execute the following two instructions over and over again until we reach the end of the buffer:
       rr (hl)             ; rotate carry flag and 8 bits right.
       inc hl              ; next buffer address.
Similarly, to execute a left single-pixel scroll we set hl to the last byte of the buffer and execute these two instructions until we reach the beginning of the buffer:
       rl (hl)             ; rotate carry flag and 8 bits left.
       dec hl              ; next buffer address.
For most of the time, however, we can get away with only incrementing or decrementing the l register, instead of the HL pair, speeding up the routine even more. This does have the drawback of having to know exactly when the high order byte of the address changes. For this reason, I usually set my buffer address in stone right at the beginning of the project, often at the very top of RAM, so I don’t have to rewrite the scrolling routines when things get shifted around during the course of a project. As with the routine to transfer the buffer to the physical screen, a massive unrolled loop is very expensive in terms of RAM, so it is a good idea to write a smaller unrolled loop which scrolls, say, 256 bytes at a time, then call it 20 or so times, depending upon the chosen buffer size.
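For the sake of space, here is the right scroll written as an ordinary loop over a single buffer line rather than unrolled. It is only a sketch: the 22-byte line width is an assumption carried over from the earlier example, and the label scrlr is made up. A version written for speed would unroll the loop and use inc l where the high byte is known not to change.

```asm
; Scroll one 22-byte buffer line right by a single pixel.
; On entry hl = address of the first (leftmost) byte of the line.
scrlr  and a               ; clear carry so a blank pixel enters at the left.
       ld b,22             ; bytes per line.
scrlr0 rr (hl)             ; carry into bit 7, bit 0 out into carry.
       inc hl              ; next byte, the carry flag is unaffected.
       djnz scrlr0         ; djnz leaves the flags alone too.
       ret
```

The scroll only works because neither inc hl nor djnz disturbs the carry flag, so the pixel shifted out of one byte is still waiting in carry when the next rr (hl) executes.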
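In addition to scrolling one pixel at a time, we can scroll four pixels fairly quickly too. By replacing rl (hl) with rld in the left scroll, and rr (hl) with rrd in the right scroll, we can move four pixels at a time. Here the four bits shifted out of each byte are carried over to the next in the low nibble of the accumulator rather than in the carry flag.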
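A sketch of the four-pixel left scroll for one buffer line might look like this, again assuming a 22-byte line and using a made-up label.

```asm
; Scroll one 22-byte buffer line left by four pixels.
; On entry hl = address of the last (rightmost) byte of the line.
scrl4  xor a               ; four blank pixels enter at the right edge.
       ld b,22             ; bytes per line.
scrl40 rld                 ; low nibble of (hl) moves to high nibble,
                           ; low nibble of a fills the gap, and the
                           ; old high nibble of (hl) ends up in a.
       dec hl              ; move left to the previous byte.
       djnz scrl40         ; repeat for the whole line.
       ret
```

The accumulator must not be used for anything else inside the loop, which is why the byte count is kept in B.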
Vertical scrolling is done by shifting bytes around in RAM, in much the same way as the routine to transfer the dummy screen to the physical one. To scroll up one pixel, we set our FROM address to be the start of the second pixel line, the TO address to the address of the start of the buffer, then copy the data from the FROM address to the TO address until we reach the end of the buffer. To scroll down, we have to work in the opposite direction, so we set our FROM address to the end of the penultimate line of the buffer, our TO address to the end of the last line, and work backwards until we reach the start of the buffer. The added advantage of vertical scrolling is that we can scroll up or down by more than one line, simply by altering the addresses, and the routine will run just as quickly. Generally speaking, it isn’t a good idea to scroll by more than one pixel if your frame rate is lower than 25 frames per second, because the screen will appear to judder.
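In its simplest form this can be sketched with LDIR and LDDR. The buffer address and dimensions below are assumptions chosen to match the earlier examples, and the labels are hypothetical. A faster version would replace the block instructions with unrolled LDI or LDD instructions.

```asm
buffer equ 49152           ; assumed address of the dummy screen.
BUFW   equ 22              ; assumed buffer width in bytes.
BUFH   equ 192             ; assumed buffer height in pixel lines.

; Scroll the buffer up one pixel line.
scrlup ld hl,buffer+BUFW   ; FROM: start of the second line.
       ld de,buffer        ; TO: start of the buffer.
       ld bc,BUFW*(BUFH-1) ; every line except one.
       ldir                ; copy working forwards.
       ret

; Scroll the buffer down one pixel line.
scrldn ld hl,buffer+(BUFW*(BUFH-1))-1 ; FROM: end of the penultimate line.
       ld de,buffer+(BUFW*BUFH)-1     ; TO: end of the last line.
       ld bc,BUFW*(BUFH-1) ; every line except one.
       lddr                ; copy working backwards.
       ret
```

After either routine, one line at the edge still holds its old contents and needs to be filled with whatever is scrolling into view.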
There is one other technique that can be employed with vertical scrolling, and it is one I employed when writing Megablast for Your Sinclair. This involves treating the dummy screen as wrap-around. In other words, you still use the same amount of RAM for the dummy buffer, but the part of the buffer from which you start copying to the top of the screen can change from one frame to the next. When you reach the end of the buffer, you skip back to the beginning. With this system, the routine to copy the buffer takes the address of the start of the buffer from a 16-bit pointer which could point to any line in the buffer, and copies the data to the physical screen line by line until it reaches the end of the buffer. At this point, the routine copies the data from the start of the buffer to the remainder of the physical screen. This makes the transfer routine a little slower, and complicates any other graphics routines – which also have to go back to the first line whenever they go beyond the last line in the buffer. It does, on the other hand, mean that no data needs to be shifted in order to scroll the screen. By changing the 16-bit pointer to the line which is first copied to the physical screen, scrolling is done automatically when the buffer is transferred.
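The pointer update at the heart of this technique could be sketched as follows. This is not the Megablast code, just an illustration under the same assumed buffer layout as before, with hypothetical labels.

```asm
buffer equ 49152           ; assumed address of the dummy screen.
BUFW   equ 22              ; assumed buffer width in bytes.
BUFH   equ 192             ; assumed buffer height in pixel lines.
bufend equ buffer+(BUFW*BUFH)

; Move the first displayed line on by one, wrapping at the end.
scrlwr ld hl,(topptr)      ; line currently copied to the top of the screen.
       ld de,BUFW
       add hl,de           ; move on one buffer line.
       ld de,bufend
       and a               ; clear carry before the subtraction.
       sbc hl,de           ; compare hl with the end of the buffer...
       add hl,de           ; ...then restore it, carry set if hl < bufend.
       jr c,scrlw0         ; still inside the buffer.
       ld hl,buffer        ; gone off the end, so wrap to the first line.
scrlw0 ld (topptr),hl      ; store for the transfer routine.
       ret
topptr defw buffer         ; first buffer line to be displayed.
```

The transfer routine would then start copying from the address held in topptr instead of from a fixed buffer address.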
Simple Text and Graphics
So you’ve read the Z80 documentation, you know how the instructions affect the registers and now you want to put this knowledge to use. Judging by the number of emails I have received asking how to read the keyboard, calculate screen addresses or emit white noise from the beeper it has become clear that there really isn’t much in the way of resources for the new Spectrum programmer. This document, I hope, will grow to fill this void in due course. In its present state it is clearly years from completion, but in publishing the few basic chapters that exist to date I hope it will be of help to other programmers.
The ZX Spectrum was launched in April 1982, and by today’s standards is a primitive machine. In the United Kingdom and a few other countries it was the most popular games machine of the 1980s, and through the joys of emulation many people are enjoying a nostalgic trip back in time with the games of their childhoods. Others are only now discovering the machine for the first time, and some are even taking up the challenge of writing games for this simple little computer. After all, if you can write a decent machine code game for a 1980s computer there probably isn’t much you couldn’t write.
Purists will hate this document, but writing a game isn’t about writing “perfect” Z80 code – as if there were such a thing. A Spectrum game is a substantial undertaking, and you won’t get around to finishing it if you are too obsessed with writing the very best scoring or keyboard reading algorithms. Once you’ve written a routine that works and doesn’t cause problems elsewhere, move on to the next routine. It doesn’t matter if it’s a little messy or inefficient, because the important part is to get the gameplay right. Nobody in his right mind is going to disassemble your code and pick faults with it.
The chapters in this document have been ordered in a way designed to enable the reader to start writing a simple game as soon as possible. Nothing beats the thrill of writing your first full machine-code game, and I have set out this manual in such a way as to cover the very basic minimum requirements for this in the first few chapters. From there we move on to cover more advanced methods which should enable the reader to improve the quality of games he is capable of writing.
Throughout this document a number of assumptions have been made. For a start, it is assumed that the reader is familiar with most Z80 opcodes and what they do. If not there are plenty of guides around which will explain these far better than I could ever do. Learning machine code instructions isn’t difficult, but knowing how to put them together in meaningful ways can be. Familiarity with the load (ld), compare (cp), and conditional jump (jp z / jp c / jp nc) instructions is a good place to start. The rest will fall into place once these are learned.
These days we have the benefit of more sophisticated hardware, and there is no need to develop software on the machine for which it is intended. There are plenty of adequate cross-assemblers around which will allow Spectrum software to be developed on a PC and the binary file produced can then be imported into an emulator – SPIN is a popular emulator which has support for this feature.
For graphics there’s a tool called SevenUp which I use, and can thoroughly recommend. This can convert bitmaps into Spectrum images, and allows the programmer to specify the order in which sprites or other graphics are sorted. Output can be in the form of a binary image, or source code. Another popular program is TommyGun.
Music wise I’d recommend the SoundTracker utility which can be downloaded from the World of Spectrum archives. There’s a separate compiler program you’ll also need. Bear in mind that these are Spectrum programs, not PC tools and need to be run on an emulator.
As editors and cross-compilers go I am not in a position to recommend the best available, because I use an archaic editor and Z80 Macro cross-assembler written in 1985, running in DOS windows. Neither are tools I would recommend to others. If you require advice on which tools might be suitable for you, I suggest you try the World of Spectrum development forums. This friendly community has a wide range of experience and is always willing to help.
Over the many years that I have been writing Spectrum software a number of habits have formed which may seem odd. The way I order my coordinates, for example, does not follow the conventions of mathematics. My machine code programs follow the Sinclair BASIC convention of PRINT AT x,y; where x refers to the number of character cells or pixels from the top of the screen and y is the number of characters or pixels from the left edge. If this seems confusing at first I apologise, but it always seemed a more logical way of ordering things and it just stuck with me. Some of my methodology may seem unusual in places, so where you can devise a better way of doing something by all means go with that instead.
One other thing: commenting your code as you go along is important, if not essential. It can be hellishly difficult trying to find a bug in an uncommented routine you wrote only a few weeks ago. It may seem tedious to have to document every subroutine you write, but it will save development time in the long run. In addition, should you wish to re-use a routine in another game at some point in the future, it will be very easy to rip out the required section and adapt it for your next project.
Other than that, just have fun. If you have any suggestions to make or errors to report, please get in touch.
Jonathan Cauldwell, January 2007.
The first BASIC program that most novice programmers write is usually along these lines:
10 PRINT "Hello World"
20 GOTO 10
Alright, so the text may differ. Your first effort may have said “Dave is ace” or “Rob woz ere”, but let’s face it, displaying text and graphics on screen is probably the most important aspect of writing any computer game and – with the exception of pinball or fruit machines – it is practically impossible to conceive a game without a display. With this in mind let us begin this tutorial with some important display routines in the Spectrum ROM.
So how would we go about converting the above BASIC program to machine code? Well, we can PRINT by using the RST 16 instruction – effectively the same as PRINT CHR$ a – but that merely prints the character held in the accumulator to the current channel. To print a string on screen, we need to call two routines – one to open the upper screen for printing (channel 2), then the second to print the string. The routine at ROM address 5633 will open the channel number we pass in the accumulator, and 8252 will print a string beginning at de with length bc to this channel. Once channel 2 is opened, all printing is sent to the upper screen until we call 5633 with another value to send output elsewhere. Other interesting channels are 1 for the lower screen (like PRINT #1 in BASIC, and we can use this to display on the bottom two lines) and 3 for the ZX Printer.
       ld a,2              ; upper screen
       call 5633           ; open channel
loop   ld de,string        ; address of string
       ld bc,eostr-string  ; length of string to print
       call 8252           ; print our string
       jp loop             ; repeat until screen is full
string defb '(your name) is cool'
eostr  equ $
Running this listing fills the screen with the text until the scroll? prompt is displayed at the bottom. You will note however, that instead of each line of text appearing on a line of its own as in the BASIC listing, the beginning of each string follows directly on from the end of the previous one, which is not exactly what we wanted. To start each string on a fresh line we need to throw a line ourselves using an ASCII control code. One way of doing this would be to load the accumulator with the code for a new line (13), then use RST 16 to print this code. Another more efficient way is to add this ASCII code to the end of our string thus:
string defb '(your name) is cool'
       defb 13
eostr  equ $
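For comparison, the first approach mentioned above, printing the newline code separately with RST 16, might look like this. It is the same program as before with two extra instructions in the loop.

```asm
       ld a,2              ; upper screen
       call 5633           ; open channel
loop   ld de,string        ; address of string
       ld bc,eostr-string  ; length of string to print
       call 8252           ; print our string
       ld a,13             ; ASCII code for new line.
       rst 16              ; print it.
       jp loop             ; repeat until screen is full
string defb '(your name) is cool'
eostr  equ $
```

This costs three bytes every time a line is thrown, against one byte when the code is stored in the string itself.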
There are a number of ASCII control codes like this which alter the current printing position, colours etc. and experimentation will help you to decide which ones you yourself will find most useful. Here are the main ones I use:
13 NEWLINE Sets print position to the beginning of the next line.
16,c INK Sets ink colour to the value of the following byte.
17,c PAPER Sets paper colour to the value of the following byte.
22,x,y AT Sets print x and y coordinates to the values specified in the following two bytes.
Code 22 is particularly handy for setting the coordinates at which a string or graphic character is to be displayed. This example will display an exclamation mark in the bottom right of the screen:
       ld a,2              ; upper screen
       call 5633           ; open channel
       ld de,string        ; address of string
       ld bc,eostr-string  ; length of string to print
       call 8252           ; print our string
       ret
string defb 22,21,31,'!'
eostr  equ $
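The colour codes can be combined with AT in the same string. The following sketch should print a message in red ink on yellow paper near the middle of the screen. The message text and coordinates are arbitrary choices for the example.

```asm
       ld a,2              ; upper screen
       call 5633           ; open channel
       ld de,string        ; address of string
       ld bc,eostr-string  ; length of string to print
       call 8252           ; print our string
       ret
string defb 16,2           ; INK 2 (red).
       defb 17,6           ; PAPER 6 (yellow).
       defb 22,10,12       ; AT row 10, column 12.
       defb 'GAME OVER'
eostr  equ $
```

Each control code and its parameters count towards the string length, which is why it is convenient to let the assembler work the length out with eostr-string.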
This program goes one step further and animates an asterisk from the bottom to the top of the screen:
       ld a,2              ; 2 = upper screen.
       call 5633           ; open channel.
       ld a,21             ; row 21 = bottom of screen.
       ld (xcoord),a       ; set initial x coordinate.
loop   call setxy          ; set up our x/y coords.
       ld a,'*'            ; want an asterisk here.
       rst 16              ; display it.
       call delay          ; want a delay.
       call setxy          ; set up our x/y coords.
       ld a,32             ; ASCII code for space.
       rst 16              ; delete old asterisk.
       call setxy          ; set up our x/y coords.
       ld hl,xcoord        ; vertical position.
       dec (hl)            ; move it up one line.
       ld a,(xcoord)       ; where is it now?
       cp 255              ; past top of screen yet?
       jr nz,loop          ; no, carry on.
       ret
delay  ld b,10             ; length of delay.
delay0 halt                ; wait for an interrupt.
       djnz delay0         ; loop.
       ret                 ; return.
setxy  ld a,22             ; ASCII control code for AT.
       rst 16              ; print it.
       ld a,(xcoord)       ; vertical position.
       rst 16              ; print it.
       ld a,(ycoord)       ; y coordinate.
       rst 16              ; print it.
       ret
xcoord defb 0
ycoord defb 15
Printing Simple Graphics
Moving asterisks around the screen is all very fine but for even the simplest game we really need to display graphics. Advanced graphics are discussed in later chapters, for now we will only be using simple Space Invader type graphics, and as any BASIC programmer will tell you, the Spectrum has a very simple mechanism for this – the User Defined Graphic, usually abbreviated to UDG.
The Spectrum’s ASCII table contains 21 (19 in 128k mode) user-defined graphics characters, beginning at code 144 and going on up to 164 (162 in 128k mode). In BASIC UDGs are defined by poking data into the UDG area at the top of RAM, but in machine code it makes more sense to change the system variable which points to the memory location at which the UDGs are stored, which is done by changing the two-byte value at address 23675.
We can now modify our moving asterisk program to display a graphic instead with a few changes: two new instructions at the start which point the UDG system variable at our own data, character code 144 in place of the asterisk, and the graphic data itself at the label udgs.
       ld hl,udgs          ; UDGs.
       ld (23675),hl       ; set up UDG system variable.
       ld a,2              ; 2 = upper screen.
       call 5633           ; open channel.
       ld a,21             ; row 21 = bottom of screen.
       ld (xcoord),a       ; set initial x coordinate.
loop   call setxy          ; set up our x/y coords.
       ld a,144            ; show UDG instead of asterisk.
       rst 16              ; display it.
       call delay          ; want a delay.
       call setxy          ; set up our x/y coords.
       ld a,32             ; ASCII code for space.
       rst 16              ; delete old asterisk.
       call setxy          ; set up our x/y coords.
       ld hl,xcoord        ; vertical position.
       dec (hl)            ; move it up one line.
       ld a,(xcoord)       ; where is it now?
       cp 255              ; past top of screen yet?
       jr nz,loop          ; no, carry on.
       ret
delay  ld b,10             ; length of delay.
delay0 halt                ; wait for an interrupt.
       djnz delay0         ; loop.
       ret                 ; return.
setxy  ld a,22             ; ASCII control code for AT.
       rst 16              ; print it.
       ld a,(xcoord)       ; vertical position.
       rst 16              ; print it.
       ld a,(ycoord)       ; y coordinate.
       rst 16              ; print it.
       ret
xcoord defb 0
ycoord defb 15
udgs   defb 60,126,219,153
       defb 255,255,219,219
As Rolf Harris used to say: “Can you tell what it is yet?”
Of course, there’s no reason why you couldn’t use more than the 21 UDGs if you wished. Simply set up a number of banks of them in memory and point to each one as you need it.
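As a small sketch of the idea, the following prints character 144 twice, re-pointing the UDG system variable in between so that a different graphic appears each time. The labels bank1 and bank2 and the second graphic's data are invented for the example.

```asm
       ld a,2              ; 2 = upper screen.
       call 5633           ; open channel.
       ld hl,bank1         ; first bank of graphics.
       ld (23675),hl       ; set up UDG system variable.
       ld a,144            ; first UDG.
       rst 16              ; prints the first graphic of bank 1.
       ld hl,bank2         ; second bank of graphics.
       ld (23675),hl       ; point UDGs at it.
       ld a,144            ; same character code...
       rst 16              ; ...now the first graphic of bank 2.
       ret
bank1  defb 60,126,219,153
       defb 255,255,219,219
bank2  defb 24,60,126,255
       defb 255,126,60,24
```

Graphics already on the screen are not affected when the pointer changes, because it is only consulted at the moment a character is printed.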
Alternatively, you could redefine the character set instead. This gives a larger range of ASCII characters from 32 (SPACE) to 127 (the copyright symbol). You could even mix text and graphics, redefining the letters and numbers of your font to the style of your choice, then using up the symbols and lowercase letters for aliens, zombies or whatever your game requires. To point to another set we subtract 256 from the address at which the font is placed and place this in the two byte system variable at address 23606. The default Sinclair font for example is located at ROM address 15616, so the system variable at address 23606 points to 15360 when the Spectrum is first switched on.
This code copies the Sinclair ROM font to RAM making it “bolder” as it goes, then sets the system variable to point to it:
       ld hl,15616         ; ROM font.
       ld de,60000         ; address of our font.
       ld bc,768           ; 96 chars * 8 rows to alter.
font1  ld a,(hl)           ; get bitmap.
       rlca                ; rotate it left.
       or (hl)             ; combine 2 images.
       ld (de),a           ; write to new font.
       inc hl              ; next byte of old.
       inc de              ; next byte of new.
       dec bc              ; decrement counter.
       ld a,b              ; high byte.
       or c                ; combine with low byte.
       jr nz,font1         ; repeat until bc=zero.
       ld hl,60000-256     ; font minus 32*8.
       ld (23606),hl       ; point to new font.
       ret
For most games it is better to define the player’s score as a string of ASCII digits, although that does mean more work in the scoring routines and makes high score tables a real pain in the backside for an inexperienced assembly language programmer. We will cover this in a later chapter, but for now we’ll use some handy ROM routines to print numbers for us.
There are two ways of printing a number on the screen, the first of which is to make use of the same routine that the ROM uses to print Sinclair BASIC line numbers. For this we simply load the bc register pair with the number we wish to print, then call 6683:
       ld bc,(score)
       call 6683
However, since BASIC line numbers can go only as high as 9999, this has the disadvantage of only being capable of displaying a four digit number. Once the player’s score reaches 10000 other ASCII characters are displayed in place of numbers. Fortunately, there is another method which goes much higher. Instead of calling the line number display routine we can call the routine to place the contents of the bc registers on the calculator stack, then another routine which displays the number at the top of this stack. Don’t worry about what the calculator stack is and what its function is because it’s of little use to an arcade games programmer, but where we can make use of it we will. Just remember that the following three lines will display a number from 0 to 65535 inclusive:
       ld bc,(score)
       call 11563          ; stack number in bc.
       call 11747          ; display top of calc. stack.
To set the permanent ink, paper, brightness and flash levels we can write directly to the system variable at 23693, then clear the screen with a call to the ROM:
; We want a yellow screen.
       ld a,49             ; blue ink (1) on yellow paper (6*8).
       ld (23693),a        ; set our screen colours.
       call 3503           ; clear the screen.
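The value written to 23693 is an ordinary attribute byte, so brightness and flash can be set in the same way. As a sketch, with the colours chosen arbitrarily, bright white ink on blue paper works out like this:

```asm
; attribute = (flash*128) + (bright*64) + (paper*8) + ink
       ld a,79             ; 64 (bright) + 1*8 (blue paper) + 7 (white ink).
       ld (23693),a        ; set our screen colours.
       call 3503           ; clear the screen.
       ret
```

Adding 128 to the value would make everything printed afterwards flash as well.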
The quickest and simplest way to set the border colour is to write to port 254. The 3 least significant bits of the byte we send determine the colour, so to set the border to red:
       ld a,2              ; 2 is the code for red.
       out (254),a         ; write to port 254.
Port 254 also drives the Mic socket (bit 3) and the speaker (bit 4). However, the border effect will only last until your next call to the beeper sound routine in the ROM (more on that later), so a more permanent solution is required. To do this, we simply need to load the accumulator with the colour required and call the ROM routine at 8859. This will change the colour and set the BORDCR system variable (located at address 23624) accordingly. To set a permanent red border we can do this:
       ld a,2              ; 2 is the code for red.
       call 8859           ; set border colour.