Assembly in Basic II – Simple utility function

As we all know, Assembly programs perform most tasks much faster than Basic programs: Assembly is translated to machine code ahead of time whereas Basic is interpreted, and Assembly lets us optimize the code much closer to the actual hardware.
On the other hand, coding in Assembly can be much more time consuming, and complex applications can become very long and complicated to manage.
So in our journey into Commander X16 programming we will jump into it only halfway. Basic is fast enough for many types of programs, even games. However, some functionality is impossible to write in Basic so that it executes fast enough for smooth gameplay. So for our next challenge we will write one pretty simple function in Assembly and learn how to compile it, load it from a Basic program and call it from Basic in order to improve the overall performance of our program.
An area where Basic is lacking the most, and coincidentally one that is also pretty easy to demonstrate, is almost anything related to the display. In fact, one of the first issues when writing a game in Basic is how to fill large areas of the screen with some pattern, or even just how to clear it. So I decided to write a simple RectFill function to demonstrate such a Basic-Assembly hybrid program.

RectFill Function

The requirements for the function are pretty simple. We should be able to draw a rectangle filled with any PETSCII character and any color attribute by defining the starting x and y position and the width and height. The same function can therefore be used to clear the screen without using PRINT CHR$(147), and at the same time change the foreground and background color of the screen. We can also change the pattern in just parts of the screen, or even draw character based vertical or horizontal lines and stripes of different colors.
To achieve that we will look at how to pass parameters to an Assembly function from Basic, how to load the Assembly program and how to call it. Oh, and of course we will also write the function and compile it using the Visual Studio Code editor and the Retro Assembler we set up in Part I.
It would also be very useful to read the blog post Direct VERA Access, where we describe how to access video memory from Assembly (or Basic) directly.

Passing parameters

The simplest and easiest way to pass any kind of data from Basic to Assembly and back is to use the memory directly. As we know, we can use the POKE command in Basic to write to any address the CPU has access to, and with PEEK we can read from any address. Obviously Assembly programs have the same reach. Of course not all memory is created equal. We will not go into details about the memory map and how to utilize banking and similar advanced features.
However, there is one area of the memory that is especially interesting. The 6502 family of processors uses the so-called Zero Page for some very convenient and fast addressing modes. Zero Page means addresses from $00 to $FF, in other words addresses that can be expressed with a single byte. Instructions using such addresses are shorter and therefore faster to execute. In our example that is not really that important, but I thought it is a good place to mention it. You are of course free to use some other memory location to pass parameters by modifying the sample code.
Based on the current documentation, the area that is not used by the KERNAL or any other system function, and is therefore free for us to use, is from $02 to $7F. So let’s decide on the following addresses:

Address Value
$02 X of the start point
$03 Y of the start point
$04 Width of the rectangular area
$05 Height of the rectangular area
$06 Character code to be written into Video memory
$07 Attribute value to be written into Video memory

With that decided, it is very simple to pass parameters from Basic. We use POKE commands to set everything up. For example, to draw a square checkered area in the center of the screen in the Default Screen Mode $02, starting at location 20,10 with a width and height of 40 characters, filled with the checkered character in white ink on a black background:

POKE $02,20
POKE $03,10
POKE $04,40
POKE $05,40
POKE $06,102
POKE $07,$01
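
The same parameter block can also be written down as data. The following Python sketch is only an illustration and not part of the original program: the names PARAMS, attribute and poke_lines are made up here. It also assumes the usual text mode attribute layout, with the background color in the high nibble and the foreground color in the low nibble. It generates the six POKE lines from named values and rejects anything that does not fit into one byte.

```python
# Zero Page parameter block, using the addresses from the table above.
PARAMS = {
    "x": 0x02,
    "y": 0x03,
    "width": 0x04,
    "height": 0x05,
    "char": 0x06,
    "attr": 0x07,
}


def attribute(foreground, background):
    """Combine two 4-bit color indexes into one attribute byte.

    Assumption: background in the high nibble, foreground in the low nibble.
    """
    if not (0 <= foreground <= 15 and 0 <= background <= 15):
        raise ValueError("color indexes must be in 0..15")
    return (background << 4) | foreground


def poke_lines(**values):
    """Return one Basic POKE statement per parameter, in address order."""
    lines = []
    for name, address in sorted(PARAMS.items(), key=lambda item: item[1]):
        value = values[name]
        if not 0 <= value <= 255:
            raise ValueError(f"{name} does not fit into one byte: {value}")
        lines.append(f"POKE ${address:02X},{value}")
    return lines


# White (1) on black (0), checkered character 102, as in the example above.
example = poke_lines(x=20, y=10, width=40, height=40,
                     char=102, attr=attribute(1, 0))
print("\n".join(example))
```

A helper like this is mostly useful for catching out-of-range values before they are typed into the emulator.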

Assembly Code

Let’s write the code that will read these parameters and write to the VERA registers, which in turn changes Video Memory. For easier readability let’s first define some labels and assign them constant values:

Listing 1 = Labels

The first group of labels is used for addressing VERA registers: three registers that together hold the 20 bit address, then in line 4 the location of the Data 0 register, and finally the VERA Control register.
In the second group of labels we define the memory locations in Zero Page from which we will read the parameters. Those are the same locations we discussed above and for which we already prepared some POKEs to write the proper values.
Before starting to write the code we have to talk some more about the memory map in order to decide where to put our Assembly program. In the official Commander X16 documentation we see the following map:

Addresses Description
$0000-$007F User zero page
$0080-$00FF KERNAL and BASIC zero page variables
$0100-$01FF CPU stack
$0200-$03FF KERNAL and BASIC variables, vectors
$0400-$07FF Available for machine code programs or custom data storage
$0800-$9EFF BASIC program/variables; available to the user

As we see, there is a section perfect for our needs. We obviously want to leave the Basic memory free for the Basic part of our program, we definitely don’t want to mess with any of the system areas used by the KERNAL or Basic, and Zero Page is to be used sparingly. That leaves the area from $0400 to $07FF. That is 1024 bytes of space, which is more than enough for such a simple function and cannot be overwritten by Basic programs.
So we will use the following syntax to tell our assembler that we want the program to start at $0400:

* = $0400

Let’s look at the whole Assembly program now:

Before we go into details, let’s take a quick glance at the structure. As we see, we have two labels, one for columns (col) and the other for rows (row). So clearly we have (as expected) a double loop to draw the filled rectangle. The inner loop draws the required number of characters horizontally as defined by the Width parameter, and the outer loop makes sure we draw the required number of rows as set by the Height parameter.
Now let’s analyze the code line by line…
Line 19 simply sets the VERA control register to 0, meaning we don’t want to reset it and we want to use Data Register 0 for data transfer to Video Memory.
To write the characters to Video memory we have to calculate the starting address. Remember the formula Y*256+X*2 we used with VPOKE in Basic; the same thing has to be done in Assembly. The only difference is that in Assembly we have to deal with each byte separately, since we can’t manipulate 16 bit values directly. Therefore we reserved two bytes in Zero Page memory where we will store the calculated 16 bit address inside Video memory.
In lines 21-23 we calculate the Low byte of the address. We transfer the X parameter value into the Accumulator, multiply it by two (like in Basic) and store it into the Low Byte address. The multiplication by two is done in line 21 by shifting the bits one position to the left. Because the maximum X is 79, the highest bit is always 0, so no bit is lost at the top and we can use the Rotate Left (rol) instruction. Note that rol moves the current Carry flag into bit 0, so the result is exactly X*2 only if Carry is clear at that point. Arithmetic Shift Left (asl) always shifts a 0 into bit 0, does not depend on Carry and takes the same number of CPU cycles, so it is the safer alternative.
In lines 24 and 25 we “calculate” the High address byte. As you see, there is really no calculation required because of how Screen modes 0 and 2 are set up: each line uses 256 bytes, so by writing Y to the High byte we “automatically” multiply it by 256, which is very convenient indeed.
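
To make the address arithmetic concrete, here is a small Python model of the two shift instructions and of the resulting address bytes. It is a sketch for illustration only: the function names are invented here, and it assumes the 256 bytes per row layout described above. It shows that asl yields X*2 regardless of the Carry flag, while rol yields X*2 only when Carry comes in clear.

```python
def asl(value):
    """Arithmetic Shift Left on one byte: bit 0 always becomes 0.

    Returns (result, carry_out).
    """
    return (value << 1) & 0xFF, (value >> 7) & 1


def rol(value, carry_in):
    """Rotate Left on one byte: bit 0 receives the incoming Carry flag.

    Returns (result, carry_out).
    """
    return ((value << 1) & 0xFF) | (carry_in & 1), (value >> 7) & 1


def vram_address_bytes(x, y):
    """Low and high address bytes of cell (x, y) with 256 bytes per row."""
    low, _ = asl(x)  # X * 2: each cell is a character byte plus an attribute byte
    high = y         # Y * 256: the row number simply becomes the high byte
    return low, high


low, high = vram_address_bytes(20, 10)
print(hex(high * 256 + low))  # 0xa28, the same as 10*256 + 20*2

print(rol(20, 0))  # (40, 0): Carry clear, same result as asl
print(rol(20, 1))  # (41, 0): Carry set, the address is off by one byte
```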
We will have to keep track of rows and columns, so we load Width into register X. Because we only have three registers and will need two of them later, we store Height into Zero Page location $0A (labeled as Ycounter) in lines 27-29.
In lines 31-32 we tell the VERA chip that we will be using increment 1 when writing to the Data transfer register. It means that the Video memory pointer will be incremented by one after every write (or read, if we were reading from it). This is because we decided to write both the character code and the attribute. We could easily modify this function to change just the character or just the attribute; in that case the appropriate increment would be two.
Next we start the first loop. In lines 34-37 we set the starting memory address of the first character of the row we are writing to. In the beginning that is of course the first row, but the address is later incremented for each subsequent row.
Next we have to load the values we will be writing to the screen. We load the character code into the Accumulator and the attribute value into the Index Y register. We do that in lines 39-40.
In lines 42-45 we do the inner loop. We simply write the character code and the attribute to VERA data register 0 and let VERA increment the address automatically. We just need to make sure we do the correct number of iterations, and we do that by decrementing the Index X register and returning to the beginning of the inner loop until it becomes 0.
The remaining code in lines 47-50 prepares for the next row. We advance the memory location by 256 by incrementing the High byte by 1, load the Index X register with a fresh Width value, decrement the height counter and return to the beginning of the outer loop if we haven’t reached 0.
Line 52 returns control back to the Basic program.
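
The walkthrough above can be summarized as a short Python model of the double loop. This is a sketch under stated assumptions, not the Assembly listing itself. A bytearray stands in for video memory, the name rect_fill and the variable names are chosen here for illustration, rows are assumed to be 256 bytes apart, and width and height are assumed to be at least 1.

```python
ROW_BYTES = 256  # bytes per text row in the layout described above


def rect_fill(vram, x, y, width, height, char, attr):
    """Fill a rectangle of character cells in a bytearray that models video memory.

    Assumes width >= 1 and height >= 1.
    """
    low = (x * 2) & 0xFF          # low address byte: X * 2
    high = y                      # high address byte: Y * 256
    rows_left = height            # plays the role of Ycounter
    while rows_left > 0:          # outer loop: one pass per row
        pointer = high * ROW_BYTES + low  # address of the first cell in this row
        cols_left = width         # plays the role of the X register
        while cols_left > 0:      # inner loop: one cell per pass
            vram[pointer] = char  # first write: character code
            pointer += 1          # models the automatic increment by 1
            vram[pointer] = attr  # second write: attribute
            pointer += 1
            cols_left -= 1
        high += 1                 # the next row starts 256 bytes further on
        rows_left -= 1
    return vram


vram = bytearray(ROW_BYTES * 60)  # 60 rows of 256 bytes, all zero
rect_fill(vram, x=20, y=10, width=40, height=40, char=102, attr=0x01)

first_cell = 10 * ROW_BYTES + 20 * 2
print(vram[first_cell], vram[first_cell + 1])  # 102 1
```

Because the pointer advances by one after every write, the character and attribute bytes end up interleaved exactly as the text screen expects them.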

Compiling the code

If you set up the environment as described in the previous post, compiling should be pretty straightforward. There is one more thing we didn’t talk about. The Retro Assembler is able to compile Assembly for several types of CPUs, including 6502 varieties. In our code we used some instructions that are specific to the 65C02, so we have to tell that to the assembler. There are several ways to do it, but I like the approach of including it in the file name by adding the extension .65C02.asm.
So open Visual Studio Code, copy the Assembly source code from Github or from above and paste it into the editor. Save the file as RectFill.65C02.asm and compile it using the shortcuts you defined (I use F4 and F5). You should get a .prg file that is 57 bytes long, with the first two bytes containing the address $0400 so the Commander X16 knows where to load it.
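
If you want to double check the result, the two byte header is easy to inspect. The Python sketch below is an illustration with made-up helper names (inspect_prg, fits_user_region). It assumes the load address is stored low byte first, and it works on a placeholder byte string rather than on the real compiled file.

```python
import struct

USER_REGION = (0x0400, 0x07FF)  # range quoted from the memory map above


def inspect_prg(data):
    """Return (load_address, code_length, end_address) for the bytes of a .prg file."""
    if len(data) < 3:
        raise ValueError("expected a 2-byte header and at least 1 byte of code")
    (load_address,) = struct.unpack_from("<H", data, 0)  # little-endian 16 bit value
    code_length = len(data) - 2
    end_address = load_address + code_length - 1
    return load_address, code_length, end_address


def fits_user_region(data):
    """True if the whole program lies inside $0400-$07FF."""
    load_address, _, end_address = inspect_prg(data)
    return USER_REGION[0] <= load_address and end_address <= USER_REGION[1]


# Stand-in for RectFill.prg: the header for $0400 followed by 55 placeholder bytes.
fake_prg = bytes([0x00, 0x04]) + bytes(55)
load, length, end = inspect_prg(fake_prg)
print(f"${load:04X} {length} ${end:04X}")  # $0400 55 $0436
print(fits_user_region(fake_prg))          # True
```

For the real file, the bytes could be read with open("RectFill.prg", "rb").read() and passed to the same functions.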
If you haven’t set up the Assembly environment yet, I will provide a link to download the compiled file below.

Loading and calling the Function from Basic

We have several ways to load the Assembly program into the Commander X16 Emulator. We can start the emulator with the -prg parameter and it will automatically load the Assembly program at the desired location. The actual command line will of course depend on your environment, but if you put the compiled prg file in the same directory as the Emulator executable the simplest call would be:

x16emu -prg RectFill.prg

Because we encoded the start address in the file itself, the emulator knows where to load it.
We can test it by setting the parameters using the six POKE commands from above and calling the Assembly code using:

SYS $0400

And we should see the following screen:

Alternatively we can also load the Assembly program from Basic directly using a Load command:


I haven’t had a chance to go very deep into the LOAD command, so I am not sure of all the parameters and what they mean. I expected the default device to be 1, but that doesn’t work. With the current version 36 of the Emulator the above version works.

To get a feel for the speed of the function we wrote, let’s make a simple Basic program that fills random rectangles on the screen in a loop. Source code is below:

And the result is as follows:

The speed is clearly a huge improvement that no trickery in Basic can come close to, and with such a hybrid approach the possibilities for game development open up significantly. I hope this will encourage you to experiment some more, and I would like to see what cool Assembly functions you come up with.

Back to Assembly in Basic I – Setting Up

