20170922

Aline: Colors in sync

Summary

Traditional challenges associated with using multicolor in ZX Spectrum programs include:
  1. Writing precisely timed display code.
  2. Adjusting the synchronization timings by hand to execute it at the right moment.
  3. Keeping the rest of your program from influencing the overall timing.
Aline, the technique presented in this article, aims to get rid of problems 2 and 3 altogether.

 

Explanation

Aline is the idea of synchronization by floating bus reads taken to its logical conclusion. Reading from an idle 'ULA' port was originally used by a number of ZX Spectrum games instead of relying on interrupts, mainly to leave more time in the frame for drawing. However, we have found that the same mechanism can provide stable sync with a precision of a single T state.

This article primarily discusses synchronization with the top of the screen area, which is what is needed in most cases. Algorithms that synchronize to an arbitrary character line appear feasible as well, but are more complicated.

Let's take a look at the rundown of the floating bus fetch cycle pattern over at the Sinclair FAQ Wiki. It can be seen that the pattern isn't random and there is a correlation between screen addresses and their contents, and values returned at particular T states. Therefore if we fill some bitmap/attribute lines with uniquely identifiable sync marker codes, it becomes possible to unambiguously tell where we were at the time of reading. 

The idea behind aline is reinterpreting these sync codes as actual complementary T state delay times from the current raster location to the end of the line. When one of these is read back, execution is delayed for this many T states, therefore resuming at the same constant T state at the end of the line.

In order to be able to do this, we need a way to make sure that we always have at least one bitmap/attribute read off the very first scanline of the screen. This can be accomplished by making the input loop take an odd number of T states that is close to a multiple of 8, while being reasonably fast. In this way, the execution would constantly 'drift' relative to the floating bus pattern and eventually arrive at either the bitmap or the attribute area in the pattern. Before that happens, the idle values of 255 from either the border or empty parts of the pattern that are encountered in the loop are simply ignored. A single loop iteration of 25 T states (8*3 + 1) can repeat 4-5 times over the course of the active screen area on a line (128 T states), which is enough for this purpose.

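    ld e,2                 ;E = lowest value accepted as a sync code
    call .sync
    ...
.sync
    ld hl,.sync_lp         ;HL = loop address for the short jp (hl)
    ld bc,#FFFF            ;BC = port the floating bus is read from
.sync_lp
    in a,(c)               ;12T: read the floating bus
    cp e                   ;4T:  compute A-E, flags only
    ret p                  ;5T:  return if the result is non-negative
    jp (hl)                ;4T:  otherwise repeat (25T per iteration)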
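
The drift argument can be checked off-target with a small model. The Python sketch below is an illustration only and not part of aline. It assumes that the fetch pattern repeats every 8 T states with data in the first four slots and idle reads of 255 in the last four, and it uses a made-up marker encoding (T states left in the 128 T state screen area, plus 1). The names `bus_value`, `accepted` and `sync` are hypothetical.

```python
SCREEN_T = 128   # T states of active screen area per line (from the text)
LOOP_T = 25      # in a,(c)=12 + cp e=4 + ret p (not taken)=5 + jp (hl)=4
IDLE = 255       # value read over the border and in idle pattern slots

def bus_value(t):
    """Assumed top-line bus value at T state t, counted from the line's first fetch."""
    if 0 <= t < SCREEN_T and t % 8 < 4:
        return SCREEN_T - t + 1      # hypothetical marker code, always within 6..129
    return IDLE

def accepted(a, e=2):
    """Mirror of 'cp e / ret p': true when bit 7 of the 8-bit result A-E is clear."""
    return ((a - e) & 0x80) == 0

def sync(start):
    """Poll every LOOP_T T states from 'start'; return the T state where execution resumes."""
    t = start
    while True:
        a = bus_value(t)
        if accepted(a):
            return t + (a - 1)       # delay by the decoded marker value
        t += LOOP_T

# Try every starting offset within the 200 T states before the line begins.
resume_points = {sync(start) for start in range(-200, 0)}
print(resume_points)                 # prints {128}
```

Whatever the starting offset, the model resumes at the same T state. With E=2, the comparison accepts the values 2 to 129 and ignores 0, 1 and 130 to 255, which is why idle reads of 255 and zero-filled bytes fall through.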


Regarding initializing the top line with sync marker values, keep in mind that it is not necessary to fill every attribute and bitmap byte. The only requirement here is making sure that the 'pattern distance' between two marker values does not exceed 4. The rest of the bytes must be set to zero or any other value that would be ignored by the loop.

Notice that within this scheme, it is expected that the aline algorithm would be running exclusively during the lower/top border time, which might not always be the case. Therefore, it would need to be supplemented by another piece of code that would delay execution until the raster is on the lower/top border. The following or similar code is included with all aline versions.

SyncBorder
    ld hl,#4000
    ld (hl),#FF            ;+2A/+3 workaround (see below)
    ld d,4                 ;D = number of consecutive reads
    ld bc,(Aline.LOC_Port)
.lp1
    ld e,d
.lp2
    ld a,(hl)              ;+2A/+3 workaround
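    in a,(c)               ;read the floating bus
    inc a                  ;idle value 255 wraps to 0
    jr nz,.lp1             ;not idle: restart the count
    dec e
    jr nz,.lp2             ;loop until D consecutive idle reads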
    ret
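
The counting logic of SyncBorder can be sketched in Python as follows. This is only a model of the control flow: it ignores instruction timing and the +2A/+3 workaround, and `sync_border` is a hypothetical name that takes a prepared list of port reads instead of polling hardware.

```python
def sync_border(reads, needed=4):
    """Return the index of the read completing 'needed' consecutive idle (255) reads.

    Returns None if the sequence ends before that happens.
    """
    remaining = needed                      # ld e,d
    for i, value in enumerate(reads):
        if ((value + 1) & 0xFF) != 0:       # inc a / jr nz,.lp1
            remaining = needed              # a non-idle read restarts the count
            continue
        remaining -= 1                      # dec e
        if remaining == 0:                  # jr nz,.lp2 falls through to ret
            return i
    return None

print(sync_border([255, 255, 17, 255, 255, 255, 255, 255]))   # prints 6
```

The single non-idle read at index 2 resets the counter, so the routine only returns after four idle reads in a row.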

Finally, due to the nature of this technique, some attribute artifacts might be left visible in the upper scanlines. The standard implementation of aline attempts to mask this for the first two scanlines, visually leaving them black. If the calling program begins filling the screen with full-width multicolor data immediately afterwards, no artifacts would be left displayed on the screen. If the width of the multicolor area is less than that, additional work might be necessary in order to set the edge attributes to zero. The 4T mod variant of aline discussed below does not attempt to mask the attribute artifacts at all, since in that case attributes are used exclusively, with 32 sync codes in total, which does not allow zeroing the INK or the PAPER bits in the codes.

The 'Color' variant of the standard aline implementation allows changing the color of the sync marker area. The sync marker codes are assigned so that the resulting attribute artifacts are kept to a minimum: only the BRIGHT setting of every other attribute cell is affected, over one half of the sync marker area horizontally. If the color is set to black, no artifacts are left visible, as in the standard edition.

Advantages

  • Multicolor-compatible synchronization in a simple manner similar to issuing a HALT
  • Free CPU time can be used efficiently and the restriction on timing-uncompensated branching is lifted
  • A multicolor program can be written like any other with comparatively few special considerations expected of such productions, outside of tuning the display code portion itself
  • As a result, the potential complexity of multicolor software is greatly increased


Disadvantages

  • Requires ZX Spectrum models that implement some form of floating bus functionality
  • The available screen space is reduced by up to 3 scanlines depending on the algorithm
  • Alining to somewhere other than the top of the screen in this manner is more complicated
  • Upper scanline artifacts that may or may not be masked depending on the algorithm

Compatibility

 

Sinclair and Amstrad ZX Spectrum models

So far, aline is confirmed to work as expected under emulation on all Sinclair ZX Spectrum configurations up to and including the +2 (SpecEmu, Spectaculator, Fuse, Spectramine), as well as on the Amstrad ZX Spectrum +2A/+3 configurations (SpecEmu, Spectramine) with some limitations.

Regarding the +2A/+3 models, it was discovered recently that it is possible to access the floating bus functionality on these. Specifically, the fetch pattern was found to be similar to that of the Sinclair models, with a few notable differences:
  • It responds to port addresses with the mask 0000XXXXXXXXXX01b in 128K mode
  • The values returned from the floating bus port always have bit 0 set
  • The value of an attribute preceding an idle portion in the pattern is returned instead of 255
  • The border idle value is changed after a contended memory access
A single iteration of the +2A/+3 fetch pattern over the paper area might therefore look like this (Bitmap, Attribute):

B0 A0 B1 A1  A1 A1 A1 A1

The first two points listed above are accounted for in the standard implementation of aline. The third point, on the other hand, diminishes the usefulness of the fourth slot in the fetch pattern: the same attribute is returned for 5 consecutive T states, effectively reducing the synchronization precision from 1 to 5 T states. It must be noted, however, that this reduced precision is normally sufficient in practice, given the nature of the applications for such a method and factors such as memory contention. A special algorithm to get around this issue also appears likely to emerge at some point.
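
The differences listed above can be summarized in a short Python model. It is a sketch based only on the pattern and port mask given in this article. The function names and the example byte values are hypothetical, and port #0FFD is used simply as one address that matches the mask.

```python
def fetch_iteration(b0, a0, b1, a1, plus3=False):
    """One 8 T state iteration of the fetch pattern over the paper area."""
    if plus3:
        data = [b0 | 1, a0 | 1, b1 | 1, a1 | 1]   # bit 0 is always set on read
        return data + [a1 | 1] * 4                # last attribute instead of 255
    return [b0, a0, b1, a1] + [255] * 4           # Sinclair models: idle slots read 255

def is_plus3_fb_port(port):
    """True if a 16-bit port address matches the mask 0000XXXXXXXXXX01b (128K mode)."""
    return (port & 0xF003) == 0x0001

pattern = fetch_iteration(0x10, 0x20, 0x30, 0x40, plus3=True)
print(pattern.count(0x41))          # prints 5
print(is_plus3_fb_port(0x0FFD))     # prints True
```

The last attribute occupies five consecutive slots on the +2A/+3, so a read of that value cannot be placed more precisely than within 5 T states.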

Another implication here is that there is no default idle value returned during border time on the +2A/+3. What would normally be read in this case is the rightmost attribute of each individual character line. However, if a contended memory address is accessed by the CPU during border time, the idle value returned from the floating bus port is set to the contents of that address. This new idle value remains in effect until the raster passes over the screen area again. The standard implementation of aline accounts for this as well.


Unofficial 'attribute-only' floating bus mods

In theory, a modification of the algorithm would allow for a variant of aline for Spectrum-compatibles with simplified floating bus mods that only return attribute values, one every 4 T states. At most 3 reads are required to obtain 1T precise synchronization. Note that this form of aline isn't readily interchangeable with the algorithm for the original machines, since the latter only requires a single read.

The scheme is as follows. Keep reading the floating bus port until we have a non-zero read, A. Delay execution for 'TStatesPerLine + 2' T states, essentially moving 2 T states further relative to the pattern, and do a second read of the port, B. At this point, there are two branches depending on whether A=B:
  1. If A=B, read A was taken at T state 1 or 2 out of 4. To tell which, a third read C must be made 'TStatesPerLine - 3' T states later, which lands 1 T state before A in the pattern. If C≠A, A was at T state 1/4. If C=A, A was at T state 2/4.
  2. If A≠B, read A was taken at T state 3 or 4 out of 4. To tell which, a third read C must be made 'TStatesPerLine - 1' T states later, which lands 1 T state after A in the pattern. If C=A, A was at T state 3/4. If C≠A, A was at T state 4/4.

A=(+0)
B=(+2), 'T states per line + 2' later

A=B: case 1: 1/4 or 2/4

---CA-B-----
----CA-B----
000011112222
    C=(-1), 'T states per line - 3' later
    C≠A: 1/4, else 2/4

A≠B: case 2: 3/4 or 4/4

------ACB---
-------ACB--
000011112222
    C=(+1), 'T states per line - 1' later
    C=A: 3/4, else 4/4
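
The decision tree can be exercised with a small Python model that follows the offsets shown in the diagrams above: B is read 2 T states after A in the pattern, and C either 1 T state before A (case 1) or 1 T state after A (case 2). In the first diagram, C falls into the previous cell when A is at 1/4. The sketch assumes that every cell holds a distinct non-zero code for 4 T states and that the same attribute row is fetched on each of the lines involved. `attr_at` and `phase_of_first_read` are hypothetical names.

```python
def attr_at(x):
    """Assumed attribute-only bus: the code of cell x // 4 is held for 4 T states."""
    return x // 4 + 1                # hypothetical non-zero code, distinct per cell

def phase_of_first_read(x):
    """Return the T state (1..4) within its cell at which read A was taken.

    x is A's position in the pattern. Each later read happens a whole number
    of lines later, so only its offset relative to A matters in this model.
    """
    a = attr_at(x)
    b = attr_at(x + 2)               # 'TStatesPerLine + 2' after A
    if a == b:                       # case 1: A at 1/4 or 2/4
        c = attr_at(x - 1)           # 'TStatesPerLine - 3' after B
        return 2 if c == a else 1
    c = attr_at(x + 1)               # case 2: 'TStatesPerLine - 1' after B
    return 3 if c == a else 4

print([phase_of_first_read(x) for x in range(4, 12)])   # prints [1, 2, 3, 4, 1, 2, 3, 4]
```

Once the phase is known, the remaining delay to the end of the line can be adjusted by 0 to 3 T states to reach 1T precision.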

As stated before, however, even reduced precision (4T in this case) would often still be sufficient in practice, and it can be achieved with a single read exactly as in the original algorithm. In both cases, only the attributes need to be set up as sync marker values.

An expected downside to both attribute-only aline variants is that it might not be possible to mask the attribute artifacts in the upper scanlines of the screen since both methods are entirely attribute-based.


Resources

The source code for aline is available here. The following files are included:
  • aline.asm: standard implementation
  • aline-color.asm: allows changing the color of the sync area
  • aline-fast.asm: uses unrolled (speed-optimized) setup code
  • aline-attronly-4t.asm: unofficial 'attribute-only' floating bus mod edition (4T precision)
  • test-aline.tap: a simple test for aline using the 8x1 multicolor mode

A demo of the CATS Mint engine by Intense is available for an example of using aline.
Download archive (.TAP)
Watch demonstration


Special thanks

Chernandezba, Ast A. Moore, Your Spec-chum, Woody, Weiv and everyone else involved in the +2A/+3 floating bus testing effort over at the World of Spectrum forums.


Revisions

20171124
  • Added the 'Color' aline variant
  • Expanded the Resources section
  • Added .TAP version of the CATS Mint example
20170922
  • Initial publication