Date: 1998-04-28 18:55:41

Hello all,

Marko, I received your VC10. I'm working on it.

Andre, did you receive the SRCs of the PRGs for the 8x96?

Here is another idea of mine: combining the 65SC816 and the 74LS612 in a
PET. So this doc also contains a lot of tech stuff. I also produced a GIF
containing the SCH, but because of the problems we had when I sent the
contents of the ROMs of my German and US CBMs, I only sent it as a personal
file to Andre, Marko and Frank. So if you're interested as well, please
notify me (or better, ask one of the others, as I will be 'off the air'
for at least two days :-( ).

One general question remains: what happens if the processor of a PET or
1541 runs out of phase with the onboard clock, or even at a (slightly)
different speed? In the case of the PET I am thinking of the video, and in
the case of the 1541 of the I/O for reading from/writing to the disk.



What is it?

BIG-PET is a project that enables you to expand your PET/CBM with the 65816
and a lot of RAM, ROM and I/O. Up to 16 MB if you want to! :-)
Possible uses:
-     Emulation of new ROMs
-     Testing new hardware
-     Attaching a complete PC board so you can use PC cards as well
Product: a card with hardware.

The idea.

Part one:

The 65816 has been available since 1985 (?). This is an upgrade of the 6502
with internal 16-bit registers, capable of addressing up to 16 MB. The 20
MHz version is used in the SUPER-CPU from CMD, a module to be attached to
your C64 or C128. AFAIK the 65816 is also used in the SNES from Nintendo
and the IIGS from Apple. It would be nice if we could use its power for the
C64/128 and their older brothers as well.

The 65816 also has a little brother, the 65802. This CPU is internally a
16-bitter and is pin compatible with the 6502. But due to this pin
compatibility it lacks certain features the 65816 has. The most important
missing feature is the ability to address up to 16 MB of RAM, ROM or I/O.

The pinouts

      65816                                                   65816
       /VP       GND   -+  1               40 +-   /RESET
                        |                     |
                /RDY   -+  2               39 +-   CLK2        VDA
                        |                     |
     /ABORT     CLK1   -+  3               38 +-   SO          M/X
                        |                     |
                /IRQ   -+  4               37 +-   CLK0
                        |                     |
       /ML       NC    -+  5               36 +-   NC          /BE
                        |                     |
                /NMI   -+  6               35 +-   NC          E
                        |                     |
       VPA      SYNC   -+  7               34 +-   R/W
                        |                     |
                 +5V   -+  8               33 +-   D0
                        |                     |
                 A0    -+  9               32 +-   D1
                        |                     |
                 A1    -+ 10               31 +-   D2
                        |        6502         |
                 A2    -+ 11     65C02     30 +-   D3
                        |        65SC02       |
                 A3    -+ 12               29 +-   D4
                        |                     |
                 A4    -+ 13               28 +-   D5
                        |                     |
                 A5    -+ 14               27 +-   D6
                        |                     |
                 A6    -+ 15               26 +-   D7
                        |                     |
                 A7    -+ 16               25 +-   A15
                        |                     |
                 A8    -+ 17               24 +-   A14
                        |                     |
                 A9    -+ 18               23 +-   A13
                        |                     |
                 A10   -+ 19               22 +-   A12
                        |                     |
                 A11   -+ 20               21 +-   GND
                        |                     |

A0..A15   =  Address bus
CLK0      =  Input, clock for the processor, also known as PHI0
CLK1      =  Output, clock of the processor, also known as PHI1.
                  Is inverted CLK0, about 3 ns delayed
CLK2      =  Output, clock of the processor, also known as PHI2.
                  Is CLK0, about 6 ns delayed
D0..D7    =  Data bus, address bus A16..A23 (65816 only)
IRQ       =  Input, maskable interrupt
NC        =  Not connected
NMI       =  Input, non-maskable interrupt
R/W       =  Read/Write line
RDY       =  Input, ReaDY, causes the CPU to wait until released. Does NOT
                  work during a write cycle on the original 6502. Does
                  work on the 65SC816, 65SC802 and 65SC02 of GTE and CMD.
RES       =  Reset line
SO        =  Input, signal to set the Overflow flag of the status register
SYNC      =  Output, signals the fetch of an opcode, active (H)


ABORT     =  Input, prevents modification of the internal registers,
                  causes the 658xx to call the vector at $00FFF8/9
BE        =  Input, Bus Enable, puts the 65816 in tristate
E         =  Output, reflects the state of the Emulation bit
M/X       =  Output, reflects whether the registers are 8 or 16 bits wide
ML        =  Output, Memory Lock, signals to the outside world that a
                  read-modify-write instruction is being executed
VP        =  Output, Vector Pull

VDA       =  Output, Valid Data Address
VPA       =  Output, Valid Program Address

As you can see, the 65816 is not completely pin compatible, but the
incompatibility is minor. The most important difference is the lack of CLK1
and CLK2. After some experiments I found out that I had to generate these
signals myself, using two 74LS14 gates. I modified a card produced by
Elektuur/Elektor in such a way that I only had to replace the original 6502
of any system with this card to let it run on a 65816. This setup worked
fine for the Acorn Atom, CBM8032SK, PET8032 and PET3032 but NOT for the VIC20.
Until now I haven't figured out the reason for this behaviour. (Delay is
too long so the VIC-chip messes up the bus at the end of the generated

Part two:

Andre Fachat had the ingenious idea of using a 74LS610 Memory Management
Unit in his CS/A65, enabling him to expand his system up to 1 MB.
This does not mean that the processor is suddenly capable of addressing
this 1 MB. The 1 MB is divided into 256 blocks of 4 KB, of which only 16 are
accessible by the 6502 at any one time. For more details on how the 74LS610
functions, see my document about its brother, the 74LS612.

The special feature I am interested in is the ability to shuffle
those 4 KB blocks in any order you want. In C64 terms: you are able to
move the I/O area from $D000/$DFFF to $6000/$6FFF. I want to use this
feature in BIG-PET as well because it enables you to use your CBM or PET as
a big 6502 emulator for other systems.
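
To make the block shuffling concrete, here is a minimal sketch in Python of how a 610/612-style mapper turns a 16-bit CPU address into a physical address. The names map_regs and translate are made up for this illustration, and the identity mapping at start-up is an assumption, not something taken from the CS/A65 or BIG-PET.

```python
# Sketch of 4 KB block mapping in the style of a 74LS610/612.
# map_regs and translate are illustrative names, not part of any real design.

BLOCK_BITS = 12                  # 4 KB blocks -> 12 offset bits
BLOCK_SIZE = 1 << BLOCK_BITS

# 16 map registers, one per 4 KB block of the CPU's 64 KB view.
# Assumed starting point: identity mapping (CPU block n -> physical block n).
map_regs = list(range(16))

def translate(cpu_addr, regs):
    """Return the physical address for a 16-bit CPU address."""
    block = (cpu_addr >> BLOCK_BITS) & 0x0F     # A12..A15 select a register
    offset = cpu_addr & (BLOCK_SIZE - 1)        # A0..A11 pass through unchanged
    return (regs[block] << BLOCK_BITS) | offset

# Make the block normally found at $D000 appear at $6000 instead.
map_regs[0x6] = 0xD

print(hex(translate(0x6012, map_regs)))   # 0xd012
print(hex(translate(0x1234, map_regs)))   # 0x1234
```

With 8-bit map registers this reaches 256 blocks (1 MB); with the full 12 bits of the 612 it reaches 4096 blocks (16 MB).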

Part three:

The next trick is to combine the 65816's capability of addressing 16
MB with the shuffle feature of the 612. Why not use only the 612 to
address the complete range? Using the 65816's own capability to address
16 MB enables us to read from/write to areas not covered by the MMU at that
moment. This also enables us to mirror the MMU in another 64 KB segment,
meaning we can let it disappear from the original first 64 KB (which could be
needed when emulating another system) while we would still be able to
reprogram it when needed.

Part four:

The 65816 is available as fast as 12 MHz (20 MHz?) and it would be a waste
not to use this speed. As I already mentioned, BIG-PET is going to be used
in a CBM or PET, and these run at 1 MHz as standard. The onboard chipset
does not allow us to run it at a higher clock speed. But that does not have
to stop us from using faster chips in other 64 KB segments, so that those
can run at higher clock speeds. I obtained a handful of 45 ns 32*8 KB
SRAMs, former cache RAMs of obsolete 80386 motherboards. These can be
accessed at 4, and even 8 MHz. It should be possible to copy the contents
of the ROMs and the RAM in area $0000/$7FFF to this fast RAM and then use
this instead of the original ROM/RAM. (Like shadow RAM on PCs.)


Addressing 16 MB by the 65816 AND the 74LS612 MMU:

The address lines A16..23 of the 65816 are multiplexed with the data bus
and are not available in the normal way like the other address lines.
The normal procedure to generate these lines out of the data bus is to latch
them using a 74F573. CLK1 can be used to perform this latch. CLK1 is
generated from CLK0 using a 74F04 (U3a).
The MMU has its own lines to address all of the 16 MB. As we cannot have two
devices driving the address bus at the same time, we have to choose
between the 65816 and the MMU. My idea is that the MMU only drives the
address bus when the first segment, $000000/$00FFFF, is selected. This can
be achieved by 'ORing' A16..23. For this I use a 540 (U1) to invert all of
A16..23 and a 133 (U8) to AND the inverted signals. As the 133 actually is
a NAND gate, we first have to invert its result with an inverter (U3b). The
result is used to select either the 541 (U7) buffering the 573 (U6), or the
MMU (U4). (We cannot tristate the 573 itself because we would lose the
information causing the tristate.) Because the MMU also takes care of
A12..15, we have to bypass the MMU for these lines by means of another 541.
Because we want to be able to disable the whole configuration, we take
advantage of the fact that the 541s have two enable lines. One of them we
use for the above-mentioned selection, the other for disabling the whole
bus. The MMU lacks this feature, so we have to use an OR gate (U5a) to
combine the 'disable' signals. The signal to disable the whole bus
originates from a 04 (U3e). Its input originates from /BE of the 65816
(which is an input).
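
The segment-0 detection described above can be checked with a small Python sketch. It models only the logic levels of the 540, 133 and 04 stages. The function names are invented for this example, the unused inputs of the 13-input 133 are assumed to be tied high, and the real enable polarities of the 541 and the MMU are not modelled.

```python
# Logic-level sketch of the segment-0 detector (U1, U8, U3b).
# All names are illustrative; a result of 1 means "segment $00 selected".

def inv(bit):
    """One inverter stage (540 or 04)."""
    return bit ^ 1

def nand(bits):
    """NAND gate (133): output is low only when every input is high."""
    return 0 if all(bits) else 1

def segment_zero(bank):
    """Return 1 when A16..A23 are all low, i.e. the MMU should drive the bus."""
    bank_bits = [(bank >> i) & 1 for i in range(8)]   # A16..A23
    inverted = [inv(b) for b in bank_bits]            # 540 (U1)
    nand_out = nand(inverted)                         # 133 (U8)
    return inv(nand_out)                              # 04 (U3b)

def address_source(bank):
    """Which device supplies the upper address lines for this bank."""
    return "MMU (U4)" if segment_zero(bank) else "541 (U7)"

print(address_source(0x00))   # MMU (U4)
print(address_source(0x01))   # 541 (U7)
```

Inverting all eight lines and NANDing them is the same as ORing the original lines, which is why the text speaks of 'ORing' A16..23.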

Addressing the MMU:

To be able to program the behaviour of the MMU, it has to be fitted somewhere
into the memory map of the 65816. As mentioned before, it will in any case
be mapped into a segment other than the $00 segment. In this way we are
always able to (re)program it using the 65816's extra capabilities. The
CBM/PET has 2*4 KB of unmapped memory meant for adding extra ROMs with
additional software. One of these areas can (partly) be used to map the MMU.
Should we need this specific area for testing a new ROM, then we physically
map this ROM in another segment and remap it into the needed 4 KB area by
means of the MMU. From that moment on, reprogramming the MMU is only
possible by addressing it through the other segment.
In the future I also want to make a C64 version. The C64 has only 512 bytes
of free space. Mapping the MMU into this area has the disadvantage that we
won't be able to attach regular cartridges using this area. (Unless I find
a way to remap only 512 bytes.) The advantage of mapping the MMU into the
existing memory map is that you can program it in BASIC. (BASIC is not able
to address beyond the first segment.)

I decided to map the MMU in segments 0 and 1. This is done by using a
74LS677 (U12), a 16-bit comparator. This IC checks if A17..23, A7, A9..10,
and A12..15 are (L), and if A8 and A11 are (H). Notice that I did not
mention A16. By discarding this line I created the needed mirror in segment
1. The output of the 677 becomes active (L) in the range $9000/$9FFF and is
used to enable a 138 (U9), which in its turn enables the MMU.
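
The effect of leaving A16 out of the comparison can be shown with a short Python sketch. It models only the resulting address window as stated above ($9000/$9FFF, visible in segments 0 and 1), not the individual comparator inputs or the further decoding done by the 138. MMU_BASE, MMU_MASK and mmu_window are names invented for this example.

```python
# Sketch of the address window in which the MMU is selected.
# Only the window stated in the text is modelled; names are illustrative.

MMU_BASE = 0x009000
# Compare A12..A15 and A17..A23; ignore A16 (the mirror) and A0..A11.
MMU_MASK = 0xFEF000

def mmu_window(addr):
    """True when a 24-bit address falls inside the MMU window."""
    return (addr & MMU_MASK) == MMU_BASE

for addr in (0x009000, 0x019FFF, 0x029000, 0x00A000):
    print(f"${addr:06X} -> {mmu_window(addr)}")
```

Because bit 16 is masked out, $009000 and $019000 select the same hardware, so the MMU stays reachable through segment 1 even when the $9000 block of segment 0 has been remapped.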

Programming the MMU:

If we want to use the complete ability of the MMU to address 16 MB, then
we must be able to program all 12 bits of all the registers of the MMU.
As the 65816 is externally only an 8-bitter, we have to create the
additional 4 bits ourselves. My solution is to use either a 6522 or a 6526
to deliver these 4 bits. It looks a bit like using a riot gun to kill a
mouse, but the 6522/6 will be used for more tasks.
Using an external latch has one disadvantage: we won't be able to read the
contents of those four bits without extra hardware. The extra hardware
needed in this case is a 04 and a 573. The 573 latches the data written
to/read from the extra four bits (D'0..D'3). The data is latched whenever
the /CS input of the MMU is enabled. The outputs of the 573 are connected
to the same four I/O lines of the 6522/6 as well. Another line, D'5, of the
6522/6 takes care of enabling the output of the 573 towards the 6522/6.
We must take care of the following:
-     The moment we want to output the data of the 573 towards the 6522/6,
      the I/O lines must be programmed as inputs.
-     The moment we want the 65816 to read data from the MMU, the I/O lines
      and the 573 must be disabled.
-     The moment we want the 65816 to write to the MMU, the 573 must
      be disabled. The I/O lines must have been programmed as outputs and
      have been filled with valid data.
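
As a rough illustration of the 8 + 4 bit split, the Python sketch below breaks a 12-bit map value into the byte the 65816 writes and the nibble delivered by the 6522/6. The text does not say whether D'0..D'3 carry the upper or the lower four bits, so the assumption here is that they carry the upper four. All function names are hypothetical.

```python
# Sketch of programming one 12-bit MMU register from an 8-bit CPU.
# Assumption: port lines D'0..D'3 hold the UPPER four bits of the map value.

def split_map_value(value):
    """Split a 12-bit map value into (byte for D0..D7, nibble for D'0..D'3)."""
    if not 0 <= value <= 0xFFF:
        raise ValueError("map value must fit in 12 bits")
    low8 = value & 0xFF      # written by the 65816 over the data bus
    high4 = value >> 8       # set up beforehand on the 6522/6 port
    return low8, high4

def join_map_value(low8, high4):
    """Rebuild the 12-bit value, as when reading back via the 573."""
    return ((high4 & 0x0F) << 8) | (low8 & 0xFF)

def physical_address(map_value, offset):
    """Physical address of a byte in the 4 KB block selected by map_value."""
    return (map_value << 12) | (offset & 0xFFF)

low8, high4 = split_map_value(0xABC)
print(hex(low8), hex(high4))                  # 0xbc 0xa
print(hex(physical_address(0xFFF, 0xFFF)))    # 0xffffff, top of the 16 MB
```

The write order follows the last bullet above: the nibble must already be valid on the port before the 65816 writes the byte to the MMU.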

One line, D'4, we'll use to put the MMU in map-mode or not. We have to take
the MMU out of the map-mode the moment we want to (re)program it. After a
RESET all the I/O lines have been switched to input and a resistor takes
care of pulling the MM-input (H), disabling the map-mode in this way.

The extra port of the 6522/6

Until now we only needed 6 of the 16 available I/O lines. As I said, I want
to develop a "Big-PET" version for the C64/128 as well. In that case we
need something to emulate the onboard port of the 6510/8502, and by 'pure
coincidence' we have a complete port available for this purpose. Keep in
mind that only bits 0..5/6 are used on the 6510/8502, but we cannot use bit
7 for other purposes as we have no idea how existing software treats this
bit.

Ideas for using the Big-PET

1) Testing new ROMs.

We map RAM to the area normally used for ROM, after having filled it with
our own program. One remark regarding the C64/128: remapping any of the
ROM areas also remaps the underlying RAM. This means that SW writing to
this RAM will destroy the existing data in the remapped RAM!

2) Emulating other 6502-systems

We replace the original 6502 of the system to be emulated, for example a
1541 disk drive, with an interface which is nothing more than a 40-pin
connector, a piece of flat cable and some buffers on a card. The main idea
is to let the 65816 do the job the original 6502 normally did. But instead
of running the PRG in the external system's ROM, we run our own PRG in RAM
remapped for this purpose.

One problem may occur, and that is which clock we should use: the one of the
PET or the one of the external system? I think that is system dependent, and
I have no idea what will happen if the 65816 runs out of phase with the
onboard 16 MHz (which is used to drive the video) or out of phase with the
onboard clock of the external system.

3) Using PC cards

One idea is to attach a complete PC board to our Big-PET. In this case I'm
thinking of XT boards fitted with the 8088 or NEC V20, i.e. the 'external' 8
bitters. Connecting boards normally fitted with an 80x86 or V30 implies we
have to find a mechanism to read from/write to the 16/32-bit data bus.
A problem is that the 8088 also uses a multiplexed bus, but in its case the
data bus and the address lines A0..7 are multiplexed. There are two
solutions for this problem:
-     An XT uses several ICs to generate all the signals we know of:
      8284, 8288, 74LS245, 74LS573. Take these ICs off the board and
      supply our own signals where needed. Problem: this only works with
      boards not fitted with those big square custom ICs which are meant to
      replace the above-mentioned ICs.
-     We mix the address and data lines on the interface card and generate
      all other signals to emulate an 8088.
Whatever solution we use, we still have (at least) one problem to solve.
The XT is capable of addressing 1 MB of memory but also capable of
addressing $400 bytes of I/O. My solution is either to use an extra segment
for addressing this I/O or to use a part of the fast I/O segment mentioned
below for this purpose as well.

One remark: I do have the complete disassembly listing of the original IBM
XT-ROM. In this way we only have to translate the 8086-code to 6502-code to
program the onboard ICs.

Higher clockspeeds:

I have some 5 MHz 65SC816s, so I would be interested in running them at
least at 4 MHz, as this frequency is available in all CBMs and PETs. The main
idea is to extend the positive half of the clock towards the CPU as long
as needed:

        -+   +---+   +---+   +---+   +---+   +---+   +---+   +---+   +--
 4 MHz.                                                 
         +---+   +---+   +---+   +---+   +---+   +---+   +---+   +---+  

        -+   +---------------------------+   +--------------------------
 1 MHz.                                                             
 65816   +---+                           +---+                          

        -+               +---------------+               +--------------
 1 MHz.                                                          
 system  +---------------+               +---------------+ 


The 'crippled' 1 MHz signal for the 65816 can be generated by ORing the 4,
2 and 1 MHz signals. The actual ORing must become active as soon as the
hardware has detected the right area at point (A).
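
The ORing can be verified with a few lines of Python. The sketch samples the three clocks once per 125 ns (half a 4 MHz period) and assumes, as in the diagram, that all three are phase aligned and start their period low. The function names are invented for this example.

```python
# Sketch: build the stretched 65816 clock by ORing the 4, 2 and 1 MHz clocks.
# One time step = 125 ns. Assumption: all clocks are aligned and start low.

def square_wave(period_steps, t):
    """Level (0 or 1) of a square wave that is low for the first half period."""
    return 1 if (t % period_steps) >= period_steps // 2 else 0

def stretched_clock(t):
    """OR of the 4 MHz (2 steps), 2 MHz (4 steps) and 1 MHz (8 steps) clocks."""
    return square_wave(2, t) | square_wave(4, t) | square_wave(8, t)

samples = [stretched_clock(t) for t in range(16)]
print(samples)   # [0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
```

The result is low for only one step in eight, so the CPU sees the short low phase of a 4 MHz clock followed by a high phase stretched to the end of the 1 MHz system cycle.
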
As mentioned before, the normal procedure to generate the address lines
A16..23 out of the data bus is to latch them using a 74F573. But the state
of these address lines is already stable 'long' before the rising edge of
CLK0, at least long enough for the MMU to have stabilised its outputs at
this point as well. In case this does not work out well, the system should
at least run at the 'crippled' 2 MHz.

What part of the 16 MB range should run at 1 MHz and which part not?
With the exception of the very first segment, the choice is up to you. I'm
thinking of using a 74LS138 as follows: the first physical 2 MB run at 1
MHz, the rest at full speed.
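
A possible reading of the 138 split, sketched in Python: since 16 MB divided over the eight outputs of a 138 gives 2 MB per output, the sketch assumes the decoder's select inputs are wired to A21..A23 and that output 0 marks the slow area. That wiring is an assumption made for this example and is not stated above. The function names are hypothetical.

```python
# Sketch of the slow/fast area split with a 3-to-8 decoder.
# Assumption: the 138 select inputs are A21..A23, so each output covers 2 MB.

def decoder_output(addr):
    """Index (0..7) of the decoder output that is active for this address."""
    return (addr >> 21) & 0x7

def runs_at_1mhz(addr):
    """True for the first physical 2 MB, which stays at system speed."""
    return decoder_output(addr) == 0

for addr in (0x000000, 0x1FFFFF, 0x200000, 0xFFFFFF):
    speed = "1 MHz" if runs_at_1mhz(addr) else "full speed"
    print(f"${addr:06X}: {speed}")
```

Segment $00 always falls in the slow area with this split, which matches the requirement that the very first segment runs at 1 MHz.
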
At least one segment of the 'slow' and one of the 'fast' area I will
reserve purely for I/O purposes. The fast I/O segment can be used for ICs
like the 16550, which are able to run at 4 MHz or higher. At least one
segment of the slow area will be reserved for emulation purposes. (see
At least one segment of the 'slow' and one of the 'fast' area I will fill
only with RAM. Why this differentiation? Using only fast RAM as shadow RAM
can have its drawbacks, because there are programs using program loops for
timing purposes and these loops depend on the internal clock. So running
the program in fast RAM could have unforeseen effects.

Remark regarding attaching an XT board: in an XT the I/O and ROMs normally are
addressed at a lower speed than the RAMs. But we don't have to worry about
that because the onboard logic takes care of this by slowing down
the CPU. We only have to make sure that we connect the
'slow down' signal to the 65816 as well.


Regards, Ruud
