Re: CBM900 hard disk timeout

From: Michał Pleban <lists_at_michau.name>
Date: Tue, 26 Aug 2014 12:26:17 +0200
Message-ID: <53FC60C9.8020000@michau.name>
Hello!

Michał Pleban wrote:

> That condition basically means that the controller did not respond with
> a status code at all. So it would rather mean timeout in communication
> with the controller, not timeout in reading from disk. If the controller
> wasn't able to communicate with the disk, it would show some SASI-like
> error code from the controller.

Bad news: I damaged the controller :-(

Good news: I am pretty sure now that the controller is the source of all
the problems.

Here's what happened. I reverse-engineered the BIOS startup code, which
performs these steps:

1. Suppress some error messages.
2. Read sector from floppy (command 08).
3. Set hard disk geometry (command 0C).
4. Read sector from hard disk (command 08).
5. Enable error messages.
6. Set hard disk geometry (command 0C).
7. Read sector from hard disk (command 08).

Step 1. disables displaying the "Timeout" message, as well as certain
errors on read sector, namely 0x84, 0x92 and 0x94. These are most
probably SASI error codes OR'ed with 0x80 (for example, 0x04 "drive not
ready", 0x12 "sector not found"). The BIOS initially does not show them,
so that there is no error displayed if the floppy is not inserted.
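
A minimal sketch of that decoding, assuming bit 7 really is a flag OR'ed
onto a SASI code (that is a guess, not something confirmed from the BIOS).
The helper name decode_status is hypothetical, and the table holds only the
two codes named above, so 0x94 comes out as "unknown":

```python
# Hypothetical helper: assumes bit 7 (0x80) is a flag OR'ed onto a SASI code.
# Only the two SASI codes named in this message are listed.
SASI_CODES = {
    0x04: "drive not ready",
    0x12: "sector not found",
}


def decode_status(status):
    """Split a status byte into (flag_set, sasi_code, description)."""
    flag_set = bool(status & 0x80)   # is the 0x80 bit present?
    code = status & 0x7F             # strip the flag to get the bare code
    description = SASI_CODES.get(code, "unknown")
    return flag_set, code, description


# The three codes the BIOS suppresses during step 1.
for status in (0x84, 0x92, 0x94):
    flag_set, code, description = decode_status(status)
    print(f"0x{status:02X} -> 0x{code:02X} ({description})")
```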

Step 3. is important as the BIOS sends some (hardcoded) values to the
controller such as number of heads, cylinders etc. If this step fails
(i.e. the controller does not return code 0x00), the famous
"Controller/drive initialization error" message is shown, just as when
the controller is not inserted.

What is very important here is that this step always succeeds if the
controller is connected. So the controller is alive at this point, to
the extent that it returns a proper success code via DMA.

Then if hard disk #0 is present, we start getting errors 0xFF (timeout)
as early as in step 4. Whereas if the hard disk is not connected, 0xFF
appears first in step 6., which means that step 4 executes properly too
(possibly returning error 0x84).
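
To make the comparison easier to follow, here is a small sketch that
records the startup steps as data and finds the first step that times out.
The names STARTUP_STEPS and first_timeout_step are hypothetical. The status
values are the ones described above, with 0x84 in step 4 of the no-disk case
being only my guess, and steps I have no result for are left out:

```python
# Startup steps as numbered in this message: (step, action, command byte).
# Steps 1 and 5 only toggle error reporting, so they carry no command.
STARTUP_STEPS = [
    (1, "suppress some error messages", None),
    (2, "read sector from floppy", 0x08),
    (3, "set hard disk geometry", 0x0C),
    (4, "read sector from hard disk", 0x08),
    (5, "enable error messages", None),
    (6, "set hard disk geometry", 0x0C),
    (7, "read sector from hard disk", 0x08),
]

TIMEOUT = 0xFF  # status reported when the controller does not answer


def first_timeout_step(results):
    """Return the number of the first step whose status is 0xFF, or None."""
    for step, _action, _command in STARTUP_STEPS:
        if results.get(step) == TIMEOUT:
            return step
    return None


# Observed statuses per step; unreported steps are simply omitted.
with_disk = {3: 0x00, 4: TIMEOUT}
without_disk = {3: 0x00, 4: 0x84, 6: TIMEOUT}  # 0x84 in step 4 is a guess

print("disk attached:  first timeout in step", first_timeout_step(with_disk))
print("disk detached:  first timeout in step", first_timeout_step(without_disk))
```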

So I concluded there must be some problems with the controller, which
make it hang and stop responding to all further commands. So my idea was
to remove the controller, and attach it only after all the above
procedure is performed. There would be no commands issued to it, so
perhaps it would not be hanging.

I tried this once, but no success (0xFF all along). So I removed the
data cable from the disk, leaving only the control cable. I attached the
controller only after the BIOS monitor was displayed, and tried the "park
drive" option. Voila! The controller properly parked the drive, moving
the head visibly. Another park command failed with 0xFF again,
indicating that the controller hung. I tried several times again, with
plugging and unplugging the cables in various combinations, but the park
command never worked again.

Unfortunately, at some point during this plugging and unplugging,
something broke in the controller and now it responds with error 0x75 in
step 4. It is less than 0x80 so I assume this is some internal
diagnostic error code, probably meaning that some chip is broken. This
happens regardless of whether the disk is attached or not. Then it
dutifully hangs in step 6 as before.

I suspect that the hanging of the controller firmware may be caused by a
bad 8749 MCU, either because of bit rot in the EPROM contents or some
other problem. I ordered some 8749's from Ebay and found a good soul
with MCS-48 programmer who will program them for me when they arrive.

As for the 0x75 error I introduced, I will try to replace the WD ASIC
chips on the controller, and hopefully it will be fixed. I suspect that
the WD2010 chip may have failed. Hopefully it's not something more serious,
like a faulty DMA chip, which would be unsolvable :-( I am expecting the
chip from Ebay to arrive in about 2 weeks and I will be able to perform
more diagnostics then.

Regards,
Michau.





       Message was sent through the cbm-hackers mailing list
Received on 2014-08-26 11:00:02
