Re: strange 2001N fault

From: Gerrit Heitsch <gerrit_at_laosinh.s.bawue.de>
Date: Tue, 20 Feb 2018 19:28:23 +0100
Message-ID: <d958837c-470d-356a-049a-cb2e386bb3ef@laosinh.s.bawue.de>
On 02/20/2018 07:07 PM, Francesco Messineo wrote:
> On Tue, Feb 20, 2018 at 5:34 PM, Gerrit Heitsch
> <gerrit@laosinh.s.bawue.de> wrote:
>> On 02/20/2018 11:53 AM, Francesco Messineo wrote:
>>>
>>> Hi all,
>>> in case someone doesn't read vcfed forums, my 3032 developed a strange
>>> fault:
>>>
>>> http://www.vcfed.org/forum/showthread.php?62202-3032-(2001N)-strange-fault
>>>
>>> The thread link is just to recap what I've found and what I did, so I
>>> don't have to type all again here ;-)
>>> Ideas welcome, I'm just scratching my head. This is probably where a
>>> logic analyzer is needed badly.
>>> It seems to me there're unintended writes on some ram locations (not
>>> so random, address wise) and it makes me think about VSP crashes on
>>> C64.
>>> Any idea where to look for a fault?
>>
>>
>> Address multiplexer? I think on those boards they used the 74LS153 which
>> contains two 4-to-1 multiplexer. Also check their control circuits.
> 
> yes, 4 x 153, they carry the refresh addresses and row/column addresses.
> All 4 look fine, also control signals as far as I can tell.
> The fault is intermittent, so it's really hard to catch with a dual
> track scope. I think I'll leave this board alone
> for the moment, it's that kind of fault  that really requires a logic
> analyzer I'm afraid.
> As far as I can remember, DRAM can be corrupted during a read
> operation (or even refresh?) if addresses change at the wrong time,
> right?

Yes, that's why I suggested the multiplexers or their control circuits. 
If you have a dual trace scope, you can do a lot to hunt down such a 
bug. First check the multiplexer inputs against /RAS. /RAS should be 
low for a bit before the multiplexer control line switches.

You only need to check /RAS, since that's where the data corruption can 
happen. It can't happen with /CAS.

Next check the multiplexer outputs to the RAMs in relation to /RAS. 
Trigger on /RAS going low.
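
If you can get edge times out of a storage scope or a logic analyzer 
capture, the first check can be automated. What follows is only a rough 
sketch in Python: the function name, the sample numbers and the 25 ns 
limit are made up for illustration, so take the real row address hold 
time from the datasheet of the RAMs on your board.

```python
from bisect import bisect_left

def hold_violations(ras_falls_ns, mux_switches_ns, min_hold_ns):
    """Return (ras_time, hold) for every /RAS falling edge where the next
    multiplexer control switch follows sooner than min_hold_ns."""
    mux = sorted(mux_switches_ns)
    bad = []
    for t_ras in ras_falls_ns:
        i = bisect_left(mux, t_ras)   # first switch at or after this /RAS edge
        if i < len(mux):
            hold = mux[i] - t_ras
            if hold < min_hold_ns:
                bad.append((t_ras, hold))
    return bad

# Made-up capture of three cycles; the second one switches too early.
ras_falls = [0, 1000, 2000]
mux_switches = [60, 1010, 2055]
print(hold_violations(ras_falls, mux_switches, min_hold_ns=25))
# prints [(1000, 10)]
```

With an intermittent fault you would run this over a long capture and 
look at the cycles it flags.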

The reason why a changing address line (or lines) too close to /RAS going 
low can corrupt data even in a read or refresh cycle is how DRAM 
operates. Every memory cell is a small capacitor and a single 
transistor. Every read/refresh access turns on that transistor and puts 
the charge on the capacitor on the input of a read amplifier. This of 
course drains the capacitor. So the circuit inside the RAM has to write 
the data it read back at the end of the cycle. That's all fine as long 
as only 1 transistor per read amp is selected (a 4116 has 128 read amps, 
which one gets to deliver the data to the outside is selected by /CAS). 
But if the address bits change too close to /RAS going low, it can 
happen that you get 2 or more rows selected at the same time. Meaning 
each read amp finds itself connected to 2 or more memory cells. If those 
cells don't have the same data (*) in them, you will get data corruption 
when the data is written back.
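
The mechanism can be shown with a toy model in Python. This is a sketch 
under heavy assumptions, not a description of the real circuit: cells 
hold a charge of 0 or 1, a read amp sees the average charge of all cells 
connected to it, and the `tie` argument stands in for the unpredictable 
outcome when the shared charge sits exactly in the middle. All names are 
invented for the example.

```python
ROWS, COLS = 128, 128   # 4116: 128 rows, one read amp per column

def make_array(fill):
    """Build the cell array; fill(r, c) gives the charge (0 or 1) per cell."""
    return [[fill(r, c) for c in range(COLS)] for r in range(ROWS)]

def ras_cycle(cells, rows, tie=1):
    """One read/refresh cycle with the given rows selected at the same time.
    Each read amp decides on a level from the shared charge in its column
    and writes that level back into every selected cell."""
    for c in range(COLS):
        total = sum(cells[r][c] for r in rows)
        if 2 * total > len(rows):
            level = 1
        elif 2 * total < len(rows):
            level = 0
        else:
            level = tie   # assumption: the real result is not predictable
        for r in rows:
            cells[r][c] = level

# Row 5 fully charged, every other row alternating 0, 1, 0, 1, ...
cells = make_array(lambda r, c: 1 if r == 5 else c % 2)
before = [row[:] for row in cells]

ras_cycle(cells, [5])            # normal cycle, one row selected
print(cells == before)           # True: read and write-back change nothing

ras_cycle(cells, [5, 6])         # glitch: two rows selected together
flipped = sum(cells[6][c] != before[6][c] for c in range(COLS))
print(flipped)                   # 64: row 6 lost every cell that differed
```

With `tie=0` the damage lands in row 5 instead, so which row ends up 
corrupted depends on details this model can't capture.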

(*) Oh... also, an empty capacitor in a memory cell doesn't necessarily 
translate to '0'. If you take a computer that doesn't do a real memory 
test at boot time (like a 264 system) and look at the memory after 
turning it on you will notice patterns that depend on whoever made that 
RAM. So some cells will read '1' when empty. That makes it almost 
impossible to determine what type of corruption to expect in the case 
described above.
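
To illustrate the footnote, here is a small Python sketch of what a 
fully discharged array would read as if some cells store their data 
inverted. The inversion map used here (upper half of the rows inverted) 
is purely hypothetical; the real layout depends on the manufacturer.

```python
def power_on_read(rows, cols, inverted):
    """Data read from an array whose capacitors are all empty.
    inverted(r, c) says whether that cell stores its bit inverted,
    in which case an empty capacitor reads back as '1'."""
    return [[1 if inverted(r, c) else 0 for c in range(cols)]
            for r in range(rows)]

# Hypothetical layout: rows 64..127 store inverted data.
pattern = power_on_read(128, 128, lambda r, c: r >= 64)
print(pattern[0][0], pattern[64][0])   # prints: 0 1
```

Two cells holding the same charge can therefore mean different data, so 
the same glitch can flip bits in either direction.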

  Gerrit


       Message was sent through the cbm-hackers mailing list
Received on 2018-02-20 19:01:34