From: Scott McDonnell (NetSamurai_at_comcast.net)
Date: 2005-08-16 04:04:01

```
----- Original Message -----
From: "Laze Ristoski" <cybernator@mt.net.mk>
To: <cbm-hackers@ling.gu.se>
Sent: Monday, August 15, 2005 8:25 PM

> > First of all, the
> > amplitude of the actual data far exceeds the noise.  The output of the
> > tape head is run through a filter which helps reduce out-of range freqs.
>
> Uhm... I'd have to learn how filters really work. Till then, case
> dismissed.

There are many tutorials on them. Simple if you know basic electronics.

>
> > Usually a 0 is twice the frequency of a 1.
>
> It really depends on the format but yes, usually a 0 is twice the
> frequency of a 1.

Usually, yes. Most often, yes. Always, no.

>
> > Since only the transitions matter, frequency is being measured, not
> > pulse width.
>
> Well, Turbo 250 does it this way: the timer is in one-shot mode and is
> restarted after each _received_ click (every second click, that is).
> Then it's checked whether the timer reached 0 meanwhile. If it did, then
> the pulse was longer than 263 cycles, otherwise it was shorter than 263
> cycles. So I'm measuring the pulse width here, not the frequency, right?
> Although the frequency depends on the pulse width, it's kinda more
> difficult to measure.

Ok here's the misunderstanding. We are saying the same thing two different
ways. Frequency depends on a period. A period is the length of time it takes
for a voltage level to 'revolve' back to its starting point:

 ___
|   |__|
^      ^ Period

The above (crappy) ASCII art shows a full period. How many times a period
is repeated in a second gives the frequency. The example uses a 50% pulse
width. This means the crest and trough of the wave are equal in length of
time. The frequency can remain the same while the pulse width is anything
from 1% to 99%. This is simply the share of the period that the wave is
sustained at its peak.

Using frequency by itself in magnetic media would be very messy, since the
physical media itself can change. Any encoding scheme that doesn't take
this into account is sacrificing data integrity for speed. This can
sometimes be made up for by using checksums, error correction bits, etc.

Let's suppose we have an edge-triggered timer (which is exactly what we do
have.) If it is negative edge triggered, then the timer does nothing until
there is a high-to-low transition:
 ___
|   |__|
    ^ Triggered here

So the timer triggers and starts counting. It is reset by the next
high-to-low transition. This is not measuring pulse width, but frequency,
since as I stated before, a period is the time it takes to 'revolve' back
to the starting point (the high-to-low transition.) By taking the counter
accumulator and using the MCU's clock cycle (or timer clock cycle) you can
determine how long the period lasted in seconds. The reciprocal of that
period is the frequency. This process is nothing magical, in fact it is a
standard way frequency is measured digitally. To digitize something means
to quantize one of its properties, in this case time.

Using the above method, you should be able to see how frequency was confused
with pulse width. The zeros are, in fact, twice the frequency of the ones.
Actually looking at them, though, they appear to vary only in pulse width.

The advantage to having the sync bytes (the zeros at the beginning of data)
is that the computer can establish one of the frequencies to compare it to
the other. In this way, if the tape speed has changed for whatever reason,
the data integrity is intact. What may be different between methods is how
often these sync "characters" occur. They may only appear in the very
beginning of a tape as a lead-in, though this would be somewhat risky since
the load on the motor will change as the tape spools shift in weight (from
filling up.) These sync characters should show up quite frequently.

>
> > A byte is usually padded with zeros, to get the clock to
> > synchronize.
>
> Not necessarily, Turbo 250 doesn't do that. The format is explained below.

I am a hardware guy, not much of a software guy. I would, however, be very
wary of using a format that did not use the above safeguards to keep my data
intact. At least if that data is considered important to me. And as I said,
it may not need to happen with every byte (though that would offer the most
integrity, it would slow things down.) There should be sync characters
somewhere (it might be entirely handled by the datasette itself, so you
would see nothing in the code to expose it.)

>
> > This is also why the bits are repeated twice.
>
> I still think this is because every second pulse remains unnoticed by the
> CPU.

Well, although I probably explained the above quite poorly, I submit that
you are seeing the bits twice to get the high-to-low transitions. Which
keeps things in sync.

>
> > Check here for a more detailed discussion of magnetic data storage
> > (meant for cards, but as I said, the process is very very similar.)
> > http://www.phrack.org/phrack/37/P37-06
>
> Is this _our_ Count Zero? :)

Don't know...but most likely.

>
> Ok, the Turbo 250 format. It starts with a bunch of bytes with value 2
> (100+), which are used to synchronize (byte-align). Then follows a count
> down: 9,8,7,6,5,4,3,2,1, which is used to check if synchronization was
> achieved. And then goes the data itself. (note that I didn't mention
> header blocks, checksums, etc. since they are irrelevant to this topic).
> Every byte is saved as a stream of SS and LL pairs representing the bits.
> There's nothing for control in-between bits/bytes.

I see. Even so, it is still using sync characters. I oversimplified
things when I said padded with zeros (actually, I was stuck in the context
of mag stripe cards.) GCR, I think, works like your example above.

>
> Now the loader reads bit-by-bit and shifts these bits. After every bit,
> it's checked whether the value of the byte is 2. If not, repeat. Once it
> detects value 2, it starts reading byte-by-byte (nothing is checked until
> a group of 8 bits is read). If this value is 2, it reads again, and again
> until something different than 2 is read. (this is done in order to skip
> the synchronization part). Now bytes are read and they are compared
> against the countdown. If the countdown sequence is correct, that means
> the synchronization was successful, and the loader continues to read data
> (again, no control pulses in-between). Otherwise, the loader goes into
> bit-by-bit mode again, and the process is repeated.
>
>
> Apparently, every extra (noise) click would cause a load error.
> Say I've got a byte with value 255 recorded.

Well, there are two things going on here. First is the physical access. That
is the strength of the signal vs. noise (SNR). The amplitude of the signal
is much larger than that of the noise:

               _____
              |     |
_|_|__|_|__|__|     |____
 ^Noise        ^Bit

An edge detector has a HI and LOW logic level. If the pulse amplitude
doesn't fall into one of those ranges (certainly not the high) then it is
ignored. There is purposefully a gap between hi and low logic levels.
Anyway. The noise is filtered, which won't REMOVE it, but will lower its
amplitude even more than it was.

This is then amplified, which means the entire head signal is made larger,
thus pushing the higher-amplitude bit to logical levels. This is also done
to clip the top off the pulse, since noise can ride on this and possibly
trigger things. Usually a shaping circuit is also involved to make the
pulses a little more square than they were. Because of the motion of the
tape, it would be very difficult to have a square wave recorded directly.

The second thing going on is the media access layer which is described above
with the "twice-the-frequency" stuff. The hardware and this media layer
combined should do a pretty good job of keeping noise from causing trouble.
If your example results in errors, it is because it is not syncing often
enough. Though it should never happen since only the high-to-low
transitions are actually counted.

>
> 163,163 163,163 163,163 163,163 163,163 163,163 163,163 163,163 (lots of
> LL's :))
>
> Now assume an extra click appears between the first pair. Say the pulse
> width is 80. We end up with: 163,80 83,163 etc...
>
> 163+80=243, which is below 263 and is interpreted as a 0. Then
> 83+163=246, another 0. Apart from reading some wrong bits, this extra
> click will cause misalignment later. So any kind of noise must be
> COMPLETELY eliminated. Nothing in the encoding really does anything to
> compensate for the noise.

If you can find a way to COMPLETELY eliminate noise, you will be labelled a
genius and have no shortage of offers of employment. It is simply
impossible. Thus, we engineering geeks rely on something called SNR, which
is Signal-to-Noise Ratio. This compares the actual signal amplitude to the
noise amplitude. It is kept high on purpose. In tape players one of the
more accepted ways is to bias the signal. This is done by purposefully
ADDING noise into the signal and then extracting that noise later.

>
> As for the first issue, I'm still unsure. Maybe recording the signal from
> the tape (as a WAV or something), and having a look at it could reveal
> some clue.

Your chances of gaining any scientific understanding of what is going on are
next to nil with this method. Digitizing the tape into a WAV adds
quantization errors which will end up looking like noise. A spectrum
analyzer or oscilloscope would be the only useful way to analyze this.

But again, I understand why you think it makes sense, but it would take some
very serious noise (or very crappy tape and/or player) to disturb the
digital signal. Digital pulses often are meant to use the maximum amplitude
that can be recorded without major distortion.

>
> I'll give it a shot.
>
> Regards.
>
> --
> Laze

Scott

Message was sent through the cbm-hackers mailing list
```
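
The arithmetic discussed in the thread can be sketched in a few lines of Python. This is only an illustration, not the Turbo 250 loader: the function names are made up, the 985248 Hz clock is an assumed PAL C64 system clock that the thread never states, and the decoder simply sums each pair of pulses and compares the sum to the 263-cycle threshold, which mirrors Laze's worked example rather than the real one-shot timer code.

```python
# Illustrative sketch only. Names and the clock value are assumptions,
# not taken from the Turbo 250 loader or from the thread.
ASSUMED_CLOCK_HZ = 985_248   # assumed PAL C64 system clock
THRESHOLD_CYCLES = 263       # threshold quoted in the thread


def period_to_frequency(period_cycles, clock_hz=ASSUMED_CLOCK_HZ):
    """Frequency is the reciprocal of the period (cycles -> seconds -> Hz)."""
    return clock_hz / period_cycles


def decode_pairs(pulses, threshold=THRESHOLD_CYCLES):
    """Sum each pair of pulse widths: above the threshold is 1, else 0.

    A trailing unpaired pulse is ignored here; in a real stream it would
    shift the alignment of everything that follows.
    """
    bits = []
    for i in range(0, len(pulses) - 1, 2):
        total = pulses[i] + pulses[i + 1]
        bits.append(1 if total > threshold else 0)
    return bits


# Byte 255 as eight LL pairs, using the 163-cycle width from the thread.
clean = [163, 163] * 8

# The same stream with one extra click splitting the second pulse 80/83.
noisy = [163, 80, 83] + [163] * 14

print(decode_pairs(clean))
print(decode_pairs(noisy))
print(round(period_to_frequency(163 + 163), 1), "Hz for one LL period")
```

Under these assumptions the clean stream decodes to eight 1 bits, while the stream with the extra click yields two wrong 0 bits and leaves one pulse over. That matches the misalignment Laze describes, and it also shows why Scott's point matters: the measured quantity is a period, so the same numbers can be read as a frequency.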
