Re: D64 with IDE64 and a parallel connected 1541... record beaten!?
Date: 2005-11-22 19:14:04

On 2005-11-22, at 18:01, MagerValp wrote:

> If you look at the IDE64 performance measurements:
> you'll see that IDE64 can do sustained writes at about 40-45 kB/s.
> That leaves about 10 seconds for GCR decoding and drive delays (track
> and sector seek, etc). At about 44 cycles per byte, I'm guessing
> there's still a little room for improvement, though the effort needed
> might not be worth it as it'd be hard to gain more than maybe 5 secs.

When I started writing this, my target was about 20 seconds. But when
I got the first alpha running, I was quite disappointed with the
results: I had expected to start optimising from around 25 seconds,
not from the 29 I actually achieved.

Now I believe I'll be able to meet the target, but even if I remain at
21-22 seconds it will still be acceptable for me... as long as no one
writes a faster one, that is ;-) Of course I wouldn't mind making it
even faster right now! There are still a lot of minor optimisation
possibilities that I am aware of, but those are not that important.

> And still: you can't compare streaming raw gcr data without error
> checks from one drive to another with fully decoding sectors.

Again - I'd be very glad to learn about those fast GCR routines  
Antitrack is referring to.

> And to get this thread back on topic: does anyone have fast, table
> based GCR decoding routines? Patrycjusz, what does your solution look
> like, if you don't mind sharing?

Sure. Long answer: I started with the 1571 ROM assembly sources found
on zimmers, but since they are barely commented and also lack some
parts, namely some of the referenced tables, I decided to write my
own. I spent a considerable part of one recent weekend writing my GCR
tables from scratch, and once I had written the routine and counted
cycles I found that it was still some six to seven percent slower
than the one in the 1571's ROM. Therefore I took the 1571 routine and
used it as a drop-in replacement. This had the immediate effect of
dropping the time from 29 to 25 seconds. That was already a good
start, but last weekend I optimised it a bit by inlining the main part
of the code and removing unnecessary use of temporary storage. Short
answer: I use the (faster) 1571 routine, inlined and without temporary
BIN storage - the decoded BIN quartets go directly to the output
buffers rather than to ZP locations.
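
For illustration only, a minimal sketch of what table-based GCR
decoding means, written in Python rather than 6502 assembly. It is
neither the 1571 ROM routine nor the inlined routine described above;
the names GCR_ENCODE, GCR_DECODE, encode_group and decode_group are
made up for this example. Only the 4-to-5 code table is the standard
Commodore one. A real drive or C64 routine would typically avoid the
shifting done here by using several lookup tables indexed with
pre-masked bytes, which is where the cycle counting comes in.

```python
# Standard Commodore 4-to-5 GCR code: index = 4-bit nibble, value = 5-bit code.
GCR_ENCODE = [0x0A, 0x0B, 0x12, 0x13, 0x0E, 0x0F, 0x16, 0x17,
              0x09, 0x19, 0x1A, 0x1B, 0x0D, 0x1D, 0x1E, 0x15]

# Inverse table built once: 32 entries, None marks an invalid 5-bit code.
GCR_DECODE = [None] * 32
for _nibble, _code in enumerate(GCR_ENCODE):
    GCR_DECODE[_code] = _nibble


def encode_group(data):
    """Encode 4 data bytes into 5 GCR bytes (hypothetical helper)."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 data bytes")
    bits = 0
    for byte in data:
        # High nibble first, then low nibble, 5 bits each: 40 bits in total.
        bits = (bits << 5) | GCR_ENCODE[byte >> 4]
        bits = (bits << 5) | GCR_ENCODE[byte & 0x0F]
    return bits.to_bytes(5, "big")


def decode_group(gcr):
    """Decode 5 GCR bytes into 4 data bytes via table lookup."""
    if len(gcr) != 5:
        raise ValueError("expected exactly 5 GCR bytes")
    bits = int.from_bytes(gcr, "big")
    out = bytearray()
    for i in range(4):
        # Each output byte is two 5-bit codes taken from the 40-bit group.
        hi = GCR_DECODE[(bits >> (35 - 10 * i)) & 0x1F]
        lo = GCR_DECODE[(bits >> (30 - 10 * i)) & 0x1F]
        if hi is None or lo is None:
            raise ValueError("invalid GCR code in group")
        out.append((hi << 4) | lo)
    return bytes(out)
```

In this sketch the decoded nibbles are combined and appended straight
to the output, with no intermediate storage, which loosely mirrors the
idea of sending the BIN quartets directly to the output buffers.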

  Every programmer knows the answer: $2b or (not $2b) is $ff.

       Message was sent through the cbm-hackers mailing list
