As you are all asking so many questions, I will
post the Amiga internal report on SCSI - we didn't
just pick SCRIPTS because it sounded cool ;-)
------------------------------------------------------------------------
AmigaOne SCSI standards
***********************
Fleecy has also asked me to recommend a PCI SCSI
card or chipset on which to standardise, and I'm
fairly sure I know the right ones to pick. I've been
a SCSI enthusiast since 1987, and written drivers
for (very) dumb and smart controllers. I do understand
the options and the issues, as I'll show. We should
standardise on an 'NCR SCSI scripts' processor, for
the following reasons.
These are low-cost high-performance single-chip PCI
bus-mastering SCSI controllers. They are widely
available in SCSI 2 FAST and Ultra/SCSI-3 versions,
originally from NCR then Symbios, now a brand of
LSI Logic, http://www.lsilogic.com/ . There is a
good OEM support program there - another reason for
choosing this rather than an Adaptec proprietary
part, for example - and they encourage people to
buy either boards with chips on (e.g. from lots of
firms in China, Korea and Taiwan) OR the chips
themselves for integration (as on Commodore's
A4000T) so programming and electronic specs are
both good and easy to get. Hence they are widely
supported with free and compatible drivers, and
as close to a commodity as controller chips get.
You can buy PCI cards in this architecture from
ASUS, Diamond Multimedia, DTC, Intraserver, Lomas
Data, SW Technology, Topsys and Tyan, among others.
Each offers versions that match several variants of
the SCSI standard - unsurprisingly, as the software
drivers and PCI connector stay almost the same, and
only the integrated controller/interface chip and
SCSI sockets need to change (basically).
There are lots of chips in the 'SCSI scripts' family
and they all have a similar programming interface,
which I'm very familiar and happy with as both a user
and a programmer.
Chips in this family include the 53C710 used in the
Warp Engine, CSA Magnum and A4091 (Zorro 3 cards and
accelerators - I've written drivers down to the metal
for these, in Amiga Qdos) which are SCSI 2 FAST. These
were the fastest, lowest-overhead SCSI 2 controllers for
Amigas, and in my experience very tolerant of cable
and termination mistakes that clobber rivals. This
feature is trademarked as 'TolerANT' and involves the
controller monitoring the bounce on the I/O lines and
tuning the line drivers accordingly. It works. :-)
The Cyberstorm 3 and Cyberstorm PPC were based on
a Symbios clone of the NCR53C770 Ultra SCSI script
processor. Phase 5 abandoned the Emulex FAS216 they
had inherited from the Fastlane Z3 cards and used
in Mark 1 and 2 Cyberstorms, and so the Mark 3 was
faster and more reliable, with lower CPU overhead
than earlier models, even on 'narrow' SCSI drives.
The main snag of the Mark 3 SCSI was that it only
shipped with wide (68 pin) connections and required
expensive connectors and cables to work with cheap
and common SCSI 2 gear. It's important that we
should offer a choice of SCSI 2 FAST and 'Ultra
Wide SCSI 3' (pick your labels :-) controllers so
people can use their existing equipment (scanners,
DATs, ZIPs as well as intelligent fixed drives)
with low cost, and those who have or want later
'wide' gear can use it, without us having to
write new drivers.
The most basic PCI version of this is the 53C810, which
I use (in the Symbios remix) in my Devbox. The chip has
the same advantages but even higher integration as it
drops onto PCI rather than the CPU bus or - via glue -
to Zorro 3. However SCSI 2 FAST is a minimum, these
days; raw IDE can outrun it, though not if you have
several drives active at a time. There are ultra wide
and differential versions of the PCI chips, too.
These are some of them - there are probably others:
53C810A: Fast SCSI-2 (10 MB/s)
53C815: Fast SCSI-2
53C825: Fast Wide SCSI-2 (20 MB/s)
53C860: Fast-20 SCSI
53C875: Fast-20 Wide SCSI (40 MB/s)
53C895: Ultra2 LVD (80 MB/s)
53C896: Dual-channel Ultra2 LVD (80 MB/s per channel, 160 MB/s total)
The part numbers might have NCR, SYM or LSI prefixes. The 5 suffix
indicates support for a BIOS ROM - which can be serial or flash,
programmable in situ - this is mainly to help ignorant PCs boot from
SCSI. It's not clear if this will be needed for AmigaOne - probably
not if there's room for a bootstrap loader in the ROM that configures
PCI, but if not we should be able to put our own code in without much
trouble.
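The headline rates in the list above are just clock rate times bus
width. A small Python sketch of that arithmetic follows; the function
and table names are made up for illustration, and the figures are peak
synchronous burst rates, not what a real drive will sustain.

```python
def peak_rate_mb_per_s(clock_mhz, bus_bits):
    """Peak synchronous rate: one transfer of bus_bits per clock cycle."""
    return clock_mhz * bus_bits // 8


# (transfer clock in MHz, bus width in bits) for each SCSI flavour
VARIANTS = {
    "Fast SCSI-2": (10, 8),
    "Fast Wide SCSI-2": (10, 16),
    "Fast-20": (20, 8),
    "Fast-20 Wide": (20, 16),
    "Ultra2 Wide LVD": (40, 16),
}

for name, (clock, bits) in VARIANTS.items():
    print(f"{name}: {peak_rate_mb_per_s(clock, bits)} MB/s")
```

A narrow Fast-20 chip and a Fast Wide one land on the same 20 MB/s by
different routes, which is why the labels get confusing.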
Do not confuse these with the old 5380 chips used in
old Macs (and Emplant). These were very limited in speed
and required host interventions for every SCSI phase
change. The 53C90 (in Mac 2s) could get by with about
half as much driver code as it had hardware arbitration
for common SCSI bus state changes, but it was still a
'dumb' chip reliant on interrupting the host whenever
a decision needed to be made.
Drivers for all the smart chips are very similar, and
a lot shorter and simpler than the drivers for rival
SCSI controllers, thanks to a (very) RISC 'SCSI
scripts' processor which takes all the load of
SCSI bus phase control off the main system,
and the programmer of the driver :-)
The SCSI scripts program and state machine work out
whether we are handling commands or data, resolves
contention and hard and soft errors, allowing drives
to disconnect and reconnect so they don't clog the bus
while they perform internal operations, etc... The
result is a driver that looks simple but handles all
complicated cases implicitly, and takes multi-threading
in its stride.
SCSI scripts are made of 64 or 96 bit RISC instructions
read from the host memory by DMA or from internal memory
on later chips in the series. We don't need to write
a SCSI script program, though it may be useful - NCR's
standard ones cope with all the SCSI phases and types
of transfer, and most implementations just use them and
wrap host code around it to trap completion interrupts
and set it off again.
The 53C7xx and 53C8xx parts have a relatively tiny host
overhead - between one and five per cent of that for a
second-generation 53C90 - because once you've told it
what to do the controller goes ahead and does it, coping
with disconnection and reconnection so other devices
can share the bus and the host doesn't have to wait for
seeks - a major advantage of SCSI over IDE - and scatter/
gather operations for blocks fragmented through memory,
which will be significant in implementing virtual memory.
Anyhow, arbitrarily-sized blocks of data are moved
to or from host memory by PCI DMA and the only interrupt
is at the end when the job - or a sequence of transfers
- is done, or if an error occurs in the meantime.
You don't /have/ to use DMA or SCSI scripts, though it
is most efficient - for test purposes you can treat the
controller as an entirely dumb one and peek and poke
a byte at a time, which may be useful for a minimal
bootstrap or support for sub-standard peripherals.
It can do more than just read and write the SCSI bus
- as it has bidirectional DMA you can use it as a
general-purpose block transfer device to take the
load off the CPU when moving data around main memory
or between motherboard RAM and video RAM, let's say.
It can even do horizontal and vertical scrolling and
window operations, though you'd probably want to do this
using the video card local CPU and bus in practice :-)
The memory move instruction is stunningly simple, and
a good example of SCSI scripts. It's 96 bits long.
The first byte is 192 (top two bits set - these sift
between four basic RISC instruction groups). The
rest of the first (long) word is a 24 bit count of
bytes to be moved. The next two words are the 32 bit
source and destination addresses. The main limitation
is that those must have the same byte alignment, as
the chip does 32 bit transfers, and tries to collect
them in line bursts if appropriate.
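To make that layout concrete, here is a minimal Python sketch that
builds the three longwords just described. The helper names are
hypothetical, and two things in it are assumptions rather than facts
from the data manual: that 'same byte alignment' means the same offset
within a 32 bit word, and the big-endian packing, which is chosen only
so the first byte comes out as 192 as in the description. Check the
byte order your host and chip actually use before relying on it.

```python
import struct

MEMORY_MOVE = 0xC0          # 192: top two bits set, as described above
MAX_COUNT = (1 << 24) - 1   # the byte count is a 24 bit field


def memory_move(count, source, dest):
    """Return the three 32 bit words of one memory move instruction.

    Hypothetical helper, for illustration only.
    """
    if not 0 <= count <= MAX_COUNT:
        raise ValueError("count must fit in 24 bits")
    for addr in (source, dest):
        if not 0 <= addr <= 0xFFFFFFFF:
            raise ValueError("addresses are 32 bit")
    # Assumption: 'same byte alignment' = same offset within a longword.
    if (source & 3) != (dest & 3):
        raise ValueError("source and destination alignment differ")
    return ((MEMORY_MOVE << 24) | count, source, dest)


def pack_big_endian(words):
    """Pack the words most significant byte first (an assumption)."""
    return struct.pack(">3I", *words)


words = memory_move(4096, 0x00200000, 0x00400000)
image = pack_big_endian(words)
print(image.hex())
```

A copy bigger than the 24 bit count allows would simply be split into
several such instructions, one after another.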
We don't strictly need this, but if we were to make
our driver offer this functionality to applications
(with a hardware abstraction using the host processor
or anything else appropriate if the SCSI copro is not
present) we would be making better use of PCI and our
choice of hardware than any other non-embedded system.
Likewise our SCSI device should support the whole SCSI
spec - not just transfers between SCSI devices and
memory, but between hosts sharing a bus - no problem
as long as they have different SCSI IDs - we should
eschew cards that fix the ID at 7 as it's an avoidable
limitation - and then they can all share CD ROMs and
other peripherals - even writable drives with careful
(software) arbitration. And yes, I *know* this works,
even on the old Amiga - I've seen Linux and AmigaOS
sharing drives on a SCSI chain this way, and there's
a SCSI networking example on Aminet that uses SCSI
direct commands to make a fast parallel heterogeneous
drive and computer cluster. There are other ways to
do this - Firewire, Ethernet, even USB at a pinch -
and I don't suggest that we should put effort into
implementing it ourselves - but we should specify
hardware and drivers that do not prevent it if we
or third parties see value in the concept, later.
Software issues
***************
The whole thing should be wrapped in whatever scheme
we use for DMA device drivers, so we're not committed
to the NCR family if something else comes along and
we write fresh drivers for it. We have to support
synchronous and async I/O, and SCSI-direct (which is
the Classic Amiga scheme to allow any command to be
sent directly to any device in a host-independent
way). The existing API is fine, and hence any superset
of it would be, except that the late addition of
QuickIO - where a device call may take place in the
caller's context, without context switching to another
handler or device driver task, and does not return till
complete - needs to be made a core, guaranteed part of
the spec. QuickIO could have been very useful to address
complaints about the OS getting in the way of dedicated
high performance systems like multi-tracking, but wasn't
useful in old Amiga products since not all handlers
and device drivers implemented it and Commodore defined
it as an option, not a requirement (so it was widely
ignored). This was a good idea which we should follow
through.
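For readers who have not met QuickIO, the handshake can be modelled in a
few lines. This is a toy Python model, not exec code: the classes and
function names are invented for illustration, and only the IOF_QUICK
flag bit and the rule that a driver clears it when it queues the
request instead of finishing it on the spot come from the Classic
Amiga scheme.

```python
from collections import deque

IOF_QUICK = 1 << 0   # exec's QuickIO flag bit in io_Flags


class IORequest:
    def __init__(self, command, flags=0):
        self.command = command
        self.flags = flags
        self.done = False


class ToyDevice:
    """Stand-in for a device driver; not real exec code."""

    def __init__(self, supports_quick):
        self.supports_quick = supports_quick
        self.queue = deque()      # work left for the driver task

    def begin_io(self, req):
        if self.supports_quick and (req.flags & IOF_QUICK):
            self._perform(req)    # runs in the caller's context
            return                # IOF_QUICK left set: already complete
        req.flags &= ~IOF_QUICK   # tell the caller it was queued instead
        self.queue.append(req)

    def run_driver_task(self):
        while self.queue:
            self._perform(self.queue.popleft())

    def _perform(self, req):
        req.done = True


def do_io(device, req):
    """Ask for QuickIO, then fall back to waiting if it was refused."""
    req.flags |= IOF_QUICK
    device.begin_io(req)
    took_quick_path = bool(req.flags & IOF_QUICK)
    if not took_quick_path:
        device.run_driver_task()  # stands in for the task switch and wait
    return took_quick_path


fast_req, slow_req = IORequest("read"), IORequest("read")
print(do_io(ToyDevice(True), fast_req), do_io(ToyDevice(False), slow_req))
```

The caller's code is identical either way; the difference is whether a
task switch happens, which is why an optional QuickIO gives a
time-critical application nothing it can count on.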
SCSI-direct allows custom support for new standard extensions,
non-standard or broken devices (like the NEC drives that interpret
binary parameters as BCD! 8-) by passing arbitrary SCSI commands
to a device, and marshalling the results, in a way that does not
obstruct standard uses or sharing of the SCSI bus.
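As an illustration of the kind of raw command SCSI-direct carries, here
is a minimal Python sketch that builds a standard 6 byte INQUIRY command
block and picks apart the fixed fields of the reply. On the Classic
Amiga the command bytes would travel in a struct SCSICmd sent with
HD_SCSICMD; the helper names and the sample reply below are made up for
the example and do not come from any real driver or drive.

```python
INQUIRY = 0x12   # SCSI INQUIRY opcode


def inquiry_cdb(alloc_len=36):
    """Build a 6 byte INQUIRY command block asking for alloc_len bytes."""
    if not 0 <= alloc_len <= 255:
        raise ValueError("allocation length is a single byte")
    return bytes([INQUIRY, 0, 0, 0, alloc_len, 0])


def parse_inquiry(data):
    """Extract the fixed fields from standard INQUIRY data."""
    if len(data) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")

    def text(field):
        return field.decode("ascii", "replace").strip()

    return {
        "device_type": data[0] & 0x1F,     # 0 = disk, 5 = CD-ROM, ...
        "removable": bool(data[1] & 0x80),
        "vendor": text(data[8:16]),
        "product": text(data[16:32]),
        "revision": text(data[32:36]),
    }


# Invented reply, standing in for what a drive would send back.
sample = (bytes([0x05, 0x80, 0x02, 0x02, 31, 0, 0, 0])
          + b"ACME".ljust(8) + b"CD-ROM EXAMPLE".ljust(16) + b"1.00")

print(inquiry_cdb().hex())
print(parse_inquiry(sample))
```

Working around a broken device then becomes a matter of adjusting the
bytes that go out or come back at this level, with no change to the
driver underneath.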
SCSI-direct allows specialist applications to use some of the
SCSI features that are not available on IDE or other types of
drive. For instance a SCSI drive can be programmed to search
itself (with fields to skip and check) and call any other device
back when it has found certain data. This requires a command
with no equivalent for other types of devices, which would
have to read all the data over the bus and check it with the
main CPU. We could add this function to our API and do it the
hard way for non-SCSI devices and cleverly for SCSI ones, but
there is no need to make this a standard interface - as long
as SCSI direct is available, programs for dedicated database
or streaming applications can access the functionality without
making life more complicated for conventional applications.
It would probably be worth adding this, especially if SCSI
takes off on Amiga or other drives (e.g. over ATAPI or
firewire) offer equivalent functions, but this should not
be a priority. For the time being SCSI-direct meets the
requirement for those that understand and need it. As it
is a low-level path into code that already exists to
implement more abstract I/O operations, the cost of making
it available is tiny, and we can build on it ourselves, for
instance to extend third-party drivers in an Amiga-general way.
Another neat trick possible with SCSI-direct, as long as
you know the topology of your system in a bit more detail
than device-independence allows, is to program a drive
to copy or mirror itself to another. This can be done
without host intervention (other than reselection when
it is done) as all SCSI devices - not just the host -
can master the SCSI bus and transfers can be between
any two devices, without blocking processes or other
transfers, subject to well-defined and efficient
priority and bus sharing protocols.
Hooray for SCSI! Hooray for SCSI scripts processors!