Session 15: I/O devices in Minix

Textbook: Chapter 3

I/O device definitions

control hierarchy
block v character devices
memory-mapped I/O v I/O ports
DMA vs direct communication
nonpreemptive resources and deadlock

Accessing a Winchester disk
layout
getting parameters
making a request
receiving a response

We're moving into Chapter 3 now.

Chapter 3 covers a huge portion of Minix - the part that deals with communicating with I/O devices. Honestly, this portion is pretty mundane - lots of details about how particular I/O devices work in Minix. We'll be moving through these details pretty quickly.

Today we'll look at general facts about I/O devices, and then we'll look at how a disk drive might work, by looking at one particular interface, an AT Winchester disk drive.

I/O device definitions

Control hierarchy

Transferring information between a user process and a hardware device is a complex, multistep process.

user process
server
device driver
interrupt handler
bus transfer
controller chip
physical medium
A user process will call a device-independent server, which manages access to devices in a device-independent way. This server might check permissions, or work out from the request exactly which device the process is looking for. The server sends a message to the device driver, which translates the request into the protocol actually implemented in the hardware. The driver sends this onto the bus that connects the CPU and all its devices. A controller chip on the hardware device receives this information and uses it to direct the actual hardware to do what is requested.

When the actual hardware has information to send back, it gives it to the controller chip, which places it onto the bus, which triggers an interrupt in the system to be handled by the OS interrupt handler. The OS interrupt handler sends the information onto the relevant device driver, which translates the information into a device-independent representation for the server to handle.

Block v character devices

One important division line between devices separates the block devices from the character devices. Block devices work in whole blocks at a time, and they are ``random access'' - that is, the computer can access any portion of the data at any time. Disk drives are a good example of a block device.

A character device works instead with a constant stream of characters that must be accessed one at a time. A keyboard is a good example of a character device, as the device won't allow you to read an arbitrary character in the sequence at any time: It will only give you the next character of the stream. Network cards are also character devices, as are mice and printers.

Displays don't fit neatly into this scheme. Older displays - the character-oriented displays - are classical character devices. Newer graphical displays tend to be more block-oriented. You don't tend to read large blocks at a time, but they're random access, so the block device matches them better.
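To make the distinction concrete, here is a minimal sketch in C of what the two kinds of interface might look like. Everything in it (block_dev, char_dev, block_read, char_read, the sizes) is invented for illustration and is not Minix's actual interface; the devices are simulated with in-memory data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512   /* illustrative block size */
#define NBLOCKS    8     /* illustrative device size */

/* Block device: any block can be fetched at any time, a whole block at once. */
struct block_dev {
    uint8_t blocks[NBLOCKS][BLOCK_SIZE];
};

void block_read(const struct block_dev *dev, size_t block_no,
                uint8_t out[BLOCK_SIZE])
{
    assert(block_no < NBLOCKS);
    memcpy(out, dev->blocks[block_no], BLOCK_SIZE);
}

/* Character device: only the next character of the stream is available. */
struct char_dev {
    const char *stream;  /* simulated input, ends at the terminating '\0' */
    size_t next;         /* position of the next unread character */
};

/* Returns the next character, or -1 once the simulated stream is exhausted. */
int char_read(struct char_dev *dev)
{
    if (dev->stream[dev->next] == '\0')
        return -1;
    return (unsigned char)dev->stream[dev->next++];
}
```

Notice that block_read takes a block number, so the caller picks where to read, while char_read takes no position at all: the device decides what comes next.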

Nonpreemptive resources and deadlock

Another important distinction is whether a device is preemptible. One process preempts another's use of a resource when it takes over the resource while the other process is not yet finished with it. This is a sensible thing to do with a disk, since switching to a different location on a disk isn't horrible.

Some resources are nonpreemptive, however. For example, the printer shouldn't be preempted while a process is in the middle of printing a document. You don't want to get pages of one document interspersed with pages of another document. CD-ROM recorders and tape drives are two more examples of devices where the process that uses them really needs exclusive access.

Nonpreemptive resources can lead to deadlock situations, where the needs of different processes lead to a situation where none of the processes can complete. For example, say a process that has control of the printer decides it also needs the tape drive, and a process that has control of the tape drive needs the printer to continue. Neither can continue, and so we have a deadlock situation.
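The printer and tape drive example is a circular wait, and that can be sketched as a short check in C. This is only an illustration under simplifying assumptions (each process waits for at most one resource, each resource has at most one holder); the names holder, wants, and in_wait_cycle are made up, and this is not something Minix does.

```c
#include <assert.h>
#include <stdbool.h>

#define NPROC 4      /* illustrative number of processes */
#define NONE  (-1)

/* holder[r] = process currently holding resource r, or NONE.
   wants[p]  = resource that process p is blocked waiting for, or NONE.
   Returns true if following the "waits for" chain from `start`
   leads back to `start`, i.e. start is part of a circular wait. */
bool in_wait_cycle(int start, const int holder[], const int wants[], int nres)
{
    assert(start >= 0 && start < NPROC);
    int p = start;
    for (int steps = 0; steps < NPROC; steps++) {
        int r = wants[p];
        if (r == NONE)
            return false;      /* p is not blocked, so the chain ends */
        assert(r >= 0 && r < nres);
        p = holder[r];
        if (p == NONE)
            return false;      /* the resource is free, so p can proceed */
        if (p == start)
            return true;       /* the chain came back around */
    }
    return false;              /* no cycle through start within NPROC steps */
}
```

With resource 0 as the printer and resource 1 as the tape drive, setting holder to {0, 1} and wants to {1, 0, NONE, NONE} describes the situation above, and the check reports that processes 0 and 1 are both stuck.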

This is a relatively rare situation in practice, so most operating systems don't worry about it. But you should know that it does pop up.

We've already seen Minix handling deadlock in one simple situation - if process A wants to send a message to process B, and process B wants to send a message to process A, Minix will detect the situation and refuse one of the processes' requests, so that they don't get stuck.

But Minix doesn't tackle the problem of deadlock between nonpreemptive devices. The known techniques for doing this involve onerous requirements, and the problem doesn't pop up often, so Minix doesn't check for it.

By the way, if you take Databases next semester, you'll be hearing a lot more about deadlock, as it's a bigger problem in databases.

Memory-mapped I/O v I/O ports

There are two major ways that CPUs might communicate with devices. One is memory-mapped I/O, where to send a message to a device, the CPU stores the data at a special memory address, which gets forwarded onto the device. An alternative technique is the I/O port, a separately numbered location through which the CPU sends data to a device directly. The port numbering scheme is independent of the memory addressing scheme.

Memory-mapped I/O can take advantage of the quick communication with memory, and it doesn't require additional CPU instructions. However, ports are conceptually simpler, and they put the numbering system into a different numbering space from the memory sequence. The tradeoff is pretty even, and some designers have gone one way, and some go another. In general, Intel went for I/O ports.
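Here is a rough sketch in C of how the two styles look from the programmer's side. Real port access needs privileged instructions, so both are simulated here: fake_mmio_register and fake_port_space are stand-ins invented for this example, as are the function names.

```c
#include <assert.h>
#include <stdint.h>

/* Memory-mapped style: the device register has an ordinary memory address,
   so a plain store reaches it. A variable stands in for the register. */
static volatile uint8_t fake_mmio_register;

void mmio_write(volatile uint8_t *reg, uint8_t value)
{
    assert(reg != 0);
    *reg = value;   /* an ordinary store; no special I/O instruction needed */
}

/* Port style: a separate address space, reached on Intel CPUs through
   dedicated in/out instructions. An array stands in for that space. */
#define PORT_SPACE_SIZE 65536
static uint8_t fake_port_space[PORT_SPACE_SIZE];

void port_write(uint16_t port, uint8_t value)
{
    /* real code would execute an `out` instruction here */
    fake_port_space[port] = value;
}
```

The point to notice is that mmio_write takes a pointer, which is a memory address like any other, while port_write takes a port number that has no meaning as a memory address.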

DMA vs direct communication

Another distinction in how CPUs might choose to communicate with devices is direct memory access (DMA) versus direct communication. For example, if the CPU used direct communication, and the computer wanted to read from the disk, it would read a byte at a time directly from the controller chip. With DMA, however, it would tell the controller chip to copy a sequence of bytes into memory, and the controller chip would handle this copying process itself.

The reason for DMA is that it relieves the CPU of the responsibility of copying the data to memory (since the data read from disk is eventually going into memory anyway), so it doesn't have to waste instructions doing the copying itself. This means that it can spend its instructions doing something else. Additionally, it's a little simpler to not have to include those instructions as part of the OS.

The reason for direct communication is that it's potentially more efficient. In an I/O-oriented system, the controller chip copies data into memory slowly relative to the CPU, since the CPU is a heavily optimized processor, whereas the controller chip is typically a very weak processor. Also, direct communication removes some responsibility from the controller chip, so that it can be that much simpler.

In multiprocess high-performance environments, DMA makes more sense. In low-cost or single-process environments, direct communication makes more sense. IBM PCs come with DMA capability, but it's not consistently used, since the CPU is generally much faster and most PC OSes are built for a single user.

Accessing a Winchester disk

The Winchester disk defines a relatively simple but realistic interface between the CPU and a hard drive. A hard drive is a block device. In the case of the Winchester interface, it uses the port interface, and it uses direct communication rather than DMA. I want to look at its specific protocol from a hardware point of view, and later we can look at the source code of Minix that works with a Winchester disk.

Layout

A hard drive consists of a set of platters, each side divided into concentric circles called tracks, with each track divided into arcs called sectors. A ``modern'' Winchester drive would have two platters (four sides), with 1048 tracks per side, and 252 sectors per track. Each sector has 512 bytes. The total size of such a disk would be 4*1048*252*512 bytes = 515.8MB. (It would be advertized as 540.8MB, however, because disk manufacturers always exaggerate a disk size by interpreting a megabyte as 10^6 bytes, not 2^20 bytes.)

Actually, the above is simplifying matters. That's how the computer would see it, but in fact the disk is simplifying messy details, like how there are differing number of sectors on different tracks - in order to make the most out of the outermost tracks.
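The capacity arithmetic for the example drive is easy to check with a few lines of C. The function names here are just for illustration. The second function reports tenths of a megabyte and truncates rather than rounds, which matches the figures quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes on a disk with the given geometry. */
uint64_t disk_capacity_bytes(uint32_t sides, uint32_t tracks_per_side,
                             uint32_t sectors_per_track,
                             uint32_t bytes_per_sector)
{
    assert(sides > 0 && tracks_per_side > 0);
    assert(sectors_per_track > 0 && bytes_per_sector > 0);
    return (uint64_t)sides * tracks_per_side
         * sectors_per_track * bytes_per_sector;
}

/* Capacity in tenths of a megabyte, truncated, where one megabyte is
   bytes_per_mb bytes: 1048576 (2^20) for the binary convention,
   1000000 (10^6) for the convention used in advertising. */
uint64_t capacity_tenths_of_mb(uint64_t bytes, uint64_t bytes_per_mb)
{
    assert(bytes_per_mb > 0);
    return bytes * 10 / bytes_per_mb;
}
```

For the drive above, disk_capacity_bytes(4, 1048, 252, 512) gives 540868608 bytes, which is 5158 tenths (515.8MB) using 2^20 and 5408 tenths (540.8MB) using 10^6.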

Reading something from the disk is a three-step process:

  1. the heads move to the proper track
  2. the drive waits until the proper sector is beneath it
  3. the correct head is enabled, and the magnetic signals it receives are interpreted
Of these, the slowest step by far is the first step - moving the heads to reach the proper track. Because of this, people tend to talk about cylinders - a vertical slice through the platters containing tracks of the same diameter. So the first move moves to the proper cylinder (this is the seek time), the next move moves to the proper sector of that cylinder (the rotational delay), and the final move lasts as long as it takes for the sector to move under the head (the data transfer time).

Getting parameters

For the OS to use the disk, it needs to understand the device. At boot time, the OS will acquire some information about the drive. This includes the number of cylinders, heads, and sectors. It also includes a configuration number called the ``precompensation,'' which I don't understand but which is nonetheless required.

The OS actually reads this from the BIOS, the ROM located at a fixed location in memory. The BIOS gets it from the user configuration.

Making a request

The disk has a number of ports available with which a program can communicate directly with the disk. I'll number them beginning at 0, but actually they begin at 0x1F0 for one disk and 0x170 for another. So actually Minix supports two hard disks, and when I talk about port 2 of a disk, I'm actually talking about either 0x1F2 or 0x172 depending on which disk.

When it's time to make a request, the OS sends several pieces of data to the disk. All of these are 1-byte quantities sent using the outb instruction.

port 1: the precompensation number
port 2: the number of sectors wanted
port 3: the sector at which to start
port 4: the low byte of the cylinder number from which to read
port 5: the high byte of this cylinder number
port 6: some configuration flags and the head number
Finally, the OS sends to port 7 of the hard drive a command. The two most important commands are operation 32 (for read) and operation 48 (for write).
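The sequence of writes above can be sketched in C as follows. Since real port output is privileged, sim_outb and sim_ports are stand-ins invented for this example so that it runs as an ordinary program; struct win_request and win_issue are also hypothetical names, not taken from the Minix source. The port offsets, base addresses, and command numbers are the ones given in the text.

```c
#include <assert.h>
#include <stdint.h>

#define WIN_BASE_FIRST  0x1F0   /* port 0 of the first disk */
#define WIN_BASE_SECOND 0x170   /* port 0 of the second disk */
#define WIN_CMD_READ    32
#define WIN_CMD_WRITE   48

/* Stand-in for the real port space, so the sketch runs in user mode. */
static uint8_t sim_ports[0x400];

static void sim_outb(uint16_t port, uint8_t value)
{
    assert(port < sizeof sim_ports);
    sim_ports[port] = value;   /* a real driver would do a 1-byte port write */
}

/* Hypothetical record of one request. */
struct win_request {
    uint8_t  precomp;      /* the precompensation number */
    uint8_t  count;        /* number of sectors wanted */
    uint8_t  sector;       /* sector at which to start */
    uint16_t cylinder;     /* split into low and high bytes below */
    uint8_t  flags_head;   /* configuration flags and head number, packed */
    uint8_t  command;      /* WIN_CMD_READ or WIN_CMD_WRITE */
};

void win_issue(uint16_t base, const struct win_request *r)
{
    sim_outb((uint16_t)(base + 1), r->precomp);
    sim_outb((uint16_t)(base + 2), r->count);
    sim_outb((uint16_t)(base + 3), r->sector);
    sim_outb((uint16_t)(base + 4), (uint8_t)(r->cylinder & 0xFF));
    sim_outb((uint16_t)(base + 5), (uint8_t)(r->cylinder >> 8));
    sim_outb((uint16_t)(base + 6), r->flags_head);
    sim_outb((uint16_t)(base + 7), r->command);  /* the command goes last */
}
```

Passing WIN_BASE_FIRST or WIN_BASE_SECOND as the base is what selects the disk; the offsets 1 through 7 are the same for both.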

Receiving a response

At this point the disk goes off and does its job. This takes quite a while. (Think about it: a 10,000 RPM disk rotates once every 6 milliseconds, and on average we're going to have to wait for it to rotate halfway, for a total of 3 milliseconds. A 1000-MHz computer can do roughly a billion instructions a second, so 3 milliseconds represents 3 million instructions that can occur while the disk is rotating into place. And this doesn't even account for the seek time, which is much larger! (A seek time might be 7 ms on average.))
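The back-of-the-envelope numbers above can be reproduced with a small C sketch. The function names are illustrative, and the instruction count assumes one instruction per clock cycle, which is the same rough assumption the estimate above makes.

```c
#include <assert.h>
#include <stdint.h>

/* Time for one full rotation, in microseconds, for a disk spinning at rpm. */
uint32_t rotation_time_us(uint32_t rpm)
{
    assert(rpm > 0);
    return 60u * 1000000u / rpm;   /* 60 seconds per minute, in microseconds */
}

/* Average rotational delay: on average the sector is half a turn away. */
uint32_t avg_rotational_delay_us(uint32_t rpm)
{
    return rotation_time_us(rpm) / 2;
}

/* Instructions a CPU could execute during delay_us microseconds,
   assuming one instruction per cycle (MHz = cycles per microsecond). */
uint64_t instructions_during(uint32_t delay_us, uint32_t cpu_mhz)
{
    return (uint64_t)delay_us * cpu_mhz;
}
```

For a 10,000 RPM disk, rotation_time_us gives 6000 and avg_rotational_delay_us gives 3000; instructions_during(3000, 1000) gives 3000000. A 7 ms seek adds another 7000000 by the same estimate.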

Industrial-strength operating systems will block the process at this point so that other processes can take advantage of those millions of available instructions. Minix, however, simply busy-waits until it receives an interrupt.

After the interrupt is received, the disk is ready to receive or send bytes. The OS sends or receives information a word at a time via port 0. Each successive access refers to the next word in the sector.
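The transfer loop for a read might look roughly like this in C. The disk is simulated here: struct sim_disk and sim_read_port0 are stand-ins invented for this example, and the sketch assumes a word is 16 bits, so a 512-byte sector takes 256 accesses to port 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512
#define SECTOR_WORDS (SECTOR_BYTES / 2)   /* assumes 16-bit words */

/* Simulated disk: one sector of data plus a cursor into it. */
struct sim_disk {
    uint16_t sector[SECTOR_WORDS];
    size_t next;   /* index of the next word port 0 will deliver */
};

/* Stand-in for a word-sized read of port 0: each access returns the
   next word of the sector, as described above. */
static uint16_t sim_read_port0(struct sim_disk *d)
{
    assert(d->next < SECTOR_WORDS);
    return d->sector[d->next++];
}

/* Copy one whole sector into buf, a word at a time. */
void win_read_sector(struct sim_disk *d, uint16_t buf[SECTOR_WORDS])
{
    for (size_t i = 0; i < SECTOR_WORDS; i++)
        buf[i] = sim_read_port0(d);
}
```

Note that the loop never tells the disk which word it wants. The port number stays the same and the disk advances on its own, which is why the OS must read the words strictly in order.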