Hard Disk Drives
A hard disk drive (HDD) is a device for storing programs and data on a non-volatile storage medium. The term non-volatile means that the stored information is not lost when the power to the computer is turned off, unlike the contents of the computer's random access memory (RAM).
In a typical desktop computer, the hard disk drive is used to store the operating system, application programs and user data. All of this information is stored on disk as a collection of electronic files. The very first hard disk drive was developed by IBM in the mid 1950s and consisted of an assembly comprising fifty 24" diameter disks with a total storage capacity of 4.4 megabytes. It weighed over one ton, and cost in the order of $50,000.
It was not until nearly a quarter of a century later that a firm called Shugart Technology (which subsequently changed its name to Seagate Technology) began manufacturing hard disk drives for personal computers. Their first offering, released in 1980, was the ST-506 (see below). It had a storage capacity of 5 megabytes and a data transfer rate of 625 kilobytes (five million bits) per second. The only similarity with IBM's original hard disk drive was its storage capacity. The ST-506 was designed to fit into the same type of drive bay as a 5.25" floppy disk drive, weighed only five pounds, and was initially priced at around $1,500.
The Seagate ST-506 hard disk drive
Today, the hard disk drive is still the primary non-volatile storage device to be found inside a personal computer. Thanks to advances in technology and improved manufacturing techniques, however, both the size and weight of the hard disk drive have been considerably reduced. Data transfer speeds and storage capacity have increased dramatically, while the price of hard disk storage has fallen from around $300 per megabyte to a small fraction of a cent per megabyte (several thousand megabytes per dollar).
At the time of writing, the hard disk drives typically found in personal computers have capacities measured in hundreds of gigabytes, and drives with capacities of several terabytes are now available. As of 2010, nearly all hard disk drives are manufactured by (in alphabetical order) Hitachi, Samsung, Seagate, Toshiba or Western Digital.
Hard drive technology
The main component of the hard disk drive (as its name suggests) is a rigid non-magnetic disk usually referred to as a platter. Most hard disk drives actually have two or more platters, mounted on a single rotating spindle. The disks have in the past been constructed using a light aluminium alloy, but more recently a mixture of glass and ceramic material has been used to produce disks which are thinner and more resistant to heat.
The disks are machined to achieve a smooth, highly polished finish. Each side of the disk is then coated with a thin layer of magnetic material (usually a cobalt-based alloy). The magnetic surface of the disk is populated with billions of tiny magnetic regions known as domains, which are initially created by a strong localised magnetic field generated by the drive's write heads as they pass over the disk.
Each domain acts as a dipole, i.e. it has its own magnetic field which acts in one of two possible directions, depending on the polarity of the magnetic field used to create it. Each domain stores exactly one bit (binary digit) of data. The value of the bit will be one or zero, depending on the orientation of the domain's magnetic field.
Each of the drive's read heads will contain a detector made of a material that has magneto-resistive properties. When data is read from the disk, the magnetic field stored in each domain will cause the resistance of the detector in the read head to vary in accordance with the field's polarity. The bit value stored in each domain is interpreted as a binary one or a binary zero, depending on the way in which the resistance of the detector is affected by its magnetic field.
Each surface of every disk in a hard disk drive is organised into tracks and sectors during a process known as a low-level format, which occurs at the factory. The tracks form concentric circles that radiate outwards from the spindle in the centre to the edge of the disk.
Track density varies from one hard drive to another, but modern hard drives typically have track densities measured in tens of thousands of tracks per inch. A sector is a subdivision of a track that can hold a fixed amount of data (this is usually set at 512 bytes per sector). The diagram below shows a typical arrangement of tracks and sectors (sometimes referred to as the disk's geometry).
Tracks and sectors on the surface of a platter
The platters are mounted on a central spindle one above the other so that they all rotate together at a uniform speed. Tracks that occupy the same track position on both sides of the platters form what is known as a cylinder. The concept of a cylinder is important because the read/write heads move together over the same position on each surface of each platter. Read and write operations can be carried out more quickly if units of data that belong together are written to sectors and tracks that belong to the same cylinder, because the lateral movement of the read/write heads can be kept to a minimum.
As you can see from the somewhat simplistic diagram above, the sectors nearest to the centre of the disk would need to have a higher number of domains per unit length than those nearest to the outer edge of the disk in order to contain the same amount of data. This is wasteful, since the recording density is limited by what the shortest, innermost track can hold, and the longer outer tracks would then be recorded at a far lower density than the medium can support. The solution is zoning, in which tracks are grouped into zones and tracks nearer to the centre of the disk contain fewer sectors per track than those nearer to the edge.
When the disk drive is powered up, the disks are rotated on their spindle by a motor at a constant speed of several thousand rpm (to put that in context, the outer edge of a 3.5 inch platter spinning at 7200 rpm moves past the read/write head at roughly eighty miles per hour). There is a read/write head for each side of every platter, each mounted on the tip of a separate actuator arm, although the actuator arms are linked, and move across the disk together.
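The linear speed of the disk surface under the head follows directly from the platter diameter and the rotation rate. The following is a minimal sketch, assuming a platter about 3.74 inches across (the figure this article gives for "3.5 inch" drives) spinning at 7200 rpm. The function name is illustrative only.

```python
import math

INCHES_PER_MILE = 63_360


def edge_speed_mph(platter_diameter_in: float, rpm: float) -> float:
    """Linear speed of the platter's outer edge, in miles per hour."""
    circumference_in = math.pi * platter_diameter_in  # inches travelled per revolution
    inches_per_hour = circumference_in * rpm * 60     # revolutions per minute -> per hour
    return inches_per_hour / INCHES_PER_MILE


# Assumed values: 3.74 inch platter, 7200 rpm (roughly 80 mph at the outer edge).
print(f"{edge_speed_mph(3.74, 7200):.0f} mph")
```

Tracks nearer the spindle have a smaller radius and therefore pass under the head proportionally more slowly.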
Once the disk is spinning, each read/write head is positioned a tiny fraction of a micrometre (a matter of nanometres in modern drives) away from the surface of the platter to which it is assigned. Because of this proximity, any mechanical shock experienced by the disk during operation can cause the head to make physical contact with the surface of the disk (what is known as a head crash), damaging the magnetic coating and destroying data in the process. Care should therefore always be exercised when installing or handling a hard disk drive.
Damage can also occur if dust or small particles of dirt somehow get into the casing. If one of these particles is caught between the heads and the surface of the disk, the magnetic coating will be damaged. Consequently, the internal drive components are housed inside a sealed metal casing. The only opening in the casing is a small breather hole to prevent the build up of air pressure inside the case, which has a filter to prevent the ingress of dust and other particles.
When the integrated drive controller receives a read or write request from the operating system, it operates an actuator that positions the read/write heads above the designated track on the disk by rotating the actuator arms through a precise angle. The heads are moved across the surface of the disk and into position at very high speed. Because the disk is spinning at several thousand rpm, the sector to be read from (or written to) will pass under the read/write head within a few milliseconds.
This ability to reach any storage location on the disk within milliseconds defines the hard disk drive as a random access device, unlike the serially accessed magnetic tape storage medium used on early mainframe systems. In tape storage systems, the tape must be wound forwards or backwards to the correct location before the data can be read. A typical disk drive actuator arm is illustrated below.
The actuator arm moves laterally across the surface of the disk
The read/write heads are mounted on a small assembly at the tip of the actuator arm called a slider. The actuator arm itself is constructed from spring steel that pushes the slider and its attached read/write heads down onto the surface of the disk when the disk is stationary. Once the disk reaches its operational speed of rotation, air is forced under the slider causing it to "lift off" in much the same way as an aircraft does when it reaches take-off speed. The read/write heads thus glide across the surface of the disk on a thin cushion of air.
For this reason, the air pressure inside the head disk assembly enclosure must be maintained within specific limits. If the air pressure falls below a certain level, there is a real danger that the heads will make physical contact with the surface of the disk, causing a head crash (for this reason, hard disk drives that are intended for use at very high altitudes require a specially designed sealed enclosure).
When the computer is powered down, the read/write heads are moved over an area of the disk on which no data is stored so that they can safely be brought to rest on the surface of the disk. This contact start/stop (CSS) zone, or landing zone as it is often called, is usually located near the centre of the disk. If power is interrupted unexpectedly, a spring mechanism or rotational inertia will ensure that the heads come to rest safely in the landing zone.
Because of the very high speeds at which the disks rotate and the large number of tracks per inch on the surface of the disk, it would be almost impossible for the drive's control circuitry to pinpoint the exact location of a particular track or sector on the surface of a disk. Information is therefore written to the disk during the low-level formatting process that can be used to orient the read/write heads during read or write operations.
When the disk drive's controller receives a read request from the computer's operating system, it translates the logical address provided into a physical address for the data in terms of a track number and sector number. It would be extremely inefficient to read just one bit or byte of data at a time, so data is read from the disk in blocks of a pre-determined size, starting at the physical address given. The data being read from the block is stored in a buffer until all of the data has been read.
Because a computer file typically occupies a large number of sectors on a disk, the data to be read is often located in contiguous sectors. Consequently, most operating systems impose a logical structure on the disk during the initial high level formatting process that organises the disk into groups of sectors called clusters. The size of each cluster depends on the operating system, and can sometimes be determined by the user during the installation of operating system software.
Whatever the size of a cluster however, each cluster can only store data from a single file and must occupy contiguous sectors on the disk. Many modern hard drive controllers have their own internal cache that allows them to store multiple consecutive sectors from the same track or cylinder when a read request is received. Disk access speeds can often be improved as a consequence, since clusters belonging to the same file tend to be stored in contiguous locations on the disk, and there is a high probability that consecutive read requests will refer to consecutive disk addresses.
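Because a cluster can only hold data from one file, the last cluster of a file is usually only partly filled. A minimal sketch of the arithmetic, assuming 512-byte sectors as quoted above and a hypothetical cluster of 8 sectors. The function name is illustrative and does not correspond to any particular file system's API.

```python
SECTOR_SIZE = 512  # bytes per sector, as quoted in the text


def clusters_needed(file_size: int, sectors_per_cluster: int):
    """Return (clusters allocated, unused bytes in the last cluster)."""
    cluster_size = SECTOR_SIZE * sectors_per_cluster
    clusters = -(-file_size // cluster_size)  # integer ceiling division
    slack = clusters * cluster_size - file_size
    return clusters, slack


# Hypothetical example: a 10,000-byte file on a volume with 8-sector (4,096-byte) clusters.
clusters, slack = clusters_needed(10_000, 8)
print(clusters, slack)  # 3 clusters, 2288 bytes unused
```

Larger clusters mean fewer clusters to keep track of, but more space is wasted at the end of each file.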
Addressing and disk capacity
The manner in which a sector is addressed has undergone some changes since the early days of disk drives. At one time, the location of a specific sector was referenced using its cylinder number, head number and sector number (this addressing scheme is often abbreviated to CHS). Indeed, the total number of sectors on the drive could be calculated by multiplying the number of cylinders by the number of read/write heads, and then multiplying the result by the number of sectors per track.
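The CHS capacity calculation described above can be sketched as follows. The geometry used in the example (1,024 cylinders, 16 heads, 63 sectors per track) is an assumed illustration and is not taken from this article.

```python
def chs_total_sectors(cylinders: int, heads: int, sectors_per_track: int) -> int:
    """Total sectors on a drive with a fixed number of sectors per track."""
    return cylinders * heads * sectors_per_track


def chs_capacity_bytes(cylinders: int, heads: int, sectors_per_track: int,
                       bytes_per_sector: int = 512) -> int:
    """Drive capacity in bytes for a fixed CHS geometry."""
    return chs_total_sectors(cylinders, heads, sectors_per_track) * bytes_per_sector


# Assumed geometry for illustration only.
print(chs_total_sectors(1024, 16, 63))   # 1032192 sectors
print(chs_capacity_bytes(1024, 16, 63))  # 528482304 bytes
```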
Since the introduction of zoned bit recording (as mentioned above, this is a drive geometry in which the number of sectors per track is smaller at the centre of the disk) this calculation can no longer be used. The way in which sectors are addressed has also become more abstract, relieving the operating system software of the need to know about physical drive geometry. Note that sectors that are logically sequential are not necessarily physically contiguous.
After reading a sector, there may be a small delay before the drive controller is ready to read another sector. Sectors that are logically sequential may therefore be spaced at discrete intervals on the disk to give the drive controller time to get ready to read the next sector - a technique known as interleaving. If an interleave factor of 3:1 were used for example, it would take three full rotations for the controller to read all of the sectors on a single track. Thanks to advances in technology, most modern hard drives do not need to use interleaving.
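To make the idea of interleaving concrete, the sketch below lays out the logical sectors of a single track for a given interleave factor. The track size of nine sectors is a hypothetical value chosen to keep the output short, and the placement rule (step forward by the interleave factor, skipping any slot already taken) is one simple way of doing it rather than a description of any particular controller.

```python
def interleave_layout(sectors_per_track: int, factor: int) -> list:
    """Return the logical sector number stored in each physical slot of a track."""
    layout = [None] * sectors_per_track
    position = 0
    for logical in range(sectors_per_track):
        # If the slot is already occupied, move on to the next free one.
        while layout[position] is not None:
            position = (position + 1) % sectors_per_track
        layout[position] = logical
        # Leave (factor - 1) slots between logically consecutive sectors.
        position = (position + factor) % sectors_per_track
    return layout


print(interleave_layout(9, 1))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(interleave_layout(9, 3))  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

With the 3:1 layout, logical sectors 0, 1 and 2 are read on the first rotation, 3, 4 and 5 on the second, and 6, 7 and 8 on the third, which is why three full rotations are needed to read the whole track.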
Modern hard drives use logical block addressing (LBA), a simple linear addressing scheme in which each sector is given an integer index number, starting with 0. The drive controller translates each logical block address into a cylinder, head and sector number in order to obtain the physical location of the sector on disk. The maximum number of sectors that can be addressed is dependent on the number of bits used for the logical block address - this is currently 48 bits, giving a maximum capacity of 128 pebibytes (a pebibyte is 1,024 tebibytes) based on a sector size of 512 bytes.
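The addressing limit follows from the number of address bits and the sector size. The sketch below checks that figure and also shows the classic conversion from a logical block address to a cylinder, head and sector number for a drive with a fixed geometry. The conversion is a simplification: a real drive that uses zoned recording keeps its own internal mapping, and the geometry values in the example are assumptions.

```python
def lba_max_capacity_bytes(address_bits: int, bytes_per_sector: int = 512) -> int:
    """Largest capacity addressable with the given number of LBA bits."""
    return (2 ** address_bits) * bytes_per_sector


def lba_to_chs(lba: int, heads: int, sectors_per_track: int):
    """Classic fixed-geometry translation; CHS sector numbers start at 1."""
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


print(lba_max_capacity_bytes(48) // 2 ** 50)  # 128 (pebibytes)
print(lba_to_chs(0, 16, 63))                  # (0, 0, 1)
print(lba_to_chs(1008, 16, 63))               # (1, 0, 1)
```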
Note that hard drive manufacturers tend to quote disk capacities using denary units (for example, one megabyte = 1,000 kilobytes) as opposed to the binary units used by many operating systems (where one megabyte = 1,024 kilobytes), so the capacity reported by the operating system is typically only 90-95% of the capacity claimed by the manufacturer.
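The size of the discrepancy can be checked with a short calculation. This sketch converts an advertised (denary) capacity into the binary units an operating system might report. The drive sizes are examples only and the function name is illustrative.

```python
def as_binary_units(advertised_bytes: int, binary_exponent: int) -> float:
    """Express a byte count in units of 2**binary_exponent bytes."""
    return advertised_bytes / 2 ** binary_exponent


# A drive sold as "500 GB" (500 x 10**9 bytes), in units of 2**30 bytes.
print(f"{as_binary_units(500 * 10**9, 30):.1f}")  # 465.7

# A drive sold as "3 TB" (3 x 10**12 bytes), in units of 2**40 bytes.
print(f"{as_binary_units(3 * 10**12, 40):.2f}")   # 2.73

# The ratio shrinks as the prefix grows: about 93% for giga, about 91% for tera.
print(f"{10**9 / 2**30:.3f} {10**12 / 2**40:.3f}")
```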
Like other electro-mechanical devices, disk drives eventually fail. Sometimes the failure is signalled by degraded performance, which may be accompanied by the drive becoming noisy or producing strange sounds. In other cases, the first indication is that the drive completely refuses to "spin up". It is therefore a sensible precaution to ensure that all data is regularly backed up. Less serious errors involve so-called "bad sectors" in which a very small area on the surface of the disk has become unusable.
Modern disk drives can detect these damaged areas and transparently re-map the logical sectors that occupy them to an unused area of the disk (most modern disk drives have unused areas that have been set aside for this purpose). The inclusion of error correction codes when writing data to disk requires that some additional data must be stored, but any data stored in a sector that is subsequently found to be bad can (usually) be recovered intact as a result.
Many current drives employ Self-Monitoring, Analysis and Reporting Technology (SMART) that keeps track of the occurrence of remapping due to bad sectors in an attempt to predict hard drive failure.
The performance of a hard disk drive (as opposed to its capacity) is essentially a measure of how quickly data can be transferred from the disk to memory during a read operation, or from memory to disk during a write operation. By far the biggest delay factors in either type of operation are created by the electro-mechanical aspects of the device. The rotational speed of the platters determines how quickly the heads can begin to read or write data once they are over the correct track. The average time taken for the disk to rotate to the required point is called the rotational latency, and will be dependent on the speed at which the disk rotates.
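If the required sector is equally likely to be anywhere on the track when the head arrives (an assumption, though a common one), the average rotational latency is the time taken for half a revolution. A minimal sketch, using an illustrative function name:

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


for rpm in (5400, 7200, 15000):
    print(rpm, f"{avg_rotational_latency_ms(rpm):.2f} ms")
# 5400 rpm -> 5.56 ms, 7200 rpm -> 4.17 ms, 15000 rpm -> 2.00 ms
```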
Most hard drives currently installed in desktop computers run at 5400 or 7200 rpm, although speeds of up to 15000 rpm have been achieved in some high performance drives. There is frequently a trade-off between speed and capacity however, since one way to increase the rotational speed of the platter is to decrease its diameter (this reduces the drag on the platter due to friction between it and the air passing over it, which in turn results in less heat being generated by the drive).
There will also be a delay (called the seek time) while the read/write heads are moved into position over the track to be read from or written to. The total time that elapses between a request for data being received and the data being available from the drive is known as the access time, and includes the seek time and the rotational latency (both measured in milliseconds), and any time required for processing by the disk controller.
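The components of the access time simply add together. In the sketch below the seek time (9 ms) and controller overhead (0.1 ms) are hypothetical figures chosen for illustration, and the rotational latency is taken to be the time for half a revolution.

```python
def access_time_ms(seek_ms: float, rpm: float, controller_overhead_ms: float) -> float:
    """Access time = seek time + average rotational latency + controller overhead."""
    rotational_latency_ms = 60_000 / rpm / 2  # half a revolution on average
    return seek_ms + rotational_latency_ms + controller_overhead_ms


# Hypothetical 7200 rpm drive with a 9 ms average seek time: roughly 13.3 ms.
print(f"{access_time_ms(9.0, 7200, 0.1):.1f} ms")
```

Both of the larger terms are mechanical, which is why access times have improved far more slowly than capacity.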
The overall speed at which data can be transferred to or from the drive is called the disk transfer rate. It is measured in tens of megabytes per second and will vary, depending on where on the disk the data is read from or written to (disk transfer rates will be faster for tracks near the outer diameter of the disk than for those near the centre of the disk).
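A rough upper bound on the disk transfer rate for a given track is the amount of data on that track multiplied by the number of revolutions per second. The sectors-per-track figures below are hypothetical values for an outer and an inner zone, and the calculation ignores head switches, seeks and per-sector overhead.

```python
def track_transfer_rate(sectors_per_track: int, rpm: float,
                        bytes_per_sector: int = 512) -> float:
    """Bytes per second read from one track at a constant rotation speed."""
    revolutions_per_second = rpm / 60
    return sectors_per_track * bytes_per_sector * revolutions_per_second


outer = track_transfer_rate(1000, 7200)  # hypothetical outer-zone track
inner = track_transfer_rate(500, 7200)   # hypothetical inner-zone track
print(f"outer: {outer / 1e6:.2f} MB/s, inner: {inner / 1e6:.2f} MB/s")
# outer: 61.44 MB/s, inner: 30.72 MB/s
```

Because the rotation speed is constant, a track holding twice as many sectors delivers twice the data per revolution.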
One more metric to consider is the data transfer rate, which is the speed at which data can be transferred across the electronic interface between the drive controller and the computer's I/O controller hub. This will vary from one type of interface to the next but is currently measured in hundreds of megabytes per second.
In order to increase disk transfer rates without further increases in either rotation speed (which would be accompanied by increased heat and vibration) or the surface area of the platters (requiring correspondingly larger drives), the way forward would appear to be to increase the bit density of the data on the surface of the disk. Small improvements in seek times (currently just a few milliseconds) will also make a minor contribution. Currently, however, the capacity of disk drives is increasing at a faster rate than their performance. One consequence of this is that as capacity increases, so does the time required for backing up the contents of the drive.
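The effect on backup times is easy to estimate: the minimum time needed to read an entire drive is its capacity divided by its sustained transfer rate. The sketch below assumes a hypothetical sustained rate of 100 MB/s for both drives, which is not a figure taken from this article.

```python
def full_read_hours(capacity_bytes: int, bytes_per_second: int) -> float:
    """Minimum time, in hours, to read every byte on the drive once."""
    return capacity_bytes / bytes_per_second / 3600


RATE = 100 * 10**6  # assumed sustained transfer rate: 100 MB/s

print(f"500 GB: {full_read_hours(500 * 10**9, RATE):.1f} h")  # about 1.4 h
print(f"3 TB:   {full_read_hours(3 * 10**12, RATE):.1f} h")   # about 8.3 h
```

At the same transfer rate, a sixfold increase in capacity means a sixfold increase in the time needed for a full backup.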
Historically, the shape and size of hard drives has been tailored to take advantage of the floppy disk drive bays built in to most computers from the IBM 5150 onwards. Today, floppy disk drives have virtually disappeared but their influence on hard disk drive form factor nomenclature endures. Common form factors are generally described in terms of their floppy disk drive counterparts. Currently, for example, most desktop personal computers have a hard disk drive with a "3.5 inch" form factor, even though the actual size of the platters is closer to 3.74 inches.
A number of smaller form factors have emerged for disk drives that are designed for use with laptop computers, netbooks and a host of other mobile computing applications. The most widely used of these is the "2.5 inch" form factor, which has platters of 2.5 inches (or less) in diameter. As you would expect, the larger form factor drives tend to have the larger capacity.
A 3.5 inch Western Digital 3 TB SATA disk drive
A 2.5 inch Western Digital 500 GB SATA disk drive
The drive interface
The drive interface used defines the characteristics of the electronic interface between the disk drive and the computer. The type of interface used will to a great extent depend on the purpose for which the computer is to be used, and the type of interface(s) supported by the system motherboard. A number of different interfaces have been developed over the years, some of which are described below.
Advanced Technology Attachment (ATA)
Introduced in 1986, ATA has in the past been somewhat incorrectly referred to as Integrated Drive Electronics (IDE) and has been retrospectively renamed to Parallel ATA (PATA) to distinguish it from the more recent Serial ATA (SATA) interface. This interface was the one most widely used in desktop computers up until SATA appeared in 2003.
The use of the popular IDE misnomer comes from the fact that this interface was the first in widespread use to have the drive controller built into the drive itself. Previously, the drive controller was a separate add-on card that occupied one of the ISA slots on the computer's motherboard. The drive was connected to the motherboard using a 40 or 80-conductor ribbon cable that connected a 40-pin socket on the drive itself to a similar socket on the motherboard (see below) and transferred sixteen bits of data in parallel. Each ribbon cable could connect two ATA drives in a master-slave configuration.
Enhanced IDE, introduced by Western Digital in 1994 in anticipation of changes to the ATA standard (embodied in the ATA-2 specification introduced in 1996), allowed the use of direct memory access (DMA) which meant that data could be transferred directly between the disk and memory without involving the CPU in the data transfer process. This freed up the CPU for other tasks.
Rear view of an IDE/ATA hard disk drive
An IDE/ATA socket on a motherboard
A typical IDE/ATA ribbon cable
Small Computer System Interface (SCSI)
SCSI disk and tape drives were standard fare on servers and high-performance workstations from the early 1980s (the SCSI interface was standardised in 1986) until around the mid-1990s, and despite advances in ATA technology can still be found in many high-performance server applications. SCSI can be used to connect a wide range of devices, and the SCSI standard defines command sets for many specific types of peripheral device.
The SCSI interface allows a maximum of either 8 or 16 peripheral devices to connect to the host computer via a shared parallel bus. Servers typically employ RAID drives in which multiple disks are connected to a SCSI RAID controller card via a SCSI backplane inside a disk enclosure. The connection between the backplane and the controller card will typically be a 68 or 80-conductor single drop ribbon cable. Multiple non-RAID devices could also be connected to a SCSI controller card using multi-drop cables.
SCSI drives have not been widely adopted for personal computers due to their cost, and the availability of relatively inexpensive ATA drives that provide perfectly adequate performance for most desktop computing environments. SCSI controller cards are nonetheless still available for personal computers, and can be mounted in a standard PCI-X or PCI-E expansion slot. Parallel SCSI has largely been superseded in server and mass storage applications by Fibre Channel (FC) or Serially Attached SCSI (SAS), both of which use a high-speed serial interface.
An IBM 4.5GB 68 Pin Ultra 160 SCSI hard disk drive
A SCSI PCI-X storage controller card
A single-drop 68-conductor SCSI ribbon cable
Serial Advanced Technology Attachment (SATA)
Introduced in 2003, SATA is the successor to Parallel ATA (PATA). One of the most obvious differences is the use of a high-speed serial signal cable instead of the parallel ribbon cable used for ATA drives. The cable itself can be up to a metre in length. It has two pairs of wires for carrying data and 3 ground wires, giving a total of seven wires. The cable is cheaper and less bulky than its PATA counterpart, allowing a better flow of air within the system case and making it easier to install.
A SATA signal cable connects a single drive to a SATA socket on the motherboard - there is no master/slave arrangement. SATA drives use a 15-pin power connector rather than the 4-pin Molex power connectors used for PATA drives, although adapters are available to enable a SATA drive to be connected to a power supply via a 4-pin Molex power cable should the need arise.
The first version of the SATA standard is officially designated as Serial ATA International Organization: Serial ATA Revision 1.0 (the technology itself should be referred to as SATA 1.5 Gb/s) and specifies a gross transfer rate of 1.5 gigabits per second. Taking encoding into account, this equates to 1.2 gigabits (150 megabytes) of data per second.
Subsequent revisions have doubled and redoubled the transfer rates. Revision 2.0 (SATA 3.0 Gb/s) is capable of a gross transfer rate of 3.0 gigabits per second, and Revision 3.0 (SATA 6.0 Gb/s) has a gross transfer rate of 6.0 gigabits per second. As of 2010, most installed hard drives and PC chipsets implement SATA 3.0 Gb/s, although SATA 6.0 Gb/s products are now becoming available (the Revision 3.0 standard was released in May 2009).
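The gap between the gross rate and the usable data rate comes from the line encoding. These SATA revisions use 8b/10b encoding, in which every 8 bits of data are sent as 10 bits on the wire (this detail is not stated in the text above and is added here as the basis for the calculation). A minimal sketch, with an illustrative function name:

```python
def sata_payload_mb_per_s(line_rate_mbit_per_s: int) -> float:
    """Usable data rate in megabytes per second after 8b/10b encoding."""
    data_mbit_per_s = line_rate_mbit_per_s * 8 / 10  # 8 data bits per 10 line bits
    return data_mbit_per_s / 8                       # bits -> bytes


for name, line_rate in (("SATA 1.5 Gb/s", 1500),
                        ("SATA 3.0 Gb/s", 3000),
                        ("SATA 6.0 Gb/s", 6000)):
    print(name, sata_payload_mb_per_s(line_rate), "MB/s")
# 150.0, 300.0 and 600.0 MB/s respectively
```

These are interface limits. The rate at which a single hard disk drive can actually supply data from its platters is considerably lower.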
Most motherboards produced since 2003 have integrated SATA controllers (although an add-on controller card can be installed in a PCI or PCI-E slot). The SATA controller can use the Advanced Host Controller Interface (AHCI) in order to take advantage of advanced features such as the hot-swapping of drives, providing both the motherboard and operating system support AHCI. If not, SATA controllers are capable of operating in "IDE emulation" mode.
A Seagate 1.5 TB SATA hard disk drive
Close-up of SATA slots on a motherboard
A SATA signal cable
External hard drives
External hard disk drives are generally standard ATA, SCSI or SATA hard disk drives mounted in a suitable portable disk enclosure. The drive can be connected to a computer via a USB or Firewire port, or in the case of SATA drives via an eSATA (external SATA) or eSATAp (power over eSATA) interface. If an eSATA or eSATAp port is not available on the system, one can usually be added using a PCI add-on card.
The use of an eSATA interface has the advantage that data transfer rates are generally faster than for contemporary versions of either USB or Firewire. Having said that, a future iteration of Firewire is predicted to be able to achieve a data transfer rate of 6.4 Gb/s, which will be slightly faster than the SATA 6.0 Gb/s version of eSATA, while USB 3.0 will not be far behind with a data transfer rate of 4.8 Gb/s. Unlike USB or Firewire however, eSATA allows low-level drive features such as SMART to be available to the host computer.
Unlike Firewire, neither USB 2.0 nor eSATA is capable of providing the 12V power supply required by some 3.5" external hard disk drives (such as the 1TB Seagate external drive pictured below), which means they need a separate power supply. The introduction of eSATAp is intended to resolve this issue, while USB 3.0 will reportedly be able to provide voltages of 5V, 12V or 24V. At the time of writing, the storage capacity of a typical external hard drive can range from a few hundred gigabytes up to 4 terabytes.
A Seagate 1TB external hard drive