All mass storage devices, whether magnetic hard disks, floppies, or optical drives, face two primary performance constraints: access speed and transfer rate. Access speed is the inevitable delay between the instant your computer requests a particular byte or block of information from the disk drive and the instant that information is located on the disk. In specifications, access speed is represented by a number termed average access time, which describes the mean time (in milliseconds) required for the read/write head of a drive to move between disk tracks. Transfer rate describes the speed at which the information stored on the disk can be moved into the working memory of your PC. It is usually measured in megabytes per second.
Access speed and transfer rate are issues of design, but their ultimate limits arise from mechanical issues. Because of the mechanical nature of these speed limits, any miraculous improvement in disk performance is impossible. The laws of motion, inertia among them, preclude instantaneous acceleration, so access delays can never be eliminated from mechanical systems. More to the point, practical mechanisms can never lower the delays to the point that some people (likely including yourself) won't be bothered by them. Similarly, the rotation rates of disks are limited by such issues as mechanical integrity (spin a disk too fast and centrifugal force will tear it apart), and the packing of data is constrained by the capability of the read/write technology to resolve individual bits on the disk surface.
The big problem with these mechanical limits is that they are orders of magnitude lower than the electronic and logic limits of computers. A computer thinks in nanoseconds and microseconds but has to wait milliseconds when it needs data from a disk. It may need to wait even longer when it needs to transfer a large block of information from a mass storage device.
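To put rough numbers on that gap, here is a minimal Python sketch of the arithmetic. Every figure in it (a 10 ms average access time, a 20 MB per second transfer rate, a 100 ns memory access) is an assumed value chosen only for illustration, not a specification of any real drive, and the function names are hypothetical.

```python
# All figures below are illustrative assumptions, not measurements of a real drive.
ACCESS_TIME_MS = 10.0        # assumed average access time, in milliseconds
TRANSFER_RATE_MB_S = 20.0    # assumed transfer rate, in megabytes per second
MEMORY_ACCESS_NS = 100.0     # assumed time for one memory access, in nanoseconds


def disk_read_time_ms(size_mb, access_ms=ACCESS_TIME_MS, rate_mb_s=TRANSFER_RATE_MB_S):
    """Time to locate a block on disk and then move size_mb megabytes into memory."""
    transfer_ms = size_mb / rate_mb_s * 1000.0
    return access_ms + transfer_ms


def memory_accesses_per_disk_access(access_ms=ACCESS_TIME_MS, mem_ns=MEMORY_ACCESS_NS):
    """How many memory accesses fit into the delay of a single disk access."""
    return access_ms * 1_000_000.0 / mem_ns
```

Under these assumptions, a single disk access costs as much time as about 100,000 memory accesses, and reading a 1MB block takes about 60 ms, most of it spent on the transfer itself.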
The best way to hurdle these mechanical barriers is with caching. A read cache attempts to keep the next data you'll need from disk in memory so that you can access it at electronic rather than mechanical speeds. A write cache grabs data you want to write to disk at electronic speed and slowly, at mechanical speed, copies it onto the disk. All modern operating systems include caches of some kind, either integrated into the file system or as add-on programs.
Caches are generally classified into two types, software and hardware, depending on the type of memory used to build the cache and where the cache is located in the system. Software caches use part of the main memory of your PC. Hardware caches use their own, dedicated memory supply. Most modern hard disks have built-in buffers that range in size from one track to a megabyte. High-performance disk host adapters often incorporate hardware caches that accommodate 16MB or more.
Software caches hold a performance advantage in that they work after the disk interface and expansion bus, so they can operate faster than whatever performance limit is imposed by the interface and bus. Information from the hardware cache, even if instantly available, still may be throttled as it courses through the interface and expansion bus. As a practical matter, however, that difference amounts to little more than an interesting fact. Today's fast interfaces and buses make the connections with little constraint. Moreover, the two caching technologies benefit different applications, making the issue largely irrelevant.
Cache Operation
Disk caching systems differ in three principal ways: how they handle read operations, how (and whether) they cache disk-write operations, and their usage and management of system memory.
Read Buffering
The fundamental function of a disk cache is buffering read operations.
The cache software fills its memory with the data it anticipates your system and software will need and supplies that information from its buffers on request at RAM speed when there is a cache hit. If there is a miss, the cache directs the software to retrieve data from disk at disk speed. Several design issues control how efficiently the read cache operates, and thus how often it speeds up disk operations. These issues include how the memory of the cache is filled with data, how the contents of memory are updated, and how the cache recognizes whether needed information is contained in its memory.
The first issue appears simple. After all, before you can expect to read anything in a cache, there must be something there. But filling a cache is a task akin to reading omens; the cache-control software must make a stab at predicting your system's needs. Most caches take a straightforward approach, reading somewhat more of a given disk track or (less commonly) file than your application software requests. The underlying assumption is that you'll need more of what you're already looking at.
After a few read requests, the cache memory will fill up, and the control software is faced with the problem of what to save and what to discard to make room for new data. Several different algorithms are used by the writers of software caches. Most are variations of Least Frequently Used (LFU) and Least Recently Used (LRU) designs, in that order of popularity. The former discards the data in the cache that your system has asked for least often. The latter throws away the data that was requested the longest time previously. The LFU technique is more complex and demands more from system resources, so LRU often performs better. Even so, in typical systems, the performance gains from the two techniques fall within about 8 percent of one another, close enough to be a draw.
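The two discard policies can be sketched in a few lines of Python. This is a minimal illustration rather than the algorithm of any commercial cache: the class names, the read method, and the read_from_disk callback are all hypothetical, the cache works in whole blocks, capacity is assumed to be at least 1, and read-ahead is left out to keep the eviction logic visible.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Discards the block that was requested the longest time ago."""

    def __init__(self, capacity):
        self.capacity = capacity          # assumed to be at least 1
        self.blocks = OrderedDict()       # block number -> data, oldest first

    def read(self, block, read_from_disk):
        if block in self.blocks:
            self.blocks.move_to_end(block)       # mark as most recently used
            return self.blocks[block], True      # cache hit
        data = read_from_disk(block)             # cache miss: go to disk
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)      # evict least recently used
        self.blocks[block] = data
        return data, False


class LFUCache:
    """Discards the block that has been requested least often."""

    def __init__(self, capacity):
        self.capacity = capacity          # assumed to be at least 1
        self.blocks = {}                  # block number -> data
        self.counts = Counter()           # block number -> request count

    def read(self, block, read_from_disk):
        if block in self.blocks:
            self.counts[block] += 1
            return self.blocks[block], True      # cache hit
        data = read_from_disk(block)             # cache miss: go to disk
        if len(self.blocks) >= self.capacity:
            # Evict the least frequently used block (ties go to the oldest entry).
            victim = min(self.blocks, key=lambda b: self.counts[b])
            del self.blocks[victim]
            del self.counts[victim]
        self.blocks[block] = data
        self.counts[block] = 1
        return data, False
```

With a two-block cache and the request sequence 1, 1, 1, 2, 3, the LRU version discards block 1 when block 3 arrives, while the LFU version keeps the often-requested block 1 and discards block 2 instead. The extra per-block counter and the search for the minimum hint at why LFU demands more from system resources.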
Keeping track of the data in the cache is a matter of minimizing the time required to determine whether needed data is held within the cache. In general, cache programs assign tags to associate data in memory with data on the disk. The exact handling of this information, like the other details of caching algorithms used by specific products, is a matter of great secrecy among most commercial cache publishers. For the most part, publishers treat their technology as if it were black magic, only more mysterious.
Despite such proprietary miracles, most cache designers admit that the read performance among disk caching systems varies little. In cases where you can substitute or add other caching systems, you won't see a significant change in writing speed (although moving to another technology, the hardware cache discussed later in this chapter, is another matter). This inescapable fact, coupled with the caches built into modern operating systems, has virtually eliminated the market for third-party add-in caching software.
Write Buffering
Write caching is a complex issue. Many cache designers avoid it because it is inherently dangerous. Data you expect to be written to disk is temporarily caught in the limbo of solid-state memory. If you switch off your PC before the data gets written, it may be lost even though you thought it had been saved.
Early, aggressive add-in caching software was often designed to use a technique called delayed writing. The caching software held disk-bound data in memory if your system was busy, particularly if it was writing to disk. When it sensed a break (for example, while you sat staring at the screen wondering what to do next), it immediately rushed everything to disk. The longer it waited, the longer your disk-bound data would be vulnerable to your whims and power failures. Most such add-in caches limited the maximum period to delay before writing; many let you decide how long to wait.
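The delayed-writing idea, including the cap on how long data may wait, can be sketched as follows. This is an illustrative Python model, not the design of any actual add-in cache: the class name, the poll method, the system_idle flag, and the write_to_disk callback are hypothetical, and the clock is passed in as a parameter only so the behavior can be checked without real waiting.

```python
import time


class DelayedWriteCache:
    """Holds disk-bound data in memory until the system is idle or a deadline passes."""

    def __init__(self, write_to_disk, max_delay_s=5.0, clock=time.monotonic):
        self.write_to_disk = write_to_disk   # callback: (block, data) -> None
        self.max_delay_s = max_delay_s       # longest time data may stay unwritten
        self.clock = clock
        self.pending = {}                    # block number -> data not yet on disk
        self.oldest = None                   # arrival time of the oldest pending data

    def write(self, block, data):
        """Accept data at memory speed; it is vulnerable until it is flushed."""
        if not self.pending:
            self.oldest = self.clock()
        self.pending[block] = data
        self.poll(system_idle=False)         # still honor the deadline while busy

    def poll(self, system_idle):
        """Flush if the system has gone idle or the maximum delay has expired."""
        if not self.pending:
            return False
        overdue = self.clock() - self.oldest >= self.max_delay_s
        if system_idle or overdue:
            self.flush()
            return True
        return False

    def flush(self):
        """Copy everything still held in memory to disk."""
        for block, data in self.pending.items():
            self.write_to_disk(block, data)
        self.pending.clear()
        self.oldest = None
```

Anything still sitting in pending when the power goes off is lost, which is exactly the window of vulnerability that max_delay_s is meant to bound.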
An alternative to delayed writing is concurrent writing. Instead of waiting for a break, the cache writes your data concurrently with whatever else you do next, making it appear that your write operation executed instantly. The technique is simple in concept: the cache accepts data from your application and immediately begins writing it to disk. Instead of holding back your system while the data is doled out through the drive interface and drive mechanism, you never lose control of your software. The actual writing to disk continues for some time afterward, concurrent with the normal operation of your PC. Most modern operating systems use this approach to write caching, making the write operation a separate program thread. Although this technique might not be considered a true write cache (no special memory is reserved for the cache and no special algorithms are required), it achieves the same end.
Such concurrent writing technology is not risk-free. Switching off your PC before the disk write completes still results in lost data. However, there will be no unpredictable period of vulnerability as is the case with delayed writing systems. As soon as the drive activity indicator extinguishes, you're safe to switch off or reboot your PC. If you follow the normal orderly shutdown procedure required by your operating system (for example, clicking on Shut Down before switching off your PC when you're using Windows), you won't have anything to worry about, at least in regard to disk writing.
Read caches and delayed-writing caches all require memory of some kind to temporarily hold your disk data before it is used or written. Read caches in particular can be heavy users of memory because of their speculative nature: they need to hold as much data as they can to better the chances that what you want will be in the cache. In general, the more memory devoted to the cache, the faster it will appear to be.
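The link between cache size and apparent speed can be illustrated with a small simulation. The Python sketch below is built entirely on assumptions: it uses an LRU discard policy, and its synthetic request trace (500 blocks, with a few requested far more often than the rest) is invented for the example and is not measured from a real system. The function names are hypothetical.

```python
import random
from collections import OrderedDict


def hit_rate(trace, capacity):
    """Fraction of requests served from an LRU cache holding `capacity` blocks."""
    cache = OrderedDict()                 # block number -> True, oldest first
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)      # mark as most recently used
        else:
            if len(cache) >= capacity:    # capacity is assumed to be at least 1
                cache.popitem(last=False) # evict least recently used
            cache[block] = True
    return hits / len(trace)


def make_trace(length=10_000, blocks=500, seed=1):
    """Synthetic, repeatable request trace in which low block numbers dominate."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) for rank in range(blocks)]
    return rng.choices(range(blocks), weights=weights, k=length)


if __name__ == "__main__":
    trace = make_trace()
    for capacity in (1, 8, 64, 512):
        print(capacity, round(hit_rate(trace, capacity), 3))
```

For a given trace, an LRU cache never scores fewer hits when it is given more blocks, so the printed hit rates can only rise or stay level as the capacity grows. How quickly they rise depends entirely on the access pattern, which is why the figures from this made-up trace should not be read as typical.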
Software Caches
This memory has to come from somewhere. If you rely on the cache built into your software (or an add-in software-based cache program), whatever is put to work as a buffer for the cache is stolen from your programs. A bigger cache buffer means less memory for applications. With modern operating systems that use virtual memory, a big cache means the operating system ends up shuffling more data into virtual memory (which uses disk space to emulate solid-state memory). As a result, the larger cache can slow down your system rather than speed it up.
The caching software integrated into modern operating systems allocates its memory to avoid such problems. It seeks to make the best compromise between the cache and application memory. Most operating systems will, however, let you tune the cache somewhat.
Hardware Caches
Hardware caches don't take memory away from your programs because they use their own, dedicated memory. Although a lot has been written about the advantage of hardware versus software caching, the issues all boil down to one: cost.
To add hardware caching to a PC means buying a caching disk host adapter and filling it with memory. During the DRAM shortage a couple of years ago, the cost of memory imposed a severe penalty. Today the price of RAM has fallen to such low levels that 16MB may be affordable. The caching host adapter may, however, cost $200 or more. Considering that you get a software cache free, built into your operating system, you should expect to get a big benefit from your hardware caching investment. Whether you do depends on what your PC does.
A large hardware cache definitely improves the response of a disk server that's shared among many users. Because the data requests from different users likely involve different data but may be interwoven in time, the small buffers built into most of today's hard disks don't offer much help. Each time the disk serves a different user, it will dump much of the contents of its buffer, defeating its purpose.
A large hardware cache can accommodate additional disk data from multiple users and deliver it at memory speed. Not only the user getting data from the cache but also the one next in line (and all those queued up for disk access) get a faster response.
In a single-user system, dedicated hardware caching is a more dubious choice. It will speed system operation, particularly when the operating system calls upon virtual memory. When the operating system needs to shift between applications and call on virtual memory, what it wants may be in the hardware cache. However, if the memory that's devoted to the hardware cache is instead used to augment the main memory of the PC, the operating system will have less need to use virtual memory. The system will then probably perform better than it would with the hardware cache.