Cutting the Slack
(or maybe the Fat!)
Drive Partition Efficiency: Controlling Slack
Wasted space, optimization, fragmented, defragmentation and slack: terms that not too long ago had little or no meaningful relationship to computers, let alone hard drives, or any type of drive for that matter. File fragmentation, one of the vagaries of the FAT file system, has garnered considerably more attention in recent years with the advent of hard drives larger than 1 GB. When hard drives were in the range of 40 MB to 1 GB, no one had a clue that their drives had wasted space. As a matter of fact, most were enthralled at the release of a 1 GB hard drive. It wasn’t until the mid-1990s, when hard drives began climbing above 2 GB, that we realized something was wrong, yet we had no idea just what it was. Hard drives were not being divided into multiple partitions at that time, and after a period of continued use, users began noticing that large amounts of disk space seemed to literally disappear. On smaller drives, those below 1 GB, this was barely noticeable, but on larger drives approaching 2 GB, it amounted to hundreds of megabytes.
As file fragmentation and cluster allocation were more fully understood, and defragmentation tools began to develop, users finally understood the “why” behind the disappearing disk space. Shortly after the release of the FAT32 file system, the issue became less critical. Now, however, as hard drive sizes climb above the 80 GB range, we begin revisiting the old terms such as wasted space, optimization, fragmented and defragmentation, and add a new one, “slack”. As long as hard drives remained below the 30 GB range, FAT32 tended to keep everything cleaned up, but once we passed the 40 GB mark, even FAT32 has its problems controlling slack.
Obviously this missing drive space isn’t really gone, unless we’re talking about damaged or lost clusters, in which case the missing space is an unusable portion of the hard drive that needs to be recovered with the appropriate disk utility. However, we’re not discussing that aspect of missing drive space; we’re discussing slack. Slack is simply space wasted as a result of the cluster system that the FAT file system uses. A cluster is the minimum amount of space that can be assigned to a file. Under the FAT file system, no file can use just part of a cluster while a separate file uses the remainder. Essentially, when a file is written to disk, even if it holds merely a single byte of data, the space assigned to it is rounded up to a whole multiple of the cluster size. If you add to that file, the cluster gradually fills until the file reaches the cluster’s capacity. As soon as the file becomes larger than the capacity of that single cluster, even by a single byte, the additional byte is allocated to another cluster, and the file’s space usage doubles, even though the file only increased in size by one byte.
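The rounding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular file system driver is written; the function names `allocated_size` and `slack` are our own, and we assume that an empty file occupies no data clusters.

```python
def allocated_size(file_size: int, cluster_size: int) -> int:
    """Bytes reserved on disk: file_size rounded up to whole clusters."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no data clusters
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack(file_size: int, cluster_size: int) -> int:
    """Unused bytes at the end of the file's last cluster."""
    return allocated_size(file_size, cluster_size) - file_size


CLUSTER = 32 * 1024  # a 32 KiB cluster, in bytes

# One byte, exactly one cluster, and one byte more than one cluster.
for size in (1, CLUSTER, CLUSTER + 1):
    print(size, allocated_size(size, CLUSTER), slack(size, CLUSTER))
```

The last case is the one to watch: a file of 32,769 bytes is charged for two full clusters, so a single extra byte costs another 32,768 bytes on disk.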
Given that files are allocated entire clusters regardless of file size, the larger drives grow, and cluster sizes along with them, the more space will be wasted. As an example, if you had 200 files, each holding a single byte of data, every one of them would still occupy a full cluster, and nearly all of that allocated space would be slack. In essence, for small files like these, doubling the cluster size of the disk doubles the amount of disk space that is wasted. The space left at the end of the last cluster allocated to a file is commonly called slack.
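To see the doubling effect with the 200 one-byte files from the example, here is a small sketch that totals the slack at several cluster sizes. The helper name `total_slack` and the list of cluster sizes are our own choices for illustration.

```python
def total_slack(file_sizes, cluster_size):
    """Sum of the unused bytes in the last cluster of every file."""
    total = 0
    for size in file_sizes:
        remainder = size % cluster_size
        if remainder:  # a file that ends exactly on a cluster boundary has no slack
            total += cluster_size - remainder
    return total


files = [1] * 200  # 200 files of one byte each

for kib in (4, 8, 16, 32):
    cluster = kib * 1024
    wasted = total_slack(files, cluster)
    print(f"{kib:>2} KiB clusters: {wasted:>9,} bytes of slack")
```

With 32 KiB clusters, those 200 bytes of real data tie up 200 full clusters, or 6,553,600 bytes of disk space, and each step down in cluster size cuts the waste roughly in half.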
Since every user’s situation is unique, most of the projections or examples of wasted space or slack that are provided on the Internet are presented in theoretical form. The reality, however, is far worse. No, no scare tactics here, just hard facts. If file sizes were truly random, meaning that you had as many large files as you did small ones, then the problem wouldn’t be as bad. However, the reality is that most files on a system are small, and if you doubt that, take a look at your cache directory. A hard disk that holds more small files will waste far more space. There are a number of utilities that you can use to analyze the amount of wasted space on your drives, and our favorite is PowerQuest’s PartitionMagic. (See it in action below.) Although we can only speak of the few hundred or so system disks that we have examined, in each of these cases we found that it wasn’t uncommon to find slack space equal to 20-30% of the total disk space on large drives, those at 1 GB and larger. On drives in the 4-8 GB range, the wasted space was as much as 40-42% of the total drive size.
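If you don’t have a commercial utility handy, a rough estimate is easy to script. The sketch below is our own and is not how PartitionMagic works internally: it walks a directory tree and works out how much slack the files would carry at a cluster size you supply. It looks only at file sizes, so it ignores directory entries, file system overhead and anything else the real file system does; the function name `estimate_slack` is hypothetical.

```python
import os


def estimate_slack(root, cluster_size=32 * 1024):
    """Walk `root` and estimate slack as if every file were stored on a
    volume with the given cluster size.
    Returns (file_count, data_bytes, slack_bytes)."""
    count = data = wasted = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # skip symbolic links
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            count += 1
            data += size
            remainder = size % cluster_size
            if remainder:
                wasted += cluster_size - remainder
    return count, data, wasted


if __name__ == "__main__":
    files, data, wasted = estimate_slack(".", cluster_size=32 * 1024)
    allocated = data + wasted
    share = 100 * wasted / allocated if allocated else 0.0
    print(f"{files} files, {data:,} bytes of data, "
          f"{wasted:,} bytes of slack ({share:.1f}% of allocated space)")
```

Point it at a browser cache or a folder full of small documents and compare the result at 4 KiB and 32 KiB clusters to get a feel for how your own mix of files behaves.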
Let’s put all of this into perspective. Consider a hard disk volume that is using 32 KiB (32,768 bytes) clusters, with 15,000 files on a single partition. Let’s presume for the moment that each of those 15,000 files ends in a cluster that is half empty, so each file carries 16 KiB (16,384 bytes) of slack. If you multiply the 15,000 files by 16,384 bytes, you have 245,760,000 bytes of wasted space, which is about 245.8 MB (234.4 MiB). Let’s take this a step further. If we were to make the further assumption that most of the files are smaller, and that the last cluster of each file is really only about 25% full, the slack per file rises to 24 KiB (24,576 bytes) and the total jumps to an amazing 368,640,000 bytes, or about 368.6 MB (351.6 MiB). Translating this to disk sizes, if this were a 1.2 GB disk using 32 KiB clusters, the slack space would be approximately 31% of the disk. If the disk were 2.1 GB in size, the slack space would be approximately 18%. Whether you use mebibytes (MiB) or megabytes (MB) to define the disk size, it’s still a lot of wasted space!
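For anyone who wants to check the arithmetic or plug in their own numbers, the calculation can be sketched as follows. The variable and function names are ours, and we assume decimal gigabytes (1 GB = 1,000,000,000 bytes) for the disk sizes; using binary gigabytes would lower the percentages somewhat.

```python
CLUSTER = 32 * 1024  # 32 KiB cluster, in bytes
FILES = 15_000


def report(label, slack_per_file):
    """Print the total slack in bytes, MB and MiB, and return it in bytes."""
    total = FILES * slack_per_file
    print(f"{label}: {total:,} bytes = "
          f"{total / 10**6:.1f} MB = {total / 2**20:.1f} MiB")
    return total


# Last cluster of every file half empty, then three-quarters empty.
half = report("half-empty last cluster", CLUSTER // 2)
three_quarters = report("three-quarters-empty last cluster", CLUSTER * 3 // 4)

for disk_gb in (1.2, 2.1):
    disk_bytes = disk_gb * 10**9  # assumption: decimal gigabytes
    print(f"{disk_gb} GB disk: {100 * three_quarters / disk_bytes:.1f}% slack")
```

Changing `FILES` or `CLUSTER` shows quickly how sensitive the total is to the number of small files on the volume.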
Obviously, you can draw the same conclusion that we do: the larger the cluster size you use, the more disk space you will waste due to slack. Hence, it is better to use smaller cluster sizes whenever possible. Unfortunately, doing this is not always easy. The number of clusters that you can use is limited by the design of the FAT file system, and there are performance issues to consider when using smaller cluster sizes. There are two principal methods to avoid some of these slack issues. One is to use FAT32 as opposed to FAT16, and the other is to use the NTFS file system. Both have their own caveats though. On very large hard drives with large partitions, even FAT32 uses extremely large cluster sizes. If you decide to use the NTFS file system, and you’re using Windows 95, Windows 98 or Windows ME, you’ll have to upgrade to Windows 2000 or Windows XP.
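The cluster-count limit is what forces large clusters onto large FAT16 partitions. The sketch below is a simplified illustration and rests on assumptions that are not stated in this article: that a FAT16 volume can address at most 65,524 data clusters, that cluster sizes are powers of two from 512 bytes up to 32 KiB, and that the space taken by the boot sector, the FAT copies and the root directory can be ignored. The function name `smallest_fat16_cluster` is hypothetical.

```python
# Assumptions (not from the article): FAT16 addresses at most 65,524 data
# clusters; cluster sizes are powers of two from 512 bytes to 32 KiB;
# file system overhead is ignored.
FAT16_MAX_CLUSTERS = 65_524
CLUSTER_SIZES = [512 * 2**n for n in range(7)]  # 512 B ... 32 KiB


def smallest_fat16_cluster(partition_bytes):
    """Return the smallest cluster size that keeps the cluster count within
    the assumed FAT16 limit, or None if the partition is too large."""
    for cluster in CLUSTER_SIZES:
        if partition_bytes / cluster <= FAT16_MAX_CLUSTERS:
            return cluster
    return None


for mib in (100, 250, 500, 1000, 2000):
    cluster = smallest_fat16_cluster(mib * 2**20)
    label = f"{cluster // 1024} KiB" if cluster else "too large for FAT16"
    print(f"{mib:>5} MiB partition -> {label}")
```

Under these assumptions, every doubling of the partition size eventually forces a doubling of the cluster size, which is why splitting a drive into smaller partitions was a common way to keep slack under control.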
Let’s take a look at PowerQuest’s PartitionMagic and you’ll see why it’s such a great tool.
Although this page was developed about the time PowerQuest was releasing version 6.0 of PartitionMagic, we decided to use version 5.0 as we were already familiar with it. We did learn though that little has changed between version 5.0 and 7.0, with version 7.0 available now for between $35 and $50 US. Above you will see our slack analysis of the C:\ partition on a test drive in one of our lab computers. As you can see, the tool allows you to easily analyze and change the cluster size of the partitions on your hard drive. In the example above, PartitionMagic indicates just how much slack there would be at various cluster sizes on a given disk volume. It should be obvious that this tool puts you in control by allowing you to decide what cluster size you would like to use. Equally important, it provides you with the minimum and maximum partition sizes allowed for the cluster size and file system in use.
Don’t become obsessive and compulsive about slack, as there will always be some wasted space on your hard drive regardless of the cluster size you choose. There is really no “set in stone” method to determine the perfect cluster size. Everyone has an opinion, but you must decide what is best for your specific circumstances. Some consider 4 KiB or 8 KiB clusters to be acceptable, some consider 16 KiB to be about average, and others consider the slack of 32 KiB cluster sizes to be wholly unacceptable. The bottom line though is that cluster size needs to be determined by your specific needs, which include the available partition (or drive) sizes. Given today’s huge drives, few people even give thought to slack or wasted space.
With the cost per megabyte of drive space at or below a penny, very few people attempt to control wasted disk space anymore, choosing instead to purchase bigger and bigger drives. Eventually, though, a toll will be paid, either in performance or in thousands of small files eating up disk space (or both). Now that you have a basic idea of what slack is about, you will be better equipped to handle the problem.