New disk, large clusters

zeroaccess

Active member
From 2009 onward I have only bought solid-state drives for my computers. Fast forward to today, and I've just bought my first mechanical hard disk in 12 years. Why?

Because as much as I love them, this drive is for detachable backup, and the value proposition for SSDs just isn't there for that use case.

This disk will exclusively host backup archives, currently .tib files from Acronis. Acronis writes a single full backup file of the machine (unless you tell it to split the archive), which is currently ~1.5 TB but set to grow. I then run weekly incremental backups nine times; the 10th backup is another full backup, which wipes the previous incremental files. At any rate, all of the files are very large.
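
(If you're curious what that cycle looks like, here's a minimal Python sketch of it; the path is made up, and in practice Acronis manages the rotation itself through its backup scheme settings.)

from pathlib import Path

CYCLE_LENGTH = 10  # 1 full backup + 9 weekly incrementals

def next_backup_type(archive_dir: Path) -> str:
    """Return 'full' or 'incremental' based on how many .tib files already exist."""
    count = len(list(archive_dir.glob("*.tib")))
    return "full" if count % CYCLE_LENGTH == 0 else "incremental"

def start_new_cycle(archive_dir: Path) -> None:
    """Wipe the previous chain before the next full backup is written."""
    for old in archive_dir.glob("*.tib"):
        old.unlink()

backups = Path("E:/Backups")            # hypothetical location on the new disk
if next_backup_type(backups) == "full":
    start_new_cycle(backups)            # the 10th run replaces the old chain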

When formatting the disk, I decided to spring for the new 2 MB NTFS cluster size, as I'm not aware of any downsides. Unless I'm misunderstanding, the backup software creates a compressed archive in the form of one long, continuous write to the disk, which is exactly where a large cluster size should help. If disaster recovery is ever needed, I'd be looking at an equally long, continuous read of the same file.

Has anyone else worked with large cluster sizes? Allocation unit sizes beyond 64 KB are fairly new; they were introduced in Windows 10 1709.
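
If anyone wants to double-check what a format actually applied, fsutil fsinfo ntfsinfo shows the bytes per cluster, or a quick Python sketch like this (the drive letter is just an example) can query it through the Win32 GetDiskFreeSpaceW call:

import ctypes

def cluster_size(root: str = "E:\\") -> int:
    """Return the allocation unit size in bytes for the given volume root."""
    sectors_per_cluster = ctypes.c_ulong()
    bytes_per_sector = ctypes.c_ulong()
    free_clusters = ctypes.c_ulong()
    total_clusters = ctypes.c_ulong()
    if not ctypes.windll.kernel32.GetDiskFreeSpaceW(
        ctypes.c_wchar_p(root),
        ctypes.byref(sectors_per_cluster),
        ctypes.byref(bytes_per_sector),
        ctypes.byref(free_clusters),
        ctypes.byref(total_clusters),
    ):
        raise ctypes.WinError()
    return sectors_per_cluster.value * bytes_per_sector.value

print(cluster_size() // 1024, "KB per cluster")  # expect 2048 KB on this disk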

Shown is the disk resting on my case while formatting (a very long operation). I eventually moved it over the top case fan, which reduced temperatures from 42°C to 32°C.
 

Attachment: PXL_20210225_063554140.NIGHT.jpg

The_Doc_Man

Immoderate Moderator
Staff member
Adjusting cluster sizes is perfectly legal but does require a certain level of dedication. We had variable cluster sizes for our disks on the super-mini "mainframe" that I ran for the Navy. I can offer a few limited guidelines.

IF you are going to use the disk ONLY for large files, plan to keep the directory structure VERY simple, because ALL files INCLUDING FOLDERS will use the same allocation unit size. If you need to keep a mixed bag of file sizes on such a disk, consider keeping the mixed bag ONLY as a ZIP archive: in that case only the container consumes whole allocation units, while the files inside the ZIP are stored and retrieved through the archive's own internal format.
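
A rough sketch of that idea using Python's standard zipfile module (the paths are only examples): the single archive file is what gets rounded up to whole 2 MB clusters, not every little file inside it.

import zipfile
from pathlib import Path

def bundle_small_files(source_dir: str, archive_path: str) -> None:
    """Pack a directory of small files into one compressed container."""
    src = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in src.rglob("*"):
            if f.is_file():
                zf.write(f, f.relative_to(src))

bundle_small_files("E:/odds_and_ends", "E:/odds_and_ends.zip")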

For VERY large files (such as your terabyte backup container), you will probably get pretty good efficiency. The expected slack space for a file is half of its last allocated cluster, which is 1 MB with 2 MB allocation units. For a 1.5 TB file, that comes to 1 MB / 1.5 TB, or about 1/1,500,000, which is quite efficient. Since you are using the disk for backup only, you will have pretty good control over it.
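
Here is that arithmetic spelled out in a quick sketch, in case you want to play with the numbers (the exact ratio shifts slightly depending on binary versus decimal megabytes):

def slack_ratio(cluster_bytes: int, file_bytes: int) -> float:
    """Expected wasted space: the last cluster of a file is, on average, half full."""
    return (cluster_bytes / 2) / file_bytes

CLUSTER = 2 * 1024 * 1024                   # 2 MB allocation unit
print(slack_ratio(CLUSTER, int(1.5e12)))    # ~7e-7, roughly 1 part in 1.5 million
print(slack_ratio(CLUSTER, 4 * 1024))       # 256.0 -- a 4 KB file wastes 256x its size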

When I was running my Navy machines, I couldn't use the larger cluster sizes. We were spinning something over 1 TB on a system with ORACLE databases, and fewer than half of the disks I had could be dedicated to ORACLE and its large .DBF container files. The others had to be writable by the general users (as private directory space), so they could not have super-large clustering. I can't begin to tell you how many users INSIST on creating files containing three or four lines of text on a 64K cluster size. Terrible slack-space ratios.

Long-term, you will need to consider something else. The first couple of times you will be OK without it, but you MIGHT consider defragging the disk after you erase the incremental backups. Eventually those backups are going to fragment, which tends to negate any efficiency you gained from file contiguity. Big clusters + a huge file = a fragmented mess. For the incremental backups themselves, though, you probably don't need as much contiguity.

To be honest, a defrag on a really large file with really large allocation units isn't usually necessary, because you know that (in your case) every fragment is at least a contiguous 2 MB, and the overhead of one seek-arm motion every 2 MB is almost nothing. If you are writing this backup from an SSD, the source latency will be low as well, so you should get decent performance. However, if you are intent on maximizing speed, a once-weekly defrag might be in order.
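
If you do decide to make that weekly pass automatic, something along these lines works with the stock Windows defrag tool; the drive letter is an example, and it needs an elevated prompt (or a Task Scheduler job):

import subprocess

DRIVE = "E:"   # hypothetical drive letter for the backup volume

# /A analyzes fragmentation without changing anything; /V prints a verbose report.
analysis = subprocess.run(["defrag", DRIVE, "/A", "/V"],
                          capture_output=True, text=True)
print(analysis.stdout)

# When the report shows the volume is badly fragmented (say, after wiping a
# chain of incrementals), run the real pass:
#     subprocess.run(["defrag", DRIVE, "/U", "/V"])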
 
