There’s a rule of thumb for software development: make it, make it good, make it fast. For those unfamiliar with it, this means you should first build the core functionality of the software. Then, fix its bugs and make it as reliable as you can. Finally, optimize it to make it fast enough for your needs.
Retrospect 6 was very good backup software, but its age was showing. It was still fundamentally a Mac OS 9 app running on top of the OS X Carbon API. The worst part was the need to launch the application from the Finder, forcing you to configure automatic login and remote-access your backup machine via Remote Desktop or VNC if it was running in a data center. Despite that, it was very reliable (I had two situations where Retrospect complained about corrupted data, and both were caused by faulty hardware). It was not the fastest software I had seen, either, but it was good enough. Keep in mind Retrospect was designed when file systems held hundreds or a few thousand files, not a million or more as is normal today (I have about 1.5 million files on my laptop drive).
Facing the unavoidable, EMC decided to rewrite Retrospect from the ground up as a modern OS X product, using the Cocoa APIs and changing its architecture to a proper UNIX daemon (with a remote graphical console). Besides theoretically solving all of version 6’s shortcomings, they added some nice goodies, like AES-256 encryption, grooming, and other refinements. A nice one is that disk backups are now stored as a series of 100 MB files instead of one unique, giant file. This solves a lot of problems related to NAS systems.
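To picture why the fixed-size members help: plenty of NAS boxes and network file systems have per-file size limits (FAT32 caps out at 4 GB, for instance) or degrade badly on enormous files, and copying or verifying one multi-hundred-gigabyte file is all-or-nothing. Here’s a minimal C sketch of the idea, splitting an incoming stream into sequentially numbered member files; the naming scheme and the exact limit are my own assumptions, not Retrospect’s actual on-disk format:

```c
/* Sketch: copy stdin into sequentially numbered ~100 MB member files
 * instead of one giant file. Illustrative only, not Retrospect's format. */
#include <stdio.h>

#define CHUNK_LIMIT (100L * 1024 * 1024)   /* ~100 MB per member file */

int main(void) {
    char buf[64 * 1024];
    size_t n;
    long written = CHUNK_LIMIT;            /* forces opening member 0 */
    int member = 0;
    FILE *out = NULL;
    char name[64];

    while ((n = fread(buf, 1, sizeof buf, stdin)) > 0) {
        if (written >= CHUNK_LIMIT) {      /* current member is full */
            if (out) fclose(out);
            snprintf(name, sizeof name, "backup-%06d.member", member++);
            out = fopen(name, "wb");
            if (!out) return 1;
            written = 0;
        }
        /* members may overshoot by up to one buffer; fine for a sketch */
        fwrite(buf, 1, n, out);
        written += (long)n;
    }
    if (out) fclose(out);
    return 0;
}
```

The design payoff is that no single file ever exceeds what the NAS can comfortably store, and a damaged or interrupted transfer costs you one 100 MB member instead of the whole backup set.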
Assuming you need backup software, you would think this was great news, right? Well, so did I. However, reality turned out to be a lot worse than the perfect scenario I described above. EMC worked on this version for a long time, and you would think they had made it, made it good, and made it fast, right? Well, they decided to stop somewhere in the “make it good” part. The problem is, rules of thumb are nice, but common sense helps. If you launch a product that is simply too slow to be useful, people won’t use it.
Our backup machine at the university is an old dual-CPU G4 (1 GHz, I guess). Yes, it’s not exactly a screamer, but what the hell, we are talking about copying and storing files. It’s not exactly rocket science, and if the machine was up to the task when it was new, it should be up to the task now. We upgraded the RAM to a decent amount, of course.
So, last week I needed to recover the /etc directory of a colleague’s laptop to get back some Apache and PHP settings that the Snow Leopard installer happily overwrote. We’re talking about 3 MB of data, something that should be as simple as pressing a few buttons and getting the data back. The laptop had about 750 thousand files on it, which, in my opinion, is not that many by today’s standards. So why the hell did it take me about 3 hours to recover those files? Loading the catalog into the UI took almost 2 hours. Deselecting the whole file tree took almost 1 hour. The rest was the recovery process itself.
OK, EMC. I know this stuff is optimized for Intel, and you do a lot of byte-order swapping on PowerPC. I know I’m using an old machine. But for God’s sake. What the hell are you doing to my CPU that it needs 3 goddamn hours to load a 750,000-file catalog into memory? And what’s the story with deselecting all the files? Are you telling me that, when I click the checkbox on the file root, you REALLY go through the entire file tree and deselect each individual file? (In a rather inefficient way, too, because flipping 750,000 booleans should take about… what? 1 millisecond on a 1 GHz CPU?)
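To put a number on that, here’s a back-of-the-envelope C benchmark. The flag array is my own hypothetical stand-in; I obviously have no idea how Retrospect’s catalog really stores selection state. Compile with -O0, or the compiler will happily optimize the loop away:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define NFILES 750000   /* roughly the catalog size from the restore above */

int main(void) {
    /* hypothetical per-file "selected" flags, one byte each;
       volatile keeps the stores from being elided */
    volatile unsigned char *selected = malloc(NFILES);
    if (!selected) return 1;
    memset((void *)selected, 1, NFILES);   /* everything starts selected */

    clock_t start = clock();
    for (long i = 0; i < NFILES; i++)
        selected[i] = 0;                   /* deselect every single file */
    clock_t elapsed = clock() - start;

    printf("deselected %d flags in %.3f ms\n", NFILES,
           elapsed * 1000.0 / CLOCKS_PER_SEC);
    free((void *)selected);
    return 0;
}
```

Even on decade-old hardware this should finish in a few milliseconds at most, which is exactly the point: flipping the flags themselves cannot possibly be where the hour went.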
Well, at home I have a dual 2 GHz PowerPC machine that acts as backup storage, among other things. I have an AES-256 encrypted disk image (note that at the university I’m not using encryption, or else it would render things unbearable), served over AFP, acting as the Time Machine target for my laptop. When I installed Snow Leopard, I also had to recover some stuff from the /etc directory. Do you know how long it took? About 5 freaking seconds! And please, don’t tell me it’s because the G5 is faster!
Enough is enough, and 3 hours to recover a few files is ridiculous. I’ll look into migrating our backups to Time Machine. Yes, space management is worse. Yes, client machines have to access the server, making the whole setup less secure than Retrospect, where the server accesses the clients. But people won’t have to stop working for hours waiting for some files to be recovered.
Lesson for you, EMC: I’m not asking you to make it fast before making it good, but at least make performance acceptable before releasing the product. Especially after taking so many years to build it.