
Speed issue : strategies to cope with huge hard drives ?

5 Posts
4 Users
0 Likes
328 Views
(@zul22)
Posts: 53
Trusted Member
Topic starter
 

Hello,

I'm wondering how you guys handle the increasing size of data media, as it has a direct impact on the time required for an analysis.

My business is mainly focused on data recovery, and sequential cloning of hard drives is a daily job. About two years ago, most hard drives that I had to image and then analyse were 320 GB to 500 GB; one-terabyte drives were the larger ones.

Now, data recoveries from 500 GB and 1 TB drives are the rule, with some customers bringing 2 TB or 3 TB hard drives.

With 2 TB drives and above, many data recovery programs cannot work directly on drives with a damaged or destroyed file system. For such drives, I often have no other solution than creating an image file. The image file itself is more than 2 TB and usually requires a 3 TB or 4 TB hard drive to store it.

Creating images as files means slower cloning speed (because the file has to be created), sensitivity to possible file corruption, having to use the GPT partitioning scheme, and incompatibility with 32-bit OSes.

Following the rhythm of increasing drive capacities also means constant investment to maintain a stock of "up-to-date" hard drives, and customers do not necessarily realize how much this infrastructure costs.

I remember that two years ago most data recoveries could be done the same day.
Now, there is often one day to image the drive and another for a first analysis.
Overnight, I have to store the drives in a safe place.

To speed things up, I purchased 256 GB and 512 GB SSDs, but they are already useless most of the time. I also plan to acquire a 1 TB SSD.
I see that there is SSD storage of up to 4 TB, like the Virident FlashMAX II, but it seems to cost several thousand dollars for the larger capacities.

I'm thinking of using compressed NTFS volumes on SSDs, as this would possibly allow cloning larger drives to modern SSDs. This seems possible with a powerful machine, but I'm not sure how this would work with huge image files from drives >= 500 GB.
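For illustration, a similar effect can be approximated in software by compressing the image stream while it is written, independently of NTFS compression. A minimal sketch in Python; the device and target paths are placeholders, and it ignores the read errors a real recovery imager has to handle:

```python
import gzip
import shutil

SOURCE = "/dev/sdb"                 # hypothetical source device (read-only!)
TARGET = "/mnt/ssd/case042.img.gz"  # hypothetical compressed image on the SSD
CHUNK = 16 * 1024 * 1024            # 16 MiB per read

# Stream the device into a gzip file; compresslevel=1 favours throughput over
# ratio. Already-compressed user data (video, archives) will hardly shrink.
with open(SOURCE, "rb") as src, gzip.open(TARGET, "wb", compresslevel=1) as dst:
    shutil.copyfileobj(src, dst, CHUNK)
```

Whether the space saved is worth the extra CPU time obviously depends on what is actually stored on the customer's drive.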

I also understand that the connectors matter, with different speeds: Thunderbolt > SAS/SCSI > PCI Express > SATA. Fibre-optic connectors could also help.

I believe that improving hardware will not solve everything:
- The transfer speed from the customer's hard drive can be the bottleneck.
- The processor can be the bottleneck.
- Many data recovery programs don't use a log file and have a higher probability of crashing while processing a huge amount of data, as it takes longer (see the sketch just below).
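To illustrate what I mean by a log file: what I really miss is resumability, i.e. the tool keeping a map of what has already been copied, so that a crash only costs the time since the last checkpoint (GNU ddrescue works like this with its mapfile). Below is a stripped-down sketch of the idea in Python; the paths are placeholders and the error handling is nowhere near what a real imager needs:

```python
import json
import os

SOURCE = "/dev/sdb"              # hypothetical failing source drive
IMAGE = "/mnt/big/case042.img"   # destination image file
MAPFILE = IMAGE + ".map"         # progress log, survives a crash
CHUNK = 4 * 1024 * 1024          # 4 MiB per read

def load_offset():
    """Resume from the last checkpointed offset, or start at zero."""
    if os.path.exists(MAPFILE):
        with open(MAPFILE) as f:
            return json.load(f)["next_offset"]
    return 0

def save_offset(offset):
    with open(MAPFILE, "w") as f:
        json.dump({"next_offset": offset}, f)

def image_drive():
    offset = load_offset()
    with open(SOURCE, "rb") as src, open(IMAGE, "r+b" if offset else "wb") as dst:
        src.seek(offset)
        dst.seek(offset)
        while True:
            try:
                block = src.read(CHUNK)
            except OSError:
                # a real imager would retry, shrink the block and record the
                # bad region in the map; here we just zero-fill and move on
                src.seek(offset + CHUNK)
                block = b"\x00" * CHUNK
            if not block:
                break
            dst.write(block)
            offset += len(block)
            save_offset(offset)   # checkpoint after every chunk

if __name__ == "__main__":
    image_drive()
```

Checkpointing after every chunk is deliberately naive, but it shows why a crash halfway through a 3 TB job does not have to mean starting over.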

I recently invested in an IBM x3400 M2, which uses two Xeon processors and can accept up to 96 GB of DDR3 ECC RAM, as well as hot-swappable SAS hard drives. However, SAS drives are still expensive and relatively small compared to the external hard drives that "home users" have nowadays.

So, I'm considering alternatives. I thought that cloning to drives in RAID 1 could then allow parallel processing of the data with various data recovery programs, or processing by chunks and merging the results. Are there data recovery programs well suited to some kind of "parallel" processing?
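To make the "chunks plus merge" idea concrete, here is a rough sketch of signature-based carving split across worker processes; the image path, the signature and the chunk size are only placeholders, not anything a particular product does:

```python
import os
from concurrent.futures import ProcessPoolExecutor

IMAGE = "/mnt/big/case042.img"   # hypothetical image produced earlier
SIGNATURE = b"\xff\xd8\xff"      # JPEG header, as a simple example
CHUNK = 256 * 1024 * 1024        # 256 MiB slice per worker
OVERLAP = 4096                   # extra bytes so hits spanning a boundary are seen

def scan_chunk(start):
    """Return absolute offsets of signature hits inside one slice."""
    hits = []
    with open(IMAGE, "rb") as f:
        f.seek(start)
        data = f.read(CHUNK + OVERLAP)
    pos = data.find(SIGNATURE)
    while pos != -1 and pos < CHUNK:   # hits in the overlap belong to the next slice
        hits.append(start + pos)
        pos = data.find(SIGNATURE, pos + 1)
    return hits

def scan_image():
    size = os.path.getsize(IMAGE)
    starts = range(0, size, CHUNK)
    with ProcessPoolExecutor() as pool:            # one worker per CPU core
        results = list(pool.map(scan_chunk, starts))
    return sorted(off for chunk_hits in results for off in chunk_hits)

if __name__ == "__main__":
    print(f"{len(scan_image())} candidate JPEG headers found")
```

Reading one spinning disk from many workers at once mostly causes seek thrashing, which is exactly why the second copy from RAID 1 (or an SSD target) would matter for this kind of parallel pass.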

So, which hardware or software (Linux/Windows) would you suggest to substantially increase performance at a reasonable price, keeping in mind that "too cheap" solutions are often not the best ones?

Thanks for sharing your thoughts, on both hardware and software.

 
Posted : 10/07/2014 7:57 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

While I am sure you are highlighting a "real world" issue (one that I have also noticed: a senseless increase in the size of "common" hard disks in the hands of "average customers", i.e. the ones that won't have made a backup of their data, without corresponding advances in software tools), I understand the question only partially, in the sense that it is very "wide".

Maybe we can restrict a bit this wideness.

Which "recovery softwares" are you using or talking about?

Is the scope "filesystem recovery" or "file based recovery"?

Don't these huge hard disks *always and only* contain NTFS filesystems?

jaclaz

 
Posted : 10/07/2014 8:54 pm
MDCR
(@mdcr)
Posts: 376
Reputable Member
 

Maybe you could attack the problem from another perspective: instead of thinking speed, think capacity.

You have a finite number of imaging units and X jobs you wish to finish; these will hold up your business if they are slow. The solution here would be to get more imaging stations, or to use the customers' computers directly with a Linux boot DVD.

To increase speed, you could set up imaging over the network to a server; the server could have four network adapters bonded together. That way you won't see anything but the imager as a bottleneck.
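A rough sketch of what the receiving end of such a setup could look like, assuming a plain TCP stream pushed from the boot DVD (the port, the destination path and the complete lack of authentication or a verification handshake are all simplifications):

```python
import hashlib
import socket

PORT = 9999                          # arbitrary example port
IMAGE = "/srv/images/incoming.img"   # where the server stores the image
CHUNK = 4 * 1024 * 1024              # 4 MiB receive buffer

def receive_image():
    """Accept one connection, stream it to disk and hash it on the fly."""
    sha256 = hashlib.sha256()
    with socket.create_server(("", PORT)) as srv:
        conn, addr = srv.accept()
        print(f"receiving from {addr[0]}")
        with conn, open(IMAGE, "wb") as out:
            while True:
                block = conn.recv(CHUNK)
                if not block:        # sender closed the connection
                    break
                out.write(block)
                sha256.update(block)
    print("sha256:", sha256.hexdigest())

if __name__ == "__main__":
    receive_image()
```

On the sending side, piping dd into netcat from the boot environment is enough to feed it, and the hash gives you something to verify the stored image against later.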

Most modern computers today have a 1 Gbit/s network adapter. A 3 TB drive is roughly 3,145,728 MB; divide that by ~120 MB/s (about the usable throughput of a 1 Gbit adapter) and, with no other bottlenecks, imaging one 3 TB drive takes 3145728 / 120 / 3600 ≈ 7 h 20 min (*).

(* Not accounting for read errors/drive failures and other real world problems you may run into).
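The same back-of-the-envelope figure in a couple of lines of Python, parameterised so it is easy to redo for other capacities and link speeds (the throughputs are assumptions about the link only, not about the drives at either end):

```python
def imaging_hours(capacity_gib, throughput_mib_s):
    """Hours needed to move capacity_gib GiB at a sustained throughput in MiB/s."""
    return capacity_gib * 1024 / throughput_mib_s / 3600

# one 3 TB (~3072 GiB) drive over a single gigabit link at ~120 MiB/s sustained
print(f"{imaging_hours(3072, 120):.1f} h")      # ~7.3 h

# the same drive over four bonded gigabit links, if the disks can keep up
print(f"{imaging_hours(3072, 4 * 120):.1f} h")  # ~1.8 h
```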

SSDs will come down in price over time, but for now I'd suggest using them only for short-term storage, with a NAS for long-term storage.

This way, even if ONE hard drive is a bottleneck, it will not hold up the rest of your business, and you can run jobs in parallel.

Good luck.

 
Posted : 11/07/2014 9:30 am
Chris_Ed
(@chris_ed)
Posts: 314
Reputable Member
 

Small, yet perfectly formed opinion; the Tableau TD3 is pretty fast at cloning drives. I'm doing a 500GB HDD this morning and it's going to take 3 hours total to clone & verify to another 500GB drive.

(This is a "best case" situation where both drives are connected directly to the TD3, however)

 
Posted : 11/07/2014 12:37 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

Small, yet perfectly formed opinion; the Tableau TD3 is pretty fast at cloning drives. I'm doing a 500GB HDD this morning and it's going to take 3 hours total to clone & verify to another 500GB drive.
(This is a "best case" situation where both drives are connected directly to the TD3, however)

Yep, but the OP was asking about disk images, not clones, and of huge 2 or 3 TB disks.
I wonder how much slower it would be when the target is on the network (see below).
And those figures don't strike me as "blazing fast".

@MDCR
I may be particularly pessimistic 😯 , but a 3 TB drive in 7 h 20 min seems to me (I know you highlighted that this is a theoretical speed) very far from what you get in practice.

The figures Chris_Ed just posted would lead (if linear/proportionate) to a nice, round 3 h × 3000/500 = 18 h for a 3 TB disk through a Tableau (with a direct connection, which is surely faster than over the network).

And, the "area" of interest of Zul22 is data recovery, so possibly the write blocking of the Tableau (or other similar write blocking device) is not-so-relevant, while it is to be expected commonly that the source is non-fully-functional/has-issues-of-some-kind.

@Zul22
Have a look at this thread as well:
http://www.forensicfocus.com/Forums/viewtopic/t=11704/
which revolves around topics very close to your question; particularly here:
http://www.forensicfocus.com/Forums/viewtopic/p=6573259/#6573259
a link is given to the results of a nice set of tests:
https://docs.google.com/spreadsheet/lv?key=0Al7os14ND-cFdGp1NDR2WGwyakR2TkJtNUFXa29pNXc&richtext=true
maintained by Eric Zimmerman, which could be useful as a reference.

jaclaz

 
Posted : 11/07/2014 1:50 pm