
Distributed Processing

Posted: Tue Nov 14, 2017 4:07 am

Just a few questions on distributing the processing of cases for efficiency. Are there any papers/resources looking at the challenges of doing this? I have had a bit of a look but have not found anything covering the practicalities and challenges - just wondered if anyone knew of any.

Also, I'd be interested in people's thoughts on applying such principles in a case.

Senior Member

Re: Distributed Processing

Posted: Tue Nov 14, 2017 6:29 am

Some points and questions you could look into:

* Security / integrity
- How do you protect the data when it's "out there"? (i.e. cloud processing)
- How do you maintain/prove data integrity? (Chain of custody, for those of you in LE)
- Can you actually test that the system is doing what it is supposed to?

* Infrastructure / redundancy / task scheduling
- What if a node goes down - how do you retrieve its results? (Latency detection of nodes)
- How many resources (nodes) do you need, and how much data throughput is there? (Is it worth it?)
- Will you gain any performance if the infrastructure cannot keep up? (Is it even possible?)

* Centralised storage for results
- How are the results gathered? (Even a negative result is a result.)
- How are they incorporated into a final result?

You will find similar questions/problems in the field of distributed computing, so I suggest you look there (it's a fun topic).
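To make the integrity and result-gathering points concrete, here is a minimal sketch (my own toy illustration, not any vendor's design): evidence chunks are dispatched to workers, each worker re-verifies its chunk against a pre-computed hash manifest before analysing it, and the coordinator collects every result, including negative ones. The keyword search and chunk layout are invented for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor, as_completed

KEYWORD = b"invoice"  # hypothetical search term for the toy analysis

def process_chunk(chunk_id, data, expected_sha256):
    """Worker task: verify chunk integrity first, then run the (toy) analysis."""
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256:
        return chunk_id, "INTEGRITY_FAILURE", None
    hit = KEYWORD in data
    return chunk_id, "OK", hit  # a negative (False) is still a result

# Fabricated evidence chunks plus a manifest of their hashes,
# computed at acquisition time before distribution.
chunks = {i: bytes([i]) * 1024 + (KEYWORD if i == 2 else b"")
          for i in range(4)}
manifest = {i: hashlib.sha256(d).hexdigest() for i, d in chunks.items()}

# Coordinator: fan out, then gather every result centrally.
results = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(process_chunk, i, d, manifest[i])
               for i, d in chunks.items()]
    for f in as_completed(futures):
        cid, status, hit = f.result()
        results[cid] = (status, hit)

print(results)
```

A real system would add retry/rescheduling when a node times out, but the shape - manifest, verify-before-process, gather-all - is the same.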

Senior Member

Re: Distributed Processing

Posted: Fri Nov 17, 2017 7:23 am

Not sure if I'm missing the point of this, but I've been using distributed processing in live cases on and off for years without really encountering any of the theoretical challenges posited here. A lot of it is pretty common-sense stuff: you need roughly equally matched machines, ideally connected with fibre instead of copper, and they need to be reliable and well set up - just like regular digital forensics, only more so. The results/efficiencies are more than worth it.

Senior Member

Re: Distributed Processing

Posted: Wed Dec 13, 2017 6:04 am

For tootypeg: www.fcluster.org.uk/re...ersion.pdf is my PhD dissertation, finished in April 2015. "Information Assurance in a Distributed Forensic Cluster" addresses the issue of tracking data as it is acquired, distributed, stored and then processed across a large-scale infrastructure. As of April 2015, I'd like to think that if it isn't in the dissertation, it didn't exist. Of course, the world moves on.

The main issue in my dissertation is that the current practice (in 2015) of 'imaging' either a drive or file(s) and having a single associated checksum becomes unsupportable in available distributed architectures. I proposed a solution by designing and building my own virtual distributed file system, which I called Fcluster.
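To illustrate why a single whole-image checksum breaks down (this is my own simplified sketch, not Fcluster's actual design): if the image is hashed per fixed-size block instead, any node holding only one block can still prove that block's integrity without fetching the whole image. The 4096-byte block size is an arbitrary assumption.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real systems would tune this

def block_hashes(image: bytes, block_size: int = BLOCK_SIZE):
    """Hash each fixed-size block so any block can be verified
    independently after being shipped to a remote node."""
    return [hashlib.sha256(image[i:i + block_size]).hexdigest()
            for i in range(0, len(image), block_size)]

image = b"\x00" * 10000  # stand-in for a disk image
manifest = block_hashes(image)

# A node holding only block 1 can still prove its integrity
# against the manifest, with no access to the rest of the image.
block1 = image[BLOCK_SIZE:2 * BLOCK_SIZE]
assert hashlib.sha256(block1).hexdigest() == manifest[1]
print(len(manifest))  # 3 blocks for 10000 bytes at 4096-byte blocks
```

With a single whole-image hash, by contrast, verification requires reassembling every byte in order - exactly what a distributed architecture avoids.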

For Redcat: I'm interested in what you're using. I do not believe the FTK Lab version is a true distributed system. I assessed FTK on page 100 and found that although it is described as distributed, it really is not, in the sense that Hadoop etc. are.

Has the world moved on since 2015? I'm no longer directly involved in forensics but am still interested.  


Re: Distributed Processing

Posted: Wed Dec 13, 2017 6:06 pm

- tootypeg
Just a few questions on distributing the processing of cases for efficiency.

It isn't clear whether this is about 'processing' of the case or of the data (distributed computing). Distributed processing of individual tasks usually reduces efficiency in exchange for processing speed. It becomes efficient if there is a consolidation/load-smoothing effect from running a large number of tasks on a large infrastructure.
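The speed-versus-efficiency trade-off above can be made concrete with Amdahl's law (my own illustration; the 90% parallel fraction is an assumed figure, not from any measurement): speedup grows with node count while per-node efficiency falls, because the serial portion of the workload never shrinks.

```python
def speedup(n_nodes, parallel_fraction):
    """Amdahl's law: wall-clock speedup of a job on n_nodes,
    given the fraction of work that parallelises."""
    return 1.0 / ((1 - parallel_fraction) + parallel_fraction / n_nodes)

def efficiency(n_nodes, parallel_fraction):
    """Speedup per node: how well each node is utilised."""
    return speedup(n_nodes, parallel_fraction) / n_nodes

# Assume 90% of the processing parallelises (hypothetical figure).
for n in (1, 4, 16):
    print(n, round(speedup(n, 0.9), 2), round(efficiency(n, 0.9), 2))
```

At 16 nodes the job runs 6.4x faster, but each node is only 40% utilised - which is why a shared infrastructure running many cases at once can recover the efficiency a single distributed job gives up.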

Senior Member
