Thoughts on testing tools

(@tootypeg)
Posts: 173
Estimable Member
Topic starter
 

Hi all,

What are everyone's thoughts on testing, in terms of the tools that we use? I have been thinking about strategies etc. given the importance placed on this (albeit it has always been important) by ISO standards etc.

Does everyone have their own test data and strategy that they roll out on new software/releases? I have been thinking about the potential value of an automated test data generator producing known-good content against which to evaluate parsing/carving algorithms. Just wanted to gather thoughts on such a thing: what it would need to do, and how valuable it would be. I was mainly thinking about carving validation, so the test data would be geared towards such algorithms.

 
Posted : 17/05/2017 2:00 pm
minime2k9
(@minime2k9)
Posts: 481
Honorable Member
 

From what I have seen so far, the validation which passed UKAS, from both LE and the private sector, isn't worth the paper it's written on. Not the fault of the producer; it's that testing a tool which examines 2,000 mobile phones across 4 different OSs, with 20 OS versions and thousands of apps, is a job that would never be complete before it needed redoing, even if a national unit was doing it.

My thinking would be that, to achieve a "tick", you re-use the same data again and again for carving/imaging-type validation and simply say that the latest version got the same results as the previous version, therefore it is equally good.

It's not actually guaranteeing very much, but then ISO 17025 guarantees f*** all anyway.
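
A rough sketch of that "same data, same results" check, assuming each tool version has already written its output from the reference data set into its own folder (the folder names here are made up, not from any particular product):

```python
# Hypothetical regression check: hash everything the new version produced from the
# reference data set and compare it against a baseline built with the accepted version.
import hashlib
from pathlib import Path

def hash_tree(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()}

baseline = hash_tree(Path("output_v7.0"))   # previous, already "ticked" version
candidate = hash_tree(Path("output_v7.1"))  # new version, run over the same data

missing = baseline.keys() - candidate.keys()
added = candidate.keys() - baseline.keys()
changed = [p for p in baseline.keys() & candidate.keys() if baseline[p] != candidate[p]]
print(f"missing: {len(missing)}, new: {len(added)}, changed: {len(changed)}")
```

Anything non-zero in that output is what you would have to explain before handing out the tick.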

 
Posted : 17/05/2017 4:24 pm
(@tootypeg)
Posts: 173
Estimable Member
Topic starter
 

What about the generation of test data which is different (content-wise) but verified in terms of structure, so that more exhaustive testing could be provided?

…or are you saying that we just can't/don't sufficiently test tools?

 
Posted : 17/05/2017 5:11 pm
minime2k9
(@minime2k9)
Posts: 481
Honorable Member
 

More the latter.
In terms of imaging storage devices (HDDs, SSDs, pen drives etc.), validation can cover the majority of situations if you assume a Toshiba spinning HDD is no different from a Hitachi HDD in terms of imaging.
"Imaging" or extraction of mobile devices - any testing covers only the smallest percentage of use cases, and the devices/OS/apps have changed while you were writing up your validation.
Carving of images - it's never possible to prove that a method brings back everything once you start talking about deleted, fragmented files, only that what you get is real. As long as you get data to manually verify your results, then that's basically enough. I can't imagine a scenario where a software bug causes indecent images to be carved from nothing!

 
Posted : 17/05/2017 5:27 pm
(@tootypeg)
Posts: 173
Estimable Member
Topic starter
 

I see your points.

I guess in terms of carving validation, it would be a case of identifying and acknowledging the weaknesses of a certain carving algorithm (if any), and showing that it returns results consistently when the environment variables are X.

 
Posted : 17/05/2017 6:01 pm
(@rich2005)
Posts: 535
Honorable Member
 

More the latter.
In terms of imaging storage devices (HDDs, SSDs, pen drives etc.), validation can cover the majority of situations if you assume a Toshiba spinning HDD is no different from a Hitachi HDD in terms of imaging.
"Imaging" or extraction of mobile devices - any testing covers only the smallest percentage of use cases, and the devices/OS/apps have changed while you were writing up your validation.
Carving of images - it's never possible to prove that a method brings back everything once you start talking about deleted, fragmented files, only that what you get is real. As long as you get data to manually verify your results, then that's basically enough. I can't imagine a scenario where a software bug causes indecent images to be carved from nothing!

I've been of the same opinion for a long time with regard to this long-running saga of testing/validation with respect to the ISO stuff.
What are you going to test for? What are you going to test against? As you say, there are endless tools, endless functionality within said tools, and endless potential data sets.
Even just picking one function within one tool, running it against one set of data, and seeing the expected results, is no guarantee it will behave the same on the next set of data. How many times, with how many different sets of data, would you want to run it to have confidence it works as intended? Even if you ran the test 100 times over with 100 different sets of data, would you be confident it was reliable on the 101st or 1001st etc.?
Then multiply that for every tool, every function, and so forth?

I can never get past the opinion that it is fundamentally a giant waste of time and that you're always better off "trusting" the tool to some extent. Trying to dual-tool or manually verify the results every time, or at least periodically / where more appropriate, seems endlessly more practical/sensible.
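
To be clear, the dual-tool cross-check itself is easy enough to automate. A quick sketch, assuming both tools have already exported their carved files to separate folders (the folder names are placeholders):

```python
# Rough dual-tool cross-check: compare the carved output of two tools by content
# hash, so only the items that one tool found and the other missed need manual review.
import hashlib
from pathlib import Path

def content_hashes(folder: Path) -> set:
    return {hashlib.sha256(f.read_bytes()).hexdigest()
            for f in folder.rglob("*") if f.is_file()}

tool_a = content_hashes(Path("carved_tool_a"))
tool_b = content_hashes(Path("carved_tool_b"))

print(f"Recovered by both tools: {len(tool_a & tool_b)}")
print(f"Only tool A:             {len(tool_a - tool_b)}")  # review these by hand
print(f"Only tool B:             {len(tool_b - tool_a)}")
```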

In reality, the testing and validation will simply leave you with an unjustified sense of overconfidence in the tools and methods being used, or be a giant waste of time (or both).

 
Posted : 17/05/2017 6:49 pm
(@tootypeg)
Posts: 173
Estimable Member
Topic starter
 

Really interesting point. So is the argument then that we can't, and never will be able to, effectively test tools? Sorry, reading that back, I couldn't help but feel like I sounded like some sort of interviewer!

I hear the points; I just wondered whether the whole point of continuing to push for testing is that every little helps. Or, without any form of testing, is that not going to completely undermine our field?

 
Posted : 17/05/2017 7:18 pm
(@thefuf)
Posts: 262
Reputable Member
 

In terms of imaging storage devices (HDDs, SSDs, pen drives etc.), validation can cover the majority of situations if you assume a Toshiba spinning HDD is no different from a Hitachi HDD in terms of imaging.

This is controversial. It is easy to cover the "got input bytes from a source, wrote those bytes to a destination" case with several options like hashing, compression, HPA/DCO removal, but a real disk imaging process is much more complex.
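
For illustration, the simple case above amounts to little more than this, and it is roughly all that a naive black-box test exercises (the device and file names are placeholders):

```python
# The trivial "read bytes from source, write bytes to destination, hash both" view
# of disk imaging. Real acquisitions add HPA/DCO handling, bad-sector strategies,
# interface and write-blocker quirks, and so on.
import hashlib

CHUNK = 1024 * 1024  # 1 MiB reads; real tools choose and vary this carefully

def image_and_hash(source_dev: str, dest_file: str) -> str:
    sha = hashlib.sha256()
    with open(source_dev, "rb") as src, open(dest_file, "wb") as dst:
        while chunk := src.read(CHUNK):
            dst.write(chunk)
            sha.update(chunk)
    return sha.hexdigest()

# acquisition_hash = image_and_hash("/dev/sdb", "evidence.dd")
# Re-hashing evidence.dd afterwards should produce the same digest.
```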

The "post-mortem" disk imaging processes utilizing a live distribution, or a hardware imager, or a program with a hardware write blocker, or a program with a software write blocker require different tests. When testing a live distribution, results may differ whether you boot it from a CD, or a USB Flash drive, or a drive inside a USB enclosure, whether a source disk is a solid state drive or not. When testing a program with a hardware write blocker, results may differ depending on the type of the hardware write blocker (whether it is a command translator or a block device sharing box). And so on. Also, results may depend on the data itself. NIST validation reports for SMART Linux and PALADIN are a perfect example that even a proper procedure may expose a source data alteration issue in one case, and miss the same existing issue in another case, because the procedure was not designed to hunt for this specific issue.

In my opinion, many computer forensics tool validation tasks cannot be solved without reverse engineering. So, black box testing with data sets is not a solution even for disk imaging tools.

Also, examiners almost always chain different tools (for example, a disk imaging program and a write blocker); in this case you cannot simply combine the validation results for both tools and draw a conclusion. For example, a hardware write blocker may affect the error granularity, so a disk imaging program that does not skip the contents of sectors adjacent to an unreadable one will nevertheless skip such adjacent sectors (because of a design flaw in the write blocker).
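
As a toy illustration of the error-granularity point, assuming (purely hypothetically) that the blocker fails a whole 64-sector read even when only one sector is actually unreadable:

```python
# Toy model of error granularity: if the write blocker can only fail whole
# 64-sector reads, an imager that never retries sector-by-sector loses 63 readable
# sectors along with the single bad one.
SECTOR = 512
BLOCK = 64  # hypothetical read granularity imposed by the blocker

class FakeDisk:
    def __init__(self, sectors, bad_sectors):
        self.sectors, self.bad = sectors, set(bad_sectors)

    def read(self, lba, count):
        """Pretend blocker: any bad sector inside the range fails the whole read."""
        if any(lba + i in self.bad for i in range(count)):
            return None
        return b"\xAA" * SECTOR * count  # dummy data

def sectors_lost(disk, retry_per_sector):
    lost = 0
    for lba in range(0, disk.sectors, BLOCK):
        if disk.read(lba, BLOCK) is not None:
            continue
        if retry_per_sector:
            lost += sum(1 for s in range(lba, lba + BLOCK) if disk.read(s, 1) is None)
        else:
            lost += BLOCK  # readable neighbours are zero-filled too
    return lost

disk = FakeDisk(sectors=256, bad_sectors=[70])     # one genuinely unreadable sector
print(sectors_lost(disk, retry_per_sector=True))   # 1
print(sectors_lost(disk, retry_per_sector=False))  # 64
```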

 
Posted : 17/05/2017 8:26 pm
(@athulin)
Posts: 1156
Noble Member
 

Does everyone have their own test data and strategy that they roll out on new software/releases?

Doubtful: the work and time required to create and maintain good test data is not likely to be something that fits well with a normal FA's time schedule. The experience required to do so … much the same, I think, but the other way around, probably.

I have been thinking about the potential value of an automated test data generator producing known-good content against which to evaluate parsing/carving algorithms.


TL;DR: Good idea. Go for it.

Almost certainly worthwhile, if done well. Without a 'test generator' you have very limited means of independent evaluation of a carving tool. (And we need that kind of independent evaluation.) And a reasonably good platform will probably improve as people use it and provide feedback.

Carving can be a no-additional-information exercise where only closely related blocks of data are puzzled together, based only on what those blocks contain. That's probably easy enough. But it can also be based on knowledge of how a particular file system behaves (say, some file contents *must* appear in the first sector of an NTFS cluster, or 'files are almost never fragmented on an ISO 9660 file system, and if they are, the fragments come in the right order'), and even incorporate additional information, such as deleted directories. Those types may (or may not) work better than the no-info type.
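
At the no-additional-information end of that spectrum, a carver can be as simple as scanning for known header/footer signatures. A bare-bones sketch (JPEG only, no fragmentation handling, purely illustrative):

```python
# Bare-bones signature carving: scan raw data for JPEG start/end markers and cut
# out everything in between. No file-system knowledge, no fragmentation handling.
from pathlib import Path

SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker (plus first byte of next marker)
EOI = b"\xff\xd9"      # JPEG end-of-image marker

def carve_jpegs(image_file: str, out_dir: str) -> int:
    data = Path(image_file).read_bytes()
    Path(out_dir).mkdir(exist_ok=True)
    count, pos = 0, 0
    while (start := data.find(SOI, pos)) != -1:
        end = data.find(EOI, start)
        if end == -1:
            break
        # A contiguous, undamaged JPEG comes back intact; a fragmented one will not,
        # which is exactly the class of case a test generator should exercise.
        Path(out_dir, f"carved_{count:04d}.jpg").write_bytes(data[start:end + 2])
        count, pos = count + 1, end + 2
    return count

# carve_jpegs("carve_test.dd", "carved_out")
```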

There's also the complexity issue: extracting a fragmented file from an otherwise empty disk (i.e. just putting the fragments into the correct order) is one thing. Doing it where the 'other' blocks have random contents may be another. Doing it where the 'other' blocks have similar but incomplete contents would probably match a real-life situation (say, multiple versions of a .DOCX file).

Then comes the content issue: highly structured content, such as perhaps a PDF file, vs. highly unstructured content (basically raw ASCII).

Just wanted to gather thoughts on such a thing: what it would need to do, and how valuable it would be. I was mainly thinking about carving validation, so the test data would be geared towards such algorithms.

Basically, I suspect that the thing to do is to let the user identify the basic files, and then just scatter their data in some well-defined way over either a zeroed disk (or disk image), or a random disk image, or something else. File-system-related carving is more difficult.

Obviously, given the same input files, the same configuration file and the same target image size, the result should be reproducible.
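
Something like the following minimal sketch, where user-supplied files are split into fragments and scattered over a zeroed image with a fixed seed, so the same inputs, configuration and target size always give the same layout (all names and sizes are placeholders). The generator also writes out the ground-truth fragment map that carving results would be scored against:

```python
# Minimal reproducible test-image generator: scatter fragments of known-good files
# over a zeroed image and record the ground truth for scoring carving tools later.
import json
import random
from pathlib import Path

BLOCK = 4096  # fragment granularity

def build_image(source_files, image_path, image_blocks, seed=1234):
    rng = random.Random(seed)                  # fixed seed -> reproducible layout
    image = bytearray(image_blocks * BLOCK)    # the zeroed "disk"
    free = list(range(image_blocks))
    rng.shuffle(free)
    ground_truth = []
    for src in source_files:
        data = Path(src).read_bytes()
        chunks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
        blocks = sorted(free.pop() for _ in chunks)  # fragments stay in on-disk order
        for block_no, chunk in zip(blocks, chunks):
            image[block_no * BLOCK:block_no * BLOCK + len(chunk)] = chunk
        ground_truth.append({"file": src, "blocks": blocks, "size": len(data)})
    Path(image_path).write_bytes(image)
    Path(image_path + ".json").write_text(json.dumps(ground_truth, indent=2))

# build_image(["sample.jpg", "sample.docx"], "carve_test.dd", image_blocks=25600)
```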

 
Posted : 17/05/2017 8:45 pm
(@athulin)
Posts: 1156
Noble Member
 

From what I have seen so far, the validation which passed UKAS, from both LE and the private sector, isn't worth the paper it's written on.

For those of us who are geographically and mobilely disadvantaged, could you clarify what document you're referring to?

 
Posted : 17/05/2017 8:47 pm