
Excel hashing

(@rabidfox)
Posts: 4
New Member
Topic starter
 

So I did some hash tests on files in Excel. I made 4: one was an exact copy (same hash), one I renamed in Windows (also the same hash value), and one I renamed using the Save As feature, which displayed a different hash value. So I was wondering if anyone can explain why this happened?

 
Posted : 01/12/2018 1:32 am
tracedf
(@tracedf)
Posts: 169
Estimable Member
 

If you opened the file in Excel and chose "Save As", the metadata was probably updated. Even if the change was not visible to you, something in the file's bytes changed.
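To illustrate the point, here is a minimal sketch showing that a hash depends only on a file's bytes, not on its name. The file names and the `demo` helper are made up for illustration, and SHA-256 is an assumption since the thread does not say which hash was used.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Hash an original, a renamed copy, and a copy with one extra byte."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        original = folder / "book1.bin"  # hypothetical stand-in for a workbook
        original.write_bytes(b"same bytes")

        # Copy, then rename the copy: the contents are untouched.
        copy = folder / "copy.bin"
        shutil.copyfile(original, copy)
        renamed = folder / "renamed.bin"
        copy.rename(renamed)

        # A file whose contents differ by a single byte.
        edited = folder / "edited.bin"
        edited.write_bytes(b"same bytes!")

        return sha256_of(original), sha256_of(renamed), sha256_of(edited)
```

Calling `demo()` should give matching digests for the original and the renamed copy, and a different digest for the edited file, which is roughly what happens when Save As rewrites the workbook.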

 
Posted : 01/12/2018 3:31 am
(@athulin)
Posts: 1156
Noble Member
 

The best way to figure that out is usually to compare the files byte by byte. The easiest way is probably to run

C:\Users\Whoever> COMP book1.xlsx book2.xlsx

and examine the output. You'll get a list of places where the two files differ. If they do differ, the hashes will differ as well, of course.
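If COMP is not available, roughly the same check can be sketched in Python. This is not a reimplementation of COMP, just a small illustration; the function name `differing_offsets` and the `limit` parameter are my own.

```python
def differing_offsets(path_a, path_b, limit=10):
    """Return (diffs, size_a, size_b).

    diffs holds up to `limit` tuples of (offset, byte_a, byte_b) for
    positions where the two files disagree. Only the overlapping part of
    the files is compared, so check the sizes as well.
    """
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        data_a, data_b = file_a.read(), file_b.read()

    diffs = []
    for offset, (byte_a, byte_b) in enumerate(zip(data_a, data_b)):
        if byte_a != byte_b:
            diffs.append((offset, byte_a, byte_b))
            if len(diffs) >= limit:
                break
    return diffs, len(data_a), len(data_b)


# Example call, assuming these two files exist in the current folder:
# diffs, size_a, size_b = differing_offsets("book1.xlsx", "book2.xlsx")
```

An empty list together with equal sizes means the files are identical, so their hashes will match too.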

As xlsx files are zip archives, you can unpack them and compare the contents. Or open both in 7-Zip and check the CRC column. I expect that only the files in the docProps folders will show different CRC values. If you want to find exactly where the difference is located, just go on from there.
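The CRC comparison can also be scripted. A rough sketch using Python's standard zipfile module, which exposes the CRC-32 stored for each archive member; the helper names are hypothetical.

```python
import zipfile


def member_crcs(path):
    """Map each archive member name to the CRC-32 stored in the zip."""
    with zipfile.ZipFile(path) as archive:
        return {info.filename: info.CRC for info in archive.infolist()}


def changed_members(path_a, path_b):
    """List members whose CRC differs, or that exist in only one archive."""
    crcs_a = member_crcs(path_a)
    crcs_b = member_crcs(path_b)
    names = sorted(set(crcs_a) | set(crcs_b))
    return [name for name in names if crcs_a.get(name) != crcs_b.get(name)]


# Example call, assuming these two files exist in the current folder:
# print(changed_members("book1.xlsx", "book2.xlsx"))
```

This reads only the archive directory, so nothing has to be extracted to see which parts changed.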

 
Posted : 01/12/2018 6:40 am
(@randomaccess)
Posts: 385
Reputable Member
 

Athulin, the second way you suggested is probably going to yield more useful results. As you pointed out, the xlsx format is a zip, so I think the first one might show that they're different, but the data will still be compressed.

If you unzipped both documents and then hashed the individual components you'd probably see the difference quickly; my guess is that internal metadata stored in docProps is what's changed (which was also suggested by athulin).
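A quick sketch of that idea: hash the decompressed contents of each component, so the result does not depend on how the zip happened to compress them. The function name is made up, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import zipfile


def member_hashes(path):
    """Map each archive member name to the SHA-256 of its decompressed bytes."""
    with zipfile.ZipFile(path) as archive:
        return {
            name: hashlib.sha256(archive.read(name)).hexdigest()
            for name in archive.namelist()
        }


# Example call, assuming these two files exist in the current folder:
# old, new = member_hashes("book1.xlsx"), member_hashes("book2.xlsx")
# print([name for name in old if old[name] != new.get(name)])
```

Two archives can differ as whole files and still have identical component hashes, for instance if only the compression settings changed.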

 
Posted : 01/12/2018 11:30 am
(@athulin)
Posts: 1156
Noble Member
 

It may yield results, but in this particular case, I think the only useful result is if the OP begins to understand what's going on.

 
Posted : 01/12/2018 7:04 pm
(@rabidfox)
Posts: 4
New Member
Topic starter
 

Forgot you could extract them. The two files within the xlsx that had changed were core.xml and workbook.xml.
In core.xml the metadata is physically stored, so the modified time affects it, and in workbook.xml there is a unique document ID that changes.
Thanks for the help, guys.
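For anyone who wants to check the timestamps without unpacking by hand, here is a small sketch that reads them straight out of docProps/core.xml. It assumes the usual layout where the created and modified times are stored as dcterms elements; the function name is my own, and it does not look at the document ID in workbook.xml.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used for the created/modified elements in core.xml.
DCTERMS = "{http://purl.org/dc/terms/}"


def core_timestamps(path):
    """Return the created and modified values from docProps/core.xml.

    A value is None if the corresponding element is missing.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {tag: root.findtext(DCTERMS + tag) for tag in ("created", "modified")}


# Example call, assuming these two files exist in the current folder:
# print(core_timestamps("book1.xlsx"))
# print(core_timestamps("book2.xlsx"))
```

If the modified values differ between the original and the Save As copy, that alone is enough to change the hash of the whole file.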

 
Posted : 01/12/2018 8:32 pm
(@randomaccess)
Posts: 385
Reputable Member
 

It may yield results, but in this particular case, I think the only useful result is if the OP begins to understand what's going on.

Looks like that's happened.

Good work OP, digging into the weeds of the file format is always a good place to start when trying to understand how this whole crazy world fits together.

 
Posted : 02/12/2018 7:20 am