Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re^2: Find Created and Last Modified Date of Excel xls file

by gauss76 (Scribe)
on Jun 30, 2017 at 08:32 UTC ( [id://1193899]=note: print w/replies, xml ) Need Help??


in reply to Re: Find Created and Last Modified Date of Excel xls file
in thread Find Created and Last Modified Date of Excel xls file

Thanks to everyone that replied to my question. However, I don't think I have explained my problem in enough detail...Apologies.

The created and last modified dates I am after are not the dates that are given by the particular filesystem the file is on but are actually embedded in the file itself. So the suggested stat function will not retrieve the required information (I have tested this and shown that the dates reported by the stat function are not the ones I need).

I can find the required dates by opening the file in Excel and looking at its properties. These dates are NOT the same as the filesystem reports.

In general I will not have Excel installed and am therefore not able to use any vba code, for example, to retrieve the required information.

In fact I will be using a Linux based system to retrieve the info where Perl in installed.

I have already written a Perl module that can retrieve the correct information from the new Excel 2007+ file types as these are simply a set of zipped xml files.

However the older Excel files that I need to analyse are fundamentally binary files as explained in my initial post.

I hope my exact needs are clearer than before.

gauss76

  • Comment on Re^2: Find Created and Last Modified Date of Excel xls file

Replies are listed 'Best First'.
Re^3: Find Created and Last Modified Date of Excel xls file
by pryrt (Abbot) on Jun 30, 2017 at 22:11 UTC

    When I move (sftp) a spreadsheet created in Windows to a linux machine and look at the ctime using (stat)[10], it does not match the windows (stat)[10]. If I sftp the file back from the linux machine to the same location on the windows machine (even if I deleted the file in the mean time), it keeps the same (stat)[10] and the Excel Info CreateDate remains the same. If I sftp the file from the linux machine to a new location on the windows machine, the (stat)[10] changes, but the Excel Info CreateDate remains the same. So it appears that it is embedded in the .xls format somewhere, not just using filesystem information.

    I created two .xls spreadsheets with the same value in A1, a couple minutes apart, then did a hexdump and diff'd those: there were differences, but nothing jumped out and said "Create Date is encoded here", though looking at some of the strings, there are sections with ID="{...}", CMG="...", and DPB="..." that I think are likely candidates for encoded binary information. Probably someone more familiar with the .xls binary format (such as the author of Spreadsheet::ParseExcel) might be able to find it more easily. I tried things the hard way (see spoiler below): I did a key-by-key comparison of the ->parse($fn) objects, but nothing jumped out as being truly different (other than filename and format font numbering, which seem irrelevant).

    I was even able to change the Create Date in excel, using the VBA code ActiveWorkbook.BuiltinDocumentProperties("Creation Date") = Date, and re-saved. When I did that, the Excel Info CreateDate changed, and ID="{...}" section moved and changed, which is one of the reasons I think it might hold the Builtin Document Properties. (And "DPB" could be abbreviation for "Document Properties - BuiltIn". But that's just guessing.)

    Inspired by that VBA snippet, if I create a VBA function,

    function CreateDate() CreateDate = ActiveWorkbook.BuiltinDocumentProperties("Creation Da +te") end_function

    and set A1's formula to =text(CreateDate(), "yyyy-mmm-dd hh:mm:ss"), then save the .xls spreadsheet, I can use ParseExcel to grab the value of A1 and grab that CreateDate. So if you have control over the original spreadsheet creation, you can add the information you want into an accessible location; but I get the impression that you want to be able to do it for an arbitrary .xsl, not one that you control the creation of. :-)

    With no further ado, the ParseExcel compar program:

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1193899]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (3)
As of 2024-04-19 21:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found