in reply to Re^2: Spreadsheet::XLSX why is Val != _Value
in thread Spreadsheet::XLSX why is Val != _Value

if the cell contains the text "0.153", why would "value" and "unformatted" differ by 0.003?

Have you ever used Excel? The value in the cell and what is displayed in the cell do not have to match. If you have the contents 0.153 and use cell formatting to display in two-decimal fixed-point as 0.15, the difference between the underlying value and the displayed value will be 0.003 -- that's the way Excel works. If you were to run the same experiment, but set the cell format to "blah" 0.00 "blah", you would find that the real value was 0.153, but the displayed value was the string blah 0.15 blah. If you want the actual data from the cell, use the unformatted method; if you want to know how Excel last rendered that cell, use the value method; if you want to run the risk of the underlying implementation of Spreadsheet::XLSX changing on you, use the hash key Val to currently be equivalent to the unformatted method or the underlying Excel data, and use the hash key _Value to currently be equivalent to running the value method (with no guarantee that in future versions those same hash keys will exist, or that it will even be a hash-based object). Using the underlying hash of an object (if it's currently implemented as a hash) is a bad idea, unless the official documentation tells you to access the hash elements directly -- and even then, using the defined accessor methods is a better idea (unless the author doesn't provide accessors).

"The documented interface" - can tell me where this is? The metaCPAN page is minimal.
In the docs for Spreadsheet::XLSX, it says,
This module is a (quick and dirty) emulation of Spreadsheet::ParseExcel for Excel 2007 (.xlsx) file format. It supports styles and many of Excel's quirks, but not all. It populates the classes from Spreadsheet::ParseExcel for interoperability; including Workbook, Worksheet, and Cell.
... which strongly implies that it has the same interface as Spreadsheet::ParseExcel. That module documents that the value() method and the unformatted() method work as described in this thread.

Replies are listed 'Best First'.
Re^4: Spreadsheet::XLSX why is Val != _Value
by Tux (Canon) on Jul 01, 2021 at 11:20 UTC
      TYVM good idear!
Re^4: Spreadsheet::XLSX why is Val != _Value
by misterperl (Monk) on Jul 01, 2021 at 12:37 UTC
    TY- Ill ask the Microsoft team to look into the cell formatting..

    "Have you ever used Excel?" as a SOAP guy - I admit as little as posible!

      Ill ask the Microsoft team to look into the cell formatting..

      If that's the conclusion you drew after reading my post, I didn't explain myself in a way you can understand.

      I am sure that the "Microsoft team" has formatted the cells to look like what the end user wants to see the data, to hide precision that the end user doesn't need to see, or what have you. Formatting cells happens in every practical spreadsheet.

      The problem is that you need to figure out whether your Perl program needs to use the underlying data or the displayed value, then use the accessor method that is documented to get the appropriate one:

      • If you are trying to access the real, underlying data value that is actually stored in the spreadsheet cell, then you need to use the module's method for accessing the unformatted value -- in this case, the cell method named unformatted() (example: $cell->unformatted()).
      • If you are trying to grab what an Excel user would see in the cell (ie, that may be displayed as fixed precision, or might be displayed with extra text wrapped around it, or with currency symbols, or with commas for thousands separators, or displaying decimals as percentages, or what have you), then use the cell method named value() (example: $cell->value()).
      • It shouldn't bother you that the two read back different things: they are meant to read back different things, and you are supposed to pick the method that grabs what you need for your project.
      • Never use the $cell->{_Value} or $cell->{Val} hash entries. The modules don't advertise those as the right way to grab the data, and it is not guaranteed to work the same in future versions of the module. The methods mentioned above are the only correct ways to access the actual data or the displayed data from the spreadsheet.

      If you are going to be doing any calculations with the data in Perl, then you probably need to use the unformatted() method. Only if you want to make a report of some sort that uses the same formatting as the Excel spreadsheet should you use the formatted display that you get from the value() method.

      If you still don't understand or don't know which you really need, then you probably need to have a talk with whoever defined the requirements for your Perl project, and clarify the needs of whether you should be working with the underlying data or whether you should be working with what the Excel spreadsheet displays. If you aren't understanding the difference between the cell's underlying data and value that Excel displays, ask the "Microsoft team" to show you where the 0.153 and 0.15 are inside Excel, and ask meaningful questions of them if you aren't understanding the difference, or why there needs to be a difference.