Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Bulletpoints in spreadsheet being converted to double quotes

by viffer (Beadle)
on Apr 02, 2014 at 01:52 UTC ( #1080669=perlquestion: print w/replies, xml ) Need Help??

viffer has asked for the wisdom of the Perl Monks concerning the following question:

G'day all. I've run out of fingers (and patience) using google..
you're my last hope.
Probably should have been my first.. I'm reading a spreadsheet that contains bullet points in the cell.
For some reason the bullet points are being converted to speech quotation marks. Is there any way I can maintain the integrity of the data so that when I print the cell contents the bullet points are still there?
I push the spreadsheet cells into an array @arr, but when I look at the contents of @arr the bullet points are now speech marks.

Any help would be greatly appreciated.

use Spreadsheet::ParseExcel; my $spreadsheet_csv; open ( $spreadsheet_csv, '<', $LOOKUP_FILE ); my $fname = $LOOKUP_FILE; my $excel = Spreadsheet::ParseExcel::Workbook->Parse($fname); my @arr = (); my $sheet = $excel->{Worksheet}[0]; foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) { foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) { my $cell = $sheet->{Cells}[$row][$col]; if (exists $cell->{Val}) { push @arr,$cell->{Val}; } else { $cell->{Val} = ''; push @arr,$cell->{Val}; } } }

Replies are listed 'Best First'.
Re: Bulletpoints in spreadsheet being converted to double quotes
by kcott (Bishop) on Apr 02, 2014 at 02:58 UTC

    G'day viffer,

    Some additional information would be helpful.

    What is the format of your spreadsheet:

    • .xls: Use of Spreadsheet::ParseExcel suggests this; however, your code doesn't look like the documented code for that module. For example, you have $cell->{Val} but the documented equivalent would appear to be $cell->value().
    • .xlsx: If so, Spreadsheet::XLSX would be a better choice of module. Curiously, your code does look like the documented code for this module, e.g. the documentation shows $cell->{Val} which mirrors your posted code.
    • .csv: Seems unlikely but $spreadsheet_csv in your code looks dubious.

    It's somewhat unclear exactly what conversion is taking place. Is your '• content' being converted to '" content', '"content"', or something else. A link to a minimal sample of your input file, as well as showing us (a representative part of) the contents of @arr, would be useful.

    The documentation for Spreadsheet::ParseExcel would suggest that $cell->unformatted() might be appropriate; however, the source code for that method shows:

    sub unformatted { my $self = shift; return $self->{Val}; }

    It's entirely possible that by ignoring the published interface, and circumventing the OO encapsulation, that you've caused some problem(s) which may be hard to track down.

    Anyway, once you've supplied the information requested above, we may have a better idea.

    -- Ken

Re: Bulletpoints in spreadsheet being converted to double quotes
by graff (Chancellor) on Apr 03, 2014 at 02:15 UTC
    I'm guessing that your xls spreadsheet file contains some Unicode characters to convey the bulletpoints. You might want to check out this script I posted here a while back: xls2tsv.

    It turns out that Unicode characters in Excel spreadsheets are encoded using UTF-16BE (who knows why…), so once you decode them into utf8, it should look as expected.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1080669]
Approved by boftx
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (2)
As of 2022-05-22 14:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Do you prefer to work remotely?



    Results (80 votes). Check out past polls.

    Notices?