Efficient solution needed for this program

by valavanp (Curate)
on Apr 03, 2007 at 13:27 UTC (#608056)
valavanp has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have an Excel CSV file with 10,000 records. Each record looks like this:
&#x2013 &dash
Output CSV file should be:
'&#2013' => '&dash;'
I converted the XLS file to XML. Then I used the code below to do the task:
use strict;
use warnings;
use XML::Twig;

#open(FH, "conv.xml");
my $file="C:/conv.xml";
my $tmp;
$tmp = XML::Twig->new();
$tmp->parsefile($file);
my $root = $tmp->root;
my $hex_value;
my $entity_value;
my $real_value;
#while (<FH>){
open(FH, ">records.txt");
foreach my $record ($root->children('record')){
    if ($record->first_child_text('hex_value') eq '') {
        $hex_value='';
    }else{
        $hex_value = $record->first_child_text('hex_value');
        $hex_value=~s/^\s*//g;
    }
    if ($record->first_child_text('entity_value') eq '') {
        $entity_value='';
    }else{
        $entity_value = $record->first_child_text('entity_value');
        $entity_value=~s/\s*$//g;
    }
    if ($record->first_child_text('real_value') eq '') {
        $real_value=$entity_value.";";
    }
    #print "$real_value\n";
    #print FH "Hexadecimal Value \t Entity Value \t Real_value";
    print FH "\'$real_value\' => \'$hex_value\',\n";
}
I think the above code is not efficient. Can anyone suggest a more efficient way to do this? Thanks to all monks for your suggestions.

Re: Efficient solution needed for this program
by Corion (Pope) on Apr 03, 2007 at 13:33 UTC

    Why did you export the table to XML? Exporting it to a tab-separated file is much more convenient! Then you could use the following one-liner:

    perl -F"\t" -lape "$_=join qq(\t),qq('$F[0]'),'=>',qq('$F[1]')"

    or more verbosely:

    #!/usr/bin/perl -w
    use strict;

    # Read tab-separated records from the files named on the command line (or STDIN)
    while (<>) {
        chomp;
        my @columns = split /\t/;
        printf "'%s'\t=>\t'%s'\n", @columns;
    }
Re: Efficient solution needed for this program
by Limbic~Region (Chancellor) on Apr 03, 2007 at 13:36 UTC
    valavanp,
    What do you mean it isn't efficient? Perhaps you should look up the definition of the word. I will go out on a limb and guess that you want it to run faster. In any case, converting the CSV was not necessary.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    my $file = $ARGV[0] or die "Usage: $0 <input_file>";
    open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";

    my $csv = Text::CSV_XS->new;
    while (<$fh>) {
        chomp;
        # Parse one CSV line and print its fields in the requested format
        if ($csv->parse($_)) {
            printf "'%s'\t=>\t'%s'\n", $csv->fields;
        }
        else {
            print STDERR "parse() failed for '$_': ", $csv->error_input;
            exit;
        }
    }

    Cheers - L~R

    Shamelessly stole Corion's printf() since it was better.
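
The replies in this thread print the two columns as they appear in the file, while the original XML::Twig script also trims whitespace and appends a ";" to the entity name. A minimal sketch of a helper that combines both, using only core Perl, might look like the following. The name format_record is hypothetical, and the sketch assumes tab-separated input with the hex value in column 1 and the entity name in column 2. Unlike the original script, it trims both ends of both fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn one tab-separated record
# "hex<TAB>entity" into a "'hex' => 'entity;'," line.
sub format_record {
    my ($line) = @_;
    chomp $line;

    # Assumption: column 1 is the hex value, column 2 the entity name.
    my ($hex_value, $entity_value) = split /\t/, $line;
    $hex_value    = '' unless defined $hex_value;
    $entity_value = '' unless defined $entity_value;

    # Trim leading and trailing whitespace on both fields.
    s/^\s+|\s+$//g for $hex_value, $entity_value;

    # Append the ";" terminator to the entity, as the original script does.
    return sprintf "'%s' => '%s;',\n", $hex_value, $entity_value;
}

# Sample record taken from the question, with a tab between the columns.
print format_record("&#x2013\t&dash\n");
# prints: '&#x2013' => '&dash;',
```

To process a whole file, the last line could be replaced with a loop such as `print format_record($_) while <>;`, which reads the records line by line without loading the file into memory.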
Re: Efficient solution needed for this program
by friedo (Prior) on Apr 03, 2007 at 16:21 UTC

    It's been my experience that the words "XML" and "efficient" rarely appear together, unless someone happens to be saying "XML is not terribly efficient."

    In fact, I would venture to say that the only legitimate purpose of XML parsers is for parsing data given to you by someone obnoxious enough to think XML is a good idea at all.
