PerlMonks  

Re^3: parsing CSV

by GrandFather (Sage)
on Oct 07, 2016 at 03:34 UTC (#1173449)


in reply to Re^2: parsing CSV
in thread parsing CSV

The code is pretty much the same, except that the second page's data gets newlines inserted in front of the id codes, and we do a little cleanup to remove whitespace from both ends of each field:

use strict;
use warnings;
use Text::CSV;

my $page1 = <<PG1CSV;
512.45,c100
6734, c200
5653.2, c300
PG1CSV

my $csv = Text::CSV->new();
my %idData;

open my $pg1In, '<', \$page1;

while (my $row = $csv->getline($pg1In)) {
    s/^\s+|\s+$//g for @$row;
    $idData{$row->[1]}{size} = $row->[0];
    $idData{$row->[1]}{name} = '-- missing --';
}

close $pg1In;

my $page2 = <<PG2CSV;
c100, Joe Shmo c200, Jack Black c300, Cinderella c400, Barack Obama c500, Cruella Deville
PG2CSV

$page2 =~ s/\b(?=\w+,)/\n/g;    # Insert newlines in front of id codes
open my $pg2In, '<', \$page2;

while (my $row = $csv->getline($pg2In)) {
    next if !$row->[0];    # Skip blank lines
    s/^\s+|\s+$//g for @$row;
    $idData{$row->[0]}{name} = $row->[1];
    $idData{$row->[0]}{size} //= '-- missing --';
}

close $pg2In;

for my $id (sort keys %idData) {
    print "$id: $idData{$id}{name} size $idData{$id}{size}\n";
}

Prints:

c100: Joe Shmo size 512.45
c200: Jack Black size 6734
c300: Cinderella size 5653.2
c400: Barack Obama size -- missing --
c500: Cruella Deville size -- missing --
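
To see what the newline-inserting substitution does on its own, here is a minimal sketch using only core Perl. The helper name `split_records` is made up for illustration, and the sketch assumes that every record starts with a word immediately followed by a comma and that names never contain a comma themselves.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): break a run-together
# string of "id, name" records into one trimmed record per list element.
sub split_records {
    my ($text) = @_;

    # Same substitution as in the script: insert a newline at each word
    # boundary that is followed by word characters and then a comma.
    $text =~ s/\b(?=\w+,)/\n/g;

    my @records;
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;                 # Trim both ends
        push @records, $line if length $line;    # Drop the empty first line
    }
    return @records;
}

print "$_\n" for split_records('c100, Joe Shmo c200, Jack Black c300, Cinderella');
```

A name such as "Shmo, Joe" would be split in the wrong place by this pattern, so it only suits data shaped like the sample.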
Premature optimization is the root of all job security

Re^4: parsing CSV
by younggrasshopper13 (Novice) on Oct 07, 2016 at 05:01 UTC
    Wow. This is amazing. Thank you. Now, what would be the best way to curl this into an email that sends this data? Would I need to curl a variable of some kind?

      I'd use MIME::Lite (despite the "Wait!" warning paragraph) especially if all you want is a text only email without attachments.
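
A minimal sketch of that suggestion, assuming MIME::Lite is installed from CPAN. The helper name `build_report_mail`, the addresses, and the subject are placeholders invented for illustration; the report text would come from the script's loop, collected in a string rather than printed.

```perl
use strict;
use warnings;
use MIME::Lite;

# Hypothetical helper: wrap a block of text in a plain text message.
# The addresses passed in below are placeholders, not real ones.
sub build_report_mail {
    my ($from, $to, $subject, $body) = @_;

    return MIME::Lite->new(
        From    => $from,
        To      => $to,
        Subject => $subject,
        Type    => 'TEXT',
        Data    => $body,
    );
}

# Collect the report in a string instead of printing it line by line.
my $report = "c100: Joe Shmo size 512.45\n";

my $msg = build_report_mail(
    'reports@example.com', 'someone@example.com',
    'Storage report', $report,
);

print $msg->as_string;    # Inspect the message without sending it
# $msg->send;             # Uncomment to hand it to the default mailer
```

Printing `as_string` first is a cheap way to check the headers and body before anything actually leaves the machine.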

      Premature optimization is the root of all job security
        Okay, so I'd just add this to the bottom of the Perl script and it would send the standard output of the script as the contents of the email?
Re^4: parsing CSV
by younggrasshopper13 (Novice) on Oct 08, 2016 at 02:15 UTC
    Hey there, thanks a lot for the code. I've been playing around with it trying to get it to work. I added the CSV web pages and dates, and changed the print at the bottom to build an output string. But no luck at all: I keep getting "Can't find string terminator "http" anywhere before EOF" and similar errors.
    use strict;
    use warnings;
    use Text::CSV;

    START_DATE=$(date '+%Y-%m-%d' -d "-1 month");
    END_DATE=$(date '+%Y-%m-%d');

    my $page1 = <<http://url/website.com/thing?end_date=$END_DATE&start_date=$START_DATE&type=csv;
    512.45,c100
    6734, c200
    5653.2, c300
    PG1CSV

    my $csv = Text::CSV->new();
    my %idData;

    open my $pg1In, '<', \$page1;

    while (my $row = $csv->getline($pg1In)) {
        s/^\s+|\s+$//g for @$row;
        $idData{$row->[1]}{size} = $row->[0];
        $idData{$row->[1]}{name} = '-- missing --';
    }

    close $pg1In;

    my $page2 = <<https:/url/website.com/thing?end_date=$END_DATE&start_date=$START_DATE&type=csv;
    c100, Joe Shmo c200, Jack Black c300, Cinderella c400, Barack Obama c500, Cruella Deville
    PG2CSV

    $page2 =~ s/\b(?=\w+,)/\n/g;    # Insert newlines in front of id codes
    open my $pg2In, '<', \$page2;

    while (my $row = $csv->getline($pg2In)) {
        next if !$row->[0];    # Skip blank lines
        s/^\s+|\s+$//g for @$row;
        $idData{$row->[0]}{name} = $row->[1];
        $idData{$row->[0]}{size} //= '-- missing --';
    }

    close $pg2In;

    for my $id (sort keys %idData) {
        $output .= "$id: $idData{$id}{name} size $idData{$id}{size}\n";
    }

    curl -s -G "$output" | mail -s "send the thing for $END_DATE" name@name.com
      my $page1 = <<http://url/website.com/thing?end_date=$END_DATE&start_date=$START_DATE&type=csv;
      512.45,c100
      6734, c200
      5653.2, c300
      PG1CSV

      What you are (incorrectly) attempting is a here document. The proper form (I'm making some assumptions about just exactly what you want) is:

      my $page1 = <<PG1CSV;
      http://url/website.com/thing?end_date=$END_DATE&start_date=$START_DATE&type=csv;
      512.45,c100
      6734, c200
      5653.2, c300
      PG1CSV
      See also the discussion of here-docs in Quote and Quote-like Operators and Quote-Like Operators, both in perlop.

      Update: Do you want the semicolon at the end of the
          http://url/website.com/thing?end_date...=csv;
      string | sub-string?

      Update 2: The here-doc with the label  PG2CSV is also incorrect, and in the same way.
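
For reference, a small self-contained sketch of how a here-doc behaves. The two date values are placeholders chosen for the demonstration, and the `URL` label is arbitrary. Note that a here-doc only builds a string: a URL placed in it is ordinary text, and nothing is fetched.

```perl
use strict;
use warnings;

my $END_DATE   = '2016-10-07';    # Placeholder values for the demonstration
my $START_DATE = '2016-09-07';

# "<<LABEL" starts the here-doc. Every following line, up to the line that
# consists of exactly LABEL, becomes the string. A bare or double-quoted
# label interpolates variables; a single-quoted label does not.
my $url = <<"URL";
http://url/website.com/thing?end_date=$END_DATE&start_date=$START_DATE&type=csv
URL

chomp $url;    # The here-doc keeps the trailing newline

print "$url\n";
```

The "Can't find string terminator" error appears when Perl reaches the end of the file without finding a line that matches the label exactly.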


      Give a man a fish:  <%-{-{-{-<

        This is still not working, and I'm not sure what the problem is. Essentially, each webpage shows me a page of data in CSV format: just a single blank white page with data. Currently in our office we run a curl command on each webpage to send that data in an email. One webpage has the customer id and storage size; the other has the same customer ids, but with the customer names associated with them.
        What I'm trying to do is take the data from both web pages, match each customer id and storage size to the corresponding customer id and client name on the other page, put the result into a new CSV format, and email it in the body of a message to whoever needs it.
        In short, I need to take two web pages, match and concatenate the data, and email it.
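
One way the fetch-and-merge part of that description might look in Perl alone, offered as a sketch rather than a finished solution. It assumes the first page holds "size,id" lines and the second holds "id,name" lines, one record per line (if the second page really arrives as one long line, the newline-inserting substitution from earlier in the thread would still be needed). The sub names `date_range`, `fetch_csv`, and `merge_pages` are invented here; HTTP::Tiny and POSIX ship with recent Perls, and Text::CSV comes from CPAN.

```perl
use strict;
use warnings;
use POSIX qw(mktime strftime);
use HTTP::Tiny;
use Text::CSV;

# Perl counterpart of the two shell "date" calls: one month ago and today.
sub date_range {
    my ($now) = @_;
    my ($day, $mon, $year) = (localtime $now)[3, 4, 5];
    my $then = mktime(0, 0, 12, $day, $mon - 1, $year);    # mktime normalizes month -1
    return (
        strftime('%Y-%m-%d', localtime($then)),
        strftime('%Y-%m-%d', localtime($now)),
    );
}

# Fetch one page and return its body, or die with the HTTP status.
sub fetch_csv {
    my ($url) = @_;
    my $res = HTTP::Tiny->new->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n" unless $res->{success};
    return $res->{content};
}

# Merge "size,id" text with "id,name" text into "id,name,size" CSV lines.
sub merge_pages {
    my ($sizes_csv, $names_csv) = @_;
    my $csv = Text::CSV->new({ binary => 1, quote_space => 0 });
    my %idData;

    for my $line (grep { /\S/ } split /\r?\n/, $sizes_csv) {
        $csv->parse($line) or die "Cannot parse: $line\n";
        my @fields = $csv->fields;
        s/^\s+|\s+$//g for @fields;
        $idData{$fields[1]}{size} = $fields[0];
    }

    for my $line (grep { /\S/ } split /\r?\n/, $names_csv) {
        $csv->parse($line) or die "Cannot parse: $line\n";
        my @fields = $csv->fields;
        s/^\s+|\s+$//g for @fields;
        $idData{$fields[0]}{name} = $fields[1];
    }

    my $out = '';
    for my $id (sort keys %idData) {
        my $name = $idData{$id}{name} // '-- missing --';
        my $size = $idData{$id}{size} // '-- missing --';
        $csv->combine($id, $name, $size) or die "Cannot combine row for $id\n";
        $out .= $csv->string . "\n";
    }
    return $out;
}

# Demonstration with the sample data from the thread instead of live pages.
print merge_pages(
    "512.45,c100\n6734, c200\n5653.2, c300\n",
    "c100, Joe Shmo\nc200, Jack Black\nc400, Barack Obama\n",
);

# With the real pages it might be called like this (untested against them):
# my ($start, $end) = date_range(time);
# my $query = "end_date=$end&start_date=$start&type=csv";
# my $body  = merge_pages(
#     fetch_csv("http://url/website.com/thing?$query"),
#     fetch_csv("https://url/website.com/thing?$query"),
# );
```

The string returned by `merge_pages` can then be used as the body of the email, for instance with MIME::Lite as suggested above, so neither curl nor shell variables are needed inside the script.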
