Re^7: raw data formatting

by Kenosis (Priest)
on Nov 16, 2012 at 14:15 UTC

in reply to Re^6: raw data formatting
in thread raw data formatting

Using grep to keep only the lines that begin with four spaces followed by a digit works:

use strict;
use warnings;
use List::MoreUtils qw/natatime/;

my $it = natatime 5, grep /\A\s{4}\d/, <DATA>;

while ( chomp( my @lines = $it->() ) ) {
    my $letter  = 'A';
    my $acctNum = do { $lines[0] =~ /\s+(\d+)\s+(\d+)/; $1 . $2 };
    push @lines, " acctnum=$acctNum";
    print for map { s/\s+/$letter++ . ' '/e; "$_\n" } @lines;
}

__DATA__
Place your data here...
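For anyone who wants to see the filter-then-batch idea on its own, here is a minimal sketch using only core Perl. It swaps natatime for a plain splice loop, so it is not the code above, just the same idea. The helper name filtered_groups and the sample lines are made up for illustration. Note that \s matches any whitespace character, so the pattern would also accept tabs, not only spaces.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): keep only lines that
# start with four whitespace characters followed by a digit, then split the
# survivors into groups of at most $size lines.
sub filtered_groups {
    my ( $size, @raw ) = @_;
    my @keep = grep { /\A\s{4}\d/ } @raw;
    my @groups;
    # splice removes up to $size elements from the front on each pass
    push @groups, [ splice @keep, 0, $size ] while @keep;
    return @groups;
}

# Made-up sample input: a header line and a blank line are mixed in.
my @sample = (
    "HEADER LINE\n",
    "    1 first\n",
    "\n",
    "    2 second\n",
    "    3 third\n",
);

for my $group ( filtered_groups( 2, @sample ) ) {
    chomp( my @lines = @$group );
    print join( ' | ', @lines ), "\n";
}
```

With this sample the header and the blank line are dropped before grouping, so the last group simply comes out short instead of being padded.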

Replies are listed 'Best First'.
Re^8: raw data formatting
by Anonymous Monk on Nov 16, 2012 at 14:18 UTC
    Hey, thanks for posting the real data again, good luck notifying the owners of this security breach

      I don't know that you're correct about "posting the real data again". The re-posted data contained text like "ADAMS ,xxxxxxx," so it clearly appeared that all sensitive info had been redacted--after I requested that it be scrubbed.

      Have removed all data, just in case...

        There is no S/S, nor any CCN's in the original file and all the data was scrubbed. Yes I double-checked all the postings. It is scrubbed!
Re^8: raw data formatting
by teamassociated (Sexton) on Nov 16, 2012 at 15:07 UTC
    The solution was twofold (Kenosis provided it, thank you!):
    1. I needed to take out the /r because my Perl is 5.8.8 (the /r modifier needs Perl 5.14 or later), so
    print for map { s/\s+/$letter++ . ' '/er} @lines;
    became
    print for map { s/\s+/$letter++ . ' '/e; "$_\n" } @lines;

    2. I replaced
    my $it = natatime 5, <DATA>;
    with
    my $it = natatime 5, grep /\A\s{4}\d/, <DATA>;
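On the first point, if the original strings need to stay untouched on a pre-5.14 Perl, one common idiom is to copy the value and substitute on the copy. This is only a sketch under that assumption. The helper name label_lines and the sample strings are hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return a labelled copy of each string without using
# the /r modifier, so it should also run on Perl 5.8.x.
sub label_lines {
    my @lines  = @_;
    my $letter = 'A';
    return map {
        # copy first, then substitute on the copy; $_ itself is not changed
        ( my $copy = $_ ) =~ s/\s+/$letter++ . ' '/e;
        $copy;
    } @lines;
}

my @input  = ( "    10 20", "    30 40" );
my @output = label_lines(@input);
print "$_\n" for @output;    # prints "A 10 20" then "B 30 40"
```

The version posted earlier in the thread modifies @lines in place, which is fine there because the lines are not reused afterwards. The copy idiom only matters when the unmodified text is still needed.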

