PerlMonks  

Split lines in file to columns

by Magnolia25 (Acolyte)
on Mar 20, 2019 at 09:35 UTC (#1231482, perlquestion)
Magnolia25 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

Below are the lines in my file. I am reading them into an array and storing the column values in variables.

    ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
    ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
    ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
    ValuesInColumn1    DataColumnB    XYZ    RowDescription at RowCode
    ValuesInColumn1    DataColumnB    ABC    RowDescription at RowCode

The following takes the data from my file and splits it. That works if the columns are divided on spaces. But the value for colD is not captured correctly, as the string in colD has spaces in between. If the value of $colB is XYZ and $colD contains the substring Region, I need to replace $colB, changing XYZ to N/A.

    foreach my $line (@a) {
        my ($colA, $colB, $colC,$colD) = split( /\s+/, $line);
        #print "$colD \n";
    }

Please help.

Replies are listed 'Best First'.
Re: Split lines in file to columns
by hippo (Canon) on Mar 20, 2019 at 09:43 UTC
    But the value for colD is not captured correctly, as the string in colD has spaces in between.

        my ($colA, $colB, $colC,$colD) = split( /\s+/, $line, 4);

    If the value of $colB is XYZ and $colD contains the substring Region, I need to replace $colB, changing XYZ to N/A.

    There are many ways to do this. What did you try? How did it fail?
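To illustrate why the third argument helps: a limit of 4 tells split to stop after producing four fields, so everything after the third run of whitespace stays in the last field, inner spaces included. The following is only a minimal sketch; the helper name `split_columns` and the sample string are made up for illustration, and it covers only the splitting step, not the replacement.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into at most four fields.
# The limit of 4 keeps the remainder of the line, spaces and all,
# together in the fourth field.
sub split_columns {
    my ($line) = @_;
    chomp $line;
    return split /\s+/, $line, 4;
}

my $sample = "ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region\n";
my ($colA, $colB, $colC, $colD) = split_columns($sample);
print "colC=[$colC] colD=[$colD]\n";
```

Note that in the sample data XYZ lands in the third variable ($colC), with the description in $colD.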

      Thanks hippo, that worked.

      If the value of $colB is XYZ and $colD contains the substring Region, I need to replace $colB, changing XYZ to N/A.

      I am going to try this part out now; it is where I was stuck. I mentioned it to explain why I needed the solution to the earlier part.

Re: Split lines in file to columns
by hdb (Monsignor) on Mar 20, 2019 at 10:01 UTC

    If all your columns are fixed-width, you can use unpack

        use strict;
        use warnings;

        while(<DATA>) {
            my ($colA, $colB, $colC,$colD) = unpack "A19A15A7A45";
            $colC = "N/A" if $colC eq "XYZ" and $colD =~ /Region/;
            print "$colA -- $colB -- $colC -- $colD\n";
        }

        __DATA__
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription at RowCode
        ValuesInColumn1    DataColumnB    ABC    RowDescription at RowCode
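The fixed-width idea can also be wrapped in a small function, which makes it easier to try on individual records. This is a sketch under assumptions: the name `unpack_columns` is hypothetical, the widths 19, 15 and 7 are the ones from the template above, and `A*` (take the rest of the line) is swapped in for `A45` so that the last field is not cut at 45 characters. The sample records are built with sprintf so the padding is explicit.

```perl
use strict;
use warnings;

# Hypothetical helper: cut one fixed-width record into four fields.
# The widths 19, 15 and 7 are taken from the template in the reply;
# 'A*' for the last field is an assumption made for this sketch.
# The A format strips trailing whitespace from each field.
sub unpack_columns {
    my ($line) = @_;
    chomp $line;
    return unpack 'A19 A15 A7 A*', $line;
}

# Build sample records with sprintf so the column padding is explicit.
my @records = map { sprintf '%-19s%-15s%-7s%s', @$_ } (
    [ 'ValuesInColumn1', 'DataColumnB', 'XYZ', 'RowDescription|RowCode|Supplier ID::Region' ],
    [ 'ValuesInColumn1', 'DataColumnB', 'ABC', 'RowDescription at RowCode' ],
);

for my $record (@records) {
    my ($colA, $colB, $colC, $colD) = unpack_columns($record);
    $colC = 'N/A' if $colC eq 'XYZ' and $colD =~ /Region/;
    print join(' -- ', $colA, $colB, $colC, $colD), "\n";
}
```

This only works while every value fits its column; a value wider than its slot would spill into the next field.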
Re: Split lines in file to columns
by tybalt89 (Vicar) on Mar 20, 2019 at 16:59 UTC
        #!/usr/bin/perl

        # https://perlmonks.org/?node_id=1231482

        use strict;
        use warnings;

        while(<DATA>)
          {
          chomp;
          my ($colA, $colB, $colC, $colD) = split /\s{2,}/;
          $colD =~ /Region/ and $colC =~ s{^XYZ\z}{N/A};
          use Data::Dump 'dd'; dd $colA, $colB, $colC, $colD;
          }

        __DATA__
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription at RowCode
        ValuesInColumn1    DataColumnB    ABC    RowDescription at RowCode
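Splitting on runs of two or more whitespace characters relies on the columns being separated by at least two spaces while the description itself contains only single spaces. A minimal sketch of the same technique without the Data::Dump dependency follows; the helper name `split_wide` and the guard on the field count are additions for illustration, not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: split on runs of 2+ whitespace characters, so
# single spaces inside a field (e.g. "Supplier ID") are left alone.
sub split_wide {
    my ($line) = @_;
    chomp $line;
    my @cols = split /\s{2,}/, $line;
    # Replace the third column only when it is exactly XYZ and the
    # fourth column mentions Region.
    $cols[2] =~ s{^XYZ\z}{N/A} if @cols == 4 and $cols[3] =~ /Region/;
    return @cols;
}

my @samples = (
    "ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region\n",
    "ValuesInColumn1    DataColumnB    XYZ    RowDescription at RowCode\n",
);
print join(' | ', split_wide($_)), "\n" for @samples;
```

The approach breaks down if a description ever contains two consecutive spaces, or if two columns are separated by a single space.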
Re: Split lines in file to columns
by pgmer6809 (Novice) on Mar 24, 2019 at 22:09 UTC

    Here is some code that makes use of a couple of very nice perl features.

    1) You can put your data after the __END__ statement and have the while loop read it in, rather than putting it in an array in the source. That makes it easier to set up test cases.

    2) It uses the regex power of Perl to split the line. This is much more flexible than split, and is the usual way that Perl programmers parse stuff.

    I have added a couple of lines to your original input to show that if the description contains, say, HQ rather than Region, the value is not replaced. Ditto if it contains Region but the original value is not XYZ. Not sure if that is exactly what you meant, but the concept should be useful.

        #!/usr/bin/perl -w
        while (<DATA>) { #read a line into $_
            chomp;
            $_ =~ m/^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/;
            #         col1   col_b  xyz    Descr
            my ($col1, $data_b, $xyz, $description ) = ($1, $2, $3, $4);
            if ( ( $description =~ m/Region/ ) && ( $xyz eq "XYZ" ) ) { $xyz = "N/A"; }
            print "Line $. = $_\n";
            print "\tXYZ result = $xyz \n";
        } #end while DATA
        exit 1;
        __END__
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
        ValuesInColumn1    DataColumnB    XYZ    RowDescription at RowCode
        ValuesInColumn1    DataColumnB    ABC    RowDescription at RowCode
        ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::HQ
        ValuesInColumn1    DataColumnB    BCD    RowDescription|RowCode|Supplier ID::Region

    The result of running the above is:

        Line 1 = ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
                XYZ result = N/A
        Line 2 = ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
                XYZ result = N/A
        Line 3 = ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region
                XYZ result = N/A
        Line 4 = ValuesInColumn1    DataColumnB    XYZ    RowDescription at RowCode
                XYZ result = XYZ
        Line 5 = ValuesInColumn1    DataColumnB    ABC    RowDescription at RowCode
                XYZ result = ABC
        Line 6 = ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::HQ
                XYZ result = XYZ
        Line 7 = ValuesInColumn1    DataColumnB    BCD    RowDescription|RowCode|Supplier ID::Region
                XYZ result = BCD
      You can put your data after the __END__ statement

      Normally, one would use the __DATA__ token for this purpose - see Special Literals in perldata.

      Update: Another nitpick: The special variables $1 etc. should only be used if the match succeeds. And the two lines could be shortened to:

          my ($col1, $data_b, $xyz, $description) = /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/
              or die "Failed to parse: $_";
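The point about checking the match can be sketched as a small function. In list context a successful match returns the captured groups and a failed match returns the empty list, so the assignment itself can be tested. The name `parse_line` and the choice to return an empty list on failure (instead of dying) are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line with a checked regex match.
# Returns the four fields, or an empty list if the line does not match.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    # The list assignment yields the number of captures in scalar
    # context: 4 on success, 0 on failure, so "or return" fires only
    # when the match fails.
    my ($col1, $data_b, $xyz, $description) =
        $line =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(.*)$/
        or return;
    $xyz = 'N/A' if $xyz eq 'XYZ' and $description =~ /Region/;
    return ($col1, $data_b, $xyz, $description);
}

my @inputs = (
    "ValuesInColumn1    DataColumnB    XYZ    RowDescription|RowCode|Supplier ID::Region",
    "too short",
);
for my $input (@inputs) {
    if (my @fields = parse_line($input)) {
        print join(' | ', @fields), "\n";
    }
    else {
        warn "Failed to parse: $input\n";
    }
}
```

Returning an empty list lets the caller decide whether a malformed line is fatal; the `or die` form shown above is the stricter alternative.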
