PerlMonks  

Concatenating arrays fetched from different text files

by thanos1983 (Vicar)
on May 28, 2014 at 23:21 UTC ( #1087738=perlquestion )
thanos1983 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am not sure the title fully describes what I am trying to implement, so I will write a short explanation.

I have created a script that reads files given as arguments and pushes the results into an array. A sample of the code is provided a bit further on.

My goal is to provide several file names as arguments (e.g. more than 2), store each file in a separate array, and then concatenate the arrays afterwards.

Sketch of desired output stored in text file:

array-1[0] array-2[0] ... array-n[0]
array-1[1] array-2[1] ... array-n[1]
    .          .      ...     .
    .          .      ...     .
array-1[n] array-2[n] ... array-n[n]

With the current version of my code, I am able to read the file given as an argument, parse it into an array, choose the elements of the file that I am interested in, and then write the output to a file.

For one file as input it works just fine. The script is also able to read several inputs and process each text file accordingly.

The problem is that I am trying to concatenate two arrays into one simultaneously. What I mean is: each file has n rows which I need to process into separate arrays, so that I can afterwards concatenate them into the desired result.

I was thinking if I had different arrays I could apply:

array-1 . ' ' . array-2 . ' ' . array-n

Or something close to that solution. I have not implemented this before, so I need to figure out how to do it for each array element.
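For what it's worth, the element-wise join sketched above could look like this in practice (a minimal sketch; @array1 and @array2 are placeholder names for the per-file arrays, assumed to be of equal length):

```perl
use strict;
use warnings;

# Placeholder arrays standing in for the results parsed from two files.
my @array1 = qw(a0 a1 a2);
my @array2 = qw(b0 b1 b2);

# Join the arrays element by element, separated by a single space.
my @joined;
for my $i (0 .. $#array1) {
    push @joined, "$array1[$i] $array2[$i]";
}

print "$_\n" for @joined;   # a0 b0, a1 b1, a2 b2
```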

Well, I was thinking about creating several subroutines with different array names and then concatenating the arrays at the end. But this is not a "good" coding solution, not to mention that it would not be generic: it would always require modifications whenever an additional text file is added.

In conclusion, I want to believe that it can be done in a single subroutine by somehow varying the name of the array and then, as a second step, concatenating the arrays together. Or maybe I am thinking crazy here.
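For reference, the usual Perl answer to "one array name per file" is a single array of array references, so no names need to be generated at all. A minimal sketch, with hard-coded sample lines standing in for file contents:

```perl
use strict;
use warnings;

# Hard-coded lines standing in for the contents of two input files.
my @files = (
    [ 'a:b:c:d:e', 'f:g:h:i:j' ],   # "file 1"
    [ 'k:l:m:n:o', 'p:q:r:s:t' ],   # "file 2"
);

my @per_file;   # $per_file[$n] holds the extracted fields of file $n
for my $lines (@files) {
    my @fields;
    for my $line (@$lines) {
        my @parts = split /:/, $line;
        push @fields, $parts[3];    # 4th field of each line
    }
    push @per_file, \@fields;       # one anonymous array per file
}

print "@$_\n" for @per_file;   # "d i" then "n s"
```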

Sample of running code:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

$| = 1;

my @timestamp;
my @doc_read;
my @result;
my $write = 'write.txt';
my $line;
my $text;
my $arg;
my $date = localtime();

sub read {
    foreach $arg (@_) {
        open (READ, "<", $arg) or die ("Could not open: " . $arg . " - $!\n");
        while ( @doc_read = <READ> ) {
            chomp @doc_read;
            foreach $line (@doc_read) {
                @result = split (':', $line);
                push (@timestamp, $result[3]);
            }
        }
        close (READ) or die ("Could not close: " . $arg . " - $!\n");
    }
    return @timestamp;
}

sub write {
    open (WRITE, ">>", $write) or die ("Could not open: " . $write . " - $!\n");
    print WRITE "\n" . $date . "\n";
    foreach $_ (@_) {
        print WRITE $_ . "\n";
    }
    close (WRITE) or die ("Could not close: " . $write . " - $!\n");
    my $text = "Successfully written on " . $write . ".\n";
    return ($text);
}

my @values = &read(@ARGV);
print Dumper (\@values);
my $final = &write(@values);
print "\n" . $final . "\n";
Update on the sample text files (adding extra length to the arrays).

Sample of text-1.txt files:

Line_1:Line_1_1:Line_1_2:Line_1_3:Line_1_4
Line_2:Line_2_1:Line_2_2:Line_2_3:Line_2_4
Line_3:Line_3_1:Line_3_2:Line_3_3:Line_3_4
Line_4:Line_4_1:Line_4_2:Line_4_3:Line_4_4

Sample of text-2.txt files:

Line_5:Line_5_1:Line_5_2:Line_5_3:Line_5_4
Line_6:Line_6_1:Line_6_2:Line_6_3:Line_6_4
Line_7:Line_7_1:Line_7_2:Line_7_3:Line_7_4
Line_8:Line_8_1:Line_8_2:Line_8_3:Line_8_4
Update: explanation of the 4th element.

Apologies for the confusion that I caused. I was not aware that applying perlre could solve my problem. My initial plan was to extract the 4th element of each line in each file.

The way I imagined solving it was to read each line separately, use the split function on the : (colon) separators and store the fields in an array, and then extract the 4th element of that array, which holds the information I want to use.
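For a single line of the sample data above, that split-and-index step looks like this:

```perl
use strict;
use warnings;

my $line = 'Line_1:Line_1_1:Line_1_2:Line_1_3:Line_1_4';

# split on the colons; array indices start at 0, so the 4th element is index 3
my @fields = split /:/, $line;
my $fourth = $fields[3];

print "$fourth\n";   # Line_1_3
```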

This is how I thought that could be a possible solution.

Sample of output with one file as argument:

$VAR1 = [
          'Line_1_3',
          'Line_2_3',
          'Line_3_3',
          'Line_4_3'
        ];

Sample of output with two files as argument:

$VAR1 = [
          'Line_1_3',
          'Line_2_3',
          'Line_3_3',
          'Line_4_3',
          'Line_5_3',
          'Line_6_3',
          'Line_7_3',
          'Line_8_3'
        ];

Thank you all for your time and effort reading and replying to my question.

Update: 2

I have found a solution to my problem, but I know that it is neither generic nor well coded. For the moment, however, it meets my expectations; any improvement would be much appreciated.

The code provided underneath must be run with the test *.txt files mentioned above.

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
use constant ARGUMENTS => scalar 2;

$| = 1;

my @timestamp_first;
my @timestamp_second;
my $write = 'output.txt';

sub first {
    open (FIRST, "<", $ARGV[0]) or die ("Could not open: " . $ARGV[0] . " - $!\n");
    while ( my @first_read = <FIRST> ) {
        chomp @first_read;
        foreach $_ (@first_read) {
            my @result_first = split (':', $_);
            if (/^\s*$/) {   # /^\s*$/ checks for "blank" lines that may contain spaces or tabs
                next;
            }
            push (@timestamp_first, $result_first[3]);
        }
    }
    close (FIRST) or die ("Could not close: " . $ARGV[0] . " - $!\n");
    return @timestamp_first;
}

sub second {
    open (SECOND, "<", $ARGV[1]) or die ("Could not open: " . $ARGV[1] . " - $!\n");
    while ( my @second_read = <SECOND> ) {
        chomp @second_read;
        foreach $_ (@second_read) {
            my @result_second = split (':', $_);
            if (/^\s*$/) {   # same blank-line check as above
                next;
            }
            push (@timestamp_second, $result_second[3]);
        }
    }
    close (SECOND) or die ("Could not close: " . $ARGV[1] . " - $!\n");
    return @timestamp_second;
}

sub write {
    open (WRITE, ">>", $write) or die ("Could not open: " . $write . " - $!\n");
    foreach $_ (@_) {
        print WRITE $_ . "\n";
    }
    close (WRITE) or die ("Could not close: " . $write . " - $!\n");
    my $text = "Successfully written on " . $write . ".\n";
    return ($text);
}

sub check {
    if (@ARGV < ARGUMENTS || @ARGV > ARGUMENTS) {
        die "Please enter " . ARGUMENTS . " files to read!\n";
    }
}

&check(@ARGV);
my @first_values  = &first($ARGV[0]);
my @second_values = &second($ARGV[1]);

my @final;
my $num = @first_values;
for ($_ = 0; $_ < $num; $_++) {
    push (@final, $first_values[$_] . ' ' . $second_values[$_]);
}
print Dumper (\@final);

my $date = time();
my $output = &write(@final);
print "\nResult: " . $output . "";

Output:

$VAR1 = [
          'Line_1_3 Line_5_3',
          'Line_2_3 Line_6_3',
          'Line_3_3 Line_7_3',
          'Line_4_3 Line_8_3'
        ];

Result: Successfully written on output.txt.
Seeking for Perl wisdom...on the process...not there...yet!

Re: Concatenating arrays fetched from different text files
by Cristoforo (Curate) on May 29, 2014 at 02:45 UTC
    From the sketch of desired output stored in a text file that you have shown, it looks like you want to transpose the rows and columns, so that across all the files index 0 is on the first line, index 1 on the second line (rather than in a column).

    I tried the code below to get those results.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @value;
    my $i = 0;
    while (<>) {                              # read from @ARGV
        chomp;
        push @{ $value[$i++] }, /:([^:]+)/;   # get timestamp
        $i = 0 if eof;
    }

    my $write = 'write.txt';
    open my $out, ">", $write or die "Could not open: $write - $!\n";
    print $out localtime . "\n";
    print $out "@$_\n" for @value;
    close $out or die "Could not close: $write - $!\n";
    print "Successfully written on $write.\n";
    Hope this might be what you were looking for.
      This will work under the assumption that all input files have the same number of lines. However, if you have for example

      Sample of text-1.txt files:

      Line_1:Line_1_1
      Line_2:Line_2_2
      Line_3:Line_3_3
      Line_4:Line_4_4

      Sample of text-2.txt files:

      Line_5:Line_5_5
      Line_6:Line_6_6

      Sample of text-3.txt files:

      Line_7:Line_7_7
      Line_8:Line_8_8
      Line_9:Line_9_9
      Line_10:Line_10_10
      the array would have the content 'Line_9_9' and 'Line_10_10' in the second column (where one would expect content from the second file only). This is because push simply appends at the end of the array.

      To avoid that it would be better to use fixed indexing instead of push:

      So

      my $i = 0;
      while (<>) {                              # read from @ARGV
          chomp;
          push @{ $value[$i++] }, /:([^:]+)/;   # get timestamp
          $i = 0 if eof;
      }
      could be replaced by this
      my $i = 0;
      my $filecount = 0;
      while (<>) {                                     # read from @ARGV
          chomp;
          ($value[$i++]->[$filecount]) = /:([^:]+)/;   # get timestamp
          if (eof) {
              $i = 0;
              ++$filecount;
          }
      }

      This would put the timestamps always in the right places, but you would have to be prepared to have 'holes' in the array, if the input files have different number of lines.

      You can test if an array cell has a value assigned with code like the following:

      if (!exists $value[$line]->[$file]) {
          # no value assigned, there is a hole in the array
      }
      else {
          # value assigned
      }
      Hope this helps! Update: forgot a closing brace in the new code block. It is now fixed.

        To: hexcoder,

        The more I read, the more impressed I get. Thank you for the tip; it is always nice to learn about possible problems that can be avoided.

        Thank you for your time and effort reading and replying to my question.


      To: Cristoforo,

      Perfect, this is exactly what I need. You solved it in 20 lines, while my code is 50-60 lines and does not even solve the problem. I was even thinking of making my code more complicated to solve it. Well, the only thing that I still need to figure out is how to extract the 4th element of each array (e.g. array 3). This is the location of the item that I need to extract.

      Thanks a lot for your time and effort, to assist me.

        What do you mean by the 4th element of each array? From everything you said previously, you seemed to be willing to use all elements of the arrays. Or perhaps you mean that you need to use the 4th element of each line of the input files? Your requirement is not clear to me.
Re: Concatenating arrays fetched from different text files
by 2teez (Vicar) on May 29, 2014 at 00:20 UTC
    Hi thanos1983
    I think that basically what you are trying to do is concatenate several files, with a little manipulation here and there, going by the sample input files and output at the end of your post. If that is right, then what I would advise is to put the loop over the files outside the subroutine that reads them. That way you can get the names of the files to read one after another and use your subroutine to read and write.
    Something to give you a heads up:
    use warnings;
    use strict;

    reader($_) for (@ARGV);

    sub reader {
        my ($filename) = @_;
        open my $fout, '>>', 'output_file.txt' or die "can't open file: $!";
        open my $fh,   '<',  $filename         or die "can't open file: $!";
        while (<$fh>) {
            print $fout $_;
        }
        print $/;
    }
    Of course, you have to pass all the files to read from on the CLI.
    Like: $ perl read_n_write.pl file1.txt file2.txt file3.txt ...
    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me

      To: 2teez,

      This is what I was thinking of doing too. I will have a go and see what happens. Thank you for your time and effort replying to my question.

Re: Concatenating arrays fetched from different text files
by boftx (Deacon) on May 29, 2014 at 00:42 UTC

    Speaking in general terms, you probably want to be passing references to your arrays to a generic subroutine, not actual arrays, for starters. Update: I just took a second look and saw that you are doing that already.

    Beyond that, you probably want to read up on splice, shift and unshift (and of course push and pop), depending on just how you want to manipulate the arrays. A good reading of the docs for splice will show that you can implement the other four functions using splice alone, though that wouldn't be as clear.
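    As a small illustration of that last point (my own sketch, not code from the thread), shift, pop and unshift can each be written in terms of splice:

```perl
use strict;
use warnings;

my @cards = (1, 2, 3, 4);

my $first = splice @cards, 0, 1;     # same effect as: shift @cards
my $last  = splice @cards, -1;       # same effect as: pop @cards
splice @cards, 0, 0, 'new';          # same effect as: unshift @cards, 'new'

print "$first $last @cards\n";   # 1 4 new 2 3
```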

    You might also think of your data stacks as piles of cards; there could be some useful insights or ideas there.

    It helps to remember that the primary goal is to drain the swamp even when you are hip-deep in alligators.

      To: boftx,

      So many functions that I am not even aware of; that is the beauty of Perl. I will take a look and try to experiment with all of them. I need to become more familiar with them, as you never know where they might be needed.

      Thank you for your time and effort replying to my question.

