PerlMonks  

Doubts group data

by leoberbert (Novice)
on May 04, 2017 at 01:07 UTC (id://1189470, perlquestion)

leoberbert has asked for the wisdom of the Perl Monks concerning the following question:

Dear All, I need a help with the following problem. I have a file with the following data.
    21997|||70049,,20170428154818,20170527235959|||
    21997|||70070,,20170428154739,20170527235959|||

    21998|||70049,,20170428154818,20170527235959|||
    21998|||70070,,20170428154739,20170527235959|||
    21998|||70071,,20170428154739,20170527235959|||

I need to unify the file as follows:

    21997|||70049,,20170502172844,20170531235959; 70070,,20170502172844,20170531235959|||
    21998|||70049,,20170502172844,20170531235959; 70070,,20170502172844,20170531235959; 70071,,20170502172844|||

Replies are listed 'Best First'.
Re: Doubts group data
by GrandFather (Saint) on May 04, 2017 at 04:23 UTC

    What have you tried and where did you run into trouble?

    Can you describe in words what needs to be done? Perhaps listing a series of steps that you would follow if you were doing the process by hand would help?

    Premature optimization is the root of all job security
Re: Doubts group data
by kcott (Archbishop) on May 04, 2017 at 07:41 UTC

    G'day leoberbert,

    " I need a help ..."

    It would appear you actually need lots of help, not a help. Here's lots of help:

    • "How do I post a question effectively?" (surprisingly, as you joined almost four years ago and have made several posts in that period).
    • "How (Not) To Ask A Question" (paying particular attention to the "Do Your Own Work" section).
    • "perlintro — Perl introduction for beginners" (this is peppered with links to more detailed and advanced information: follow as needed).
    • Text::CSV with "sep_char => '|'", if your data is in a pipe-separated format; split with "/[|]{3}/", if "|||" is just a constant literal used as a delimiter.
    • You should probably consider capturing all your input in a hash; leaving collation and output to a subsequent stage. The key would appear to be that initial, numeric field.
    • You may also want join and sort in various places.

    — Ken
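
    The last three suggestions above (split on the literal "|||", collect into a hash, then join and sort) could be sketched roughly as follows. This is an illustration only, not Ken's code. The sub name group_records is hypothetical, the "; " separator is taken from the requested output, and numeric key order is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group "key|||value|||" records by their key.
# Returns one "key|||v1; v2; ...|||" string per key, keys in numeric order.
sub group_records {
    my @records = @_;
    my %values_for;    # key => [ value, value, ... ]

    for my $record (@records) {
        next unless $record =~ /\S/;    # ignore blank lines
        my ($key, $value) = split /[|]{3}/, $record;
        push @{ $values_for{$key} }, $value;
    }

    my @grouped;
    for my $key (sort { $a <=> $b } keys %values_for) {
        push @grouped, $key . '|||' . join('; ', @{ $values_for{$key} }) . '|||';
    }
    return @grouped;
}

# Sample records copied from the question.
my @sample = (
    '21997|||70049,,20170428154818,20170527235959|||',
    '21997|||70070,,20170428154739,20170527235959|||',
    '',
    '21998|||70049,,20170428154818,20170527235959|||',
    '21998|||70070,,20170428154739,20170527235959|||',
    '21998|||70071,,20170428154739,20170527235959|||',
);

print "$_\n" for group_records(@sample);
```

    Collecting everything first means the input does not have to be sorted or separated by blank lines, at the cost of holding the whole file in memory.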

Re: Doubts group data
by hippo (Bishop) on May 04, 2017 at 08:26 UTC
    I need a help with the following problem.

    You have described a task but not where you think there is a problem. Perhaps the problem is with your algorithm - but since you have not described your algorithm nobody can say. Perhaps the problem is with your code - but since you have not shown your code nobody can say. Perhaps the problem is with your test suite - but since you have not shown your tests nobody can say. Perhaps the problem is with your operating system or your environment or your hardware limitations or ...

    As you are not new here, leoberbert, you must have seen how good questions elicit good answers. Ask a good question, brother. Wisdom may flow.

Re: Doubts group data
by davido (Cardinal) on May 04, 2017 at 04:27 UTC

    What part of the problem involves Perl?


    Dave

Re: Doubts group data
by NetWallah (Canon) on May 04, 2017 at 04:30 UTC
    $ perl -an '-F/\|\|\|/' -e 'm/\d/ or do{print qq<$k|||$v|||\n> ;$k=$v="";next};$k=$F[0];$v.=$F[1].qq|;|' data2.txt
    21997|||70049,,20170428154818,20170527235959;70070,,20170428154739,20170527235959;|||
    21998|||70049,,20170428154818,20170527235959;70070,,20170428154739,20170527235959;70071,,20170428154739,20170527235959;|||

            ...Disinformation is not as good as datinformation.               Don't document the program; program the document.

      Hello, I ran a test with this command line and it did not work. See below:

      $ cat teste.txt
      21997|||70049,,20170428154818,20170527235959|||
      21997|||70070,,20170428154739,20170527235959|||
      21998|||70049,,20170428154818,20170527235959|||
      21998|||70070,,20170428154739,20170527235959|||
      21998|||70071,,20170428154739,20170527235959|||

      $ perl -an '-F/\|\|\|/' -e 'm/\d/ or do{print qq<$k|||$v|||\n> ;$k=$v="";next};$k=$F[0];$v.=$F[1].qq|;|' teste.txt

      The command produces no output. My Perl version is:

      $ perl -v
      This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi
        Your data does not have the blank line separating entries with the same ID.

        My code also requires a trailing line separator.

                ...Disinformation is not as good as datinformation.               Don't document the program; program the document.

        His code depends on having a blank line between the 21997s and 21998s (as you show in your original post), and one at the end of the file. This limitation is easily fixable.
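
        One possible way to lift that limitation is to start a new group whenever the key changes, instead of waiting for a blank line. The sketch below is only an illustration under that assumption, not NetWallah's code. The sub name merge_consecutive is hypothetical, and it assumes records with the same ID are adjacent in the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: merge consecutive "key|||value|||" records that
# share a key. Blank lines are ignored, so they are neither required
# nor harmful.
sub merge_consecutive {
    my @records = @_;
    my @merged;
    my ($current_key, @values);

    # Emit the group collected so far (if any) and reset the state.
    my $flush = sub {
        push @merged, $current_key . '|||' . join('; ', @values) . '|||'
            if defined $current_key;
        ($current_key, @values) = ();
    };

    for my $record (@records) {
        next unless $record =~ /\d/;    # skip lines without any data
        my ($key, $value) = split /\|\|\|/, $record;
        $flush->() if defined $current_key && $key ne $current_key;
        $current_key = $key;
        push @values, $value;
    }
    $flush->();    # last group: no trailing blank line needed

    return @merged;
}

# Same layout as teste.txt above: no blank lines between groups.
my @teste = (
    '21997|||70049,,20170428154818,20170527235959|||',
    '21997|||70070,,20170428154739,20170527235959|||',
    '21998|||70049,,20170428154818,20170527235959|||',
    '21998|||70070,,20170428154739,20170527235959|||',
    '21998|||70071,,20170428154739,20170527235959|||',
);

print "$_\n" for merge_consecutive(@teste);
```

        Because only the current group is held in memory, this variant also works on large files, but unlike a hash-based approach it will not merge records with the same ID that are not next to each other.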
Re: Doubts group data
by thanos1983 (Parson) on May 04, 2017 at 10:40 UTC

    Hello leoberbert,

    As the monks have already stated, we do not know exactly what you are trying to do. It would help us a lot in coming up with a solution to your problem if you provided some steps, a description, etc.

    I put together a simple script showing how I would approach your problem, but I am not sure about the calculation you are doing with the numbers, so I left them unchanged. I am sure you can do the calculation yourself and update the code.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $path_to_file = 'test.txt';

    open my $fh, '<', $path_to_file
        or die "Could not open " . $path_to_file . " $!\n";
    chomp(my @lines = <$fh>);
    close $fh
        or die "Could not close " . $path_to_file . " $!\n";

    # Remove empty lines if this is desired?
    @lines = grep /\S/, @lines;

    my %HoA;
    foreach my $line (@lines) {
        my ($match, $remaining) = split(/\|\|\|/, $line);
        push (@{$HoA{$match}}, $remaining);
    }

    print Dumper \%HoA;

    my @updated_lines;
    foreach my $key (keys %HoA) {
        my $concat_str;
        foreach my $i ( 0 .. $#{ $HoA{$key} } ) {
            if ($i == 0){
                $concat_str .= $HoA{$key}[$i] . "; ";
            }
            $concat_str .= $HoA{$key}[$i] . " ";
        }
        # Trim white space on right
        $concat_str =~ s/\s+$//;
        push @updated_lines, $key.$concat_str.'|||';
    }

    print Dumper \@updated_lines;

    __END__

    $ perl test.pl
    $VAR1 = {
              '21997' => [
                           '70049,,20170428154818,20170527235959',
                           '70070,,20170428154739,20170527235959'
                         ],
              '21998' => [
                           '70049,,20170428154818,20170527235959',
                           '70070,,20170428154739,20170527235959',
                           '70071,,20170428154739,20170527235959'
                         ]
            };
    $VAR1 = [
              '2199770049,,20170428154818,20170527235959; 70049,,20170428154818,20170527235959 70070,,20170428154739,20170527235959|||',
              '2199870049,,20170428154818,20170527235959; 70049,,20170428154818,20170527235959 70070,,20170428154739,20170527235959 70071,,20170428154739,20170527235959|||'
            ];

    Update: Sorry, in my rush I did not produce the correct output. See the new code below:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $path_to_file = 'test.txt';

    open my $fh, '<', $path_to_file
        or die "Could not open " . $path_to_file . " $!\n";
    chomp(my @lines = <$fh>);
    close $fh
        or die "Could not close " . $path_to_file . " $!\n";

    # Remove empty lines if this is desired?
    @lines = grep /\S/, @lines;

    # Hash of arrays: ID => [ remaining field of each record with that ID ]
    my %HoA;
    foreach my $line (@lines) {
        my ($match, $remaining) = split(/\|\|\|/, $line);
        push (@{$HoA{$match}}, $remaining);
    }

    print Dumper \%HoA;

    my @updated_lines;
    foreach my $key (keys %HoA) {
        my $concat_str;
        foreach my $i ( 0 .. $#{ $HoA{$key} } ) {
            # The first entry is followed by "; ", later entries by a space
            if ($i == 0){
                $concat_str .= $HoA{$key}[$i] . "; ";
            }
            else {
                $concat_str .= $HoA{$key}[$i] . " ";
            }
        }
        # Trim white space on right
        $concat_str =~ s/\s+$//;
        my $final_str = join ('', $key, '|||', $concat_str, '|||');
        push @updated_lines, $final_str;
    }

    print Dumper \@updated_lines;

    __END__

    $ perl test.pl
    $VAR1 = {
              '21997' => [
                           '70049,,20170428154818,20170527235959',
                           '70070,,20170428154739,20170527235959'
                         ],
              '21998' => [
                           '70049,,20170428154818,20170527235959',
                           '70070,,20170428154739,20170527235959',
                           '70071,,20170428154739,20170527235959'
                         ]
            };
    $VAR1 = [
              '21997|||70049,,20170428154818,20170527235959; 70070,,20170428154739,20170527235959|||',
              '21998|||70049,,20170428154818,20170527235959; 70070,,20170428154739,20170527235959 70071,,20170428154739,20170527235959|||'
            ];

    Hope this helps.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
      Hello thanos1983, thanks for your help. I was able to solve the problem and understand the idea you provided. Regards,
Re: Doubts group data
by tybalt89 (Monsignor) on May 04, 2017 at 13:58 UTC
    #!/usr/bin/perl

    # http://perlmonks.org/?node_id=1189470

    use strict;
    use warnings;

    $_ = do { local $/; <DATA> };
    1 while s/^(\d+\|{3}).*\K\|{3}\n\1/; /m;
    print;

    __DATA__
    21997|||70049,,20170428154818,20170527235959|||
    21997|||70070,,20170428154739,20170527235959|||
    21998|||70049,,20170428154818,20170527235959|||
    21998|||70070,,20170428154739,20170527235959|||
    21998|||70071,,20170428154739,20170527235959|||
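
    For readers unfamiliar with the substitution above, here is the same pattern spelled out with the /x modifier and comments. This is an annotated restatement, not tybalt89's original. The sub name merge_with_regex is hypothetical, and \K requires Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution, written with /x so that
# each part of the pattern can carry a comment.
sub merge_with_regex {
    my ($text) = @_;

    # Each pass joins one pair of adjacent lines that start with the same
    # key; the loop repeats until no such pair is left.
    1 while $text =~ s{
        ^ ( \d+ \|{3} )   # capture 1: key plus delimiter at start of a line
        .*                # the values already collected on that line
        \K                # keep everything matched so far untouched
        \|{3} \n          # trailing delimiter and newline (to be replaced)
        \1                # the same key plus delimiter on the next line
    }{; }xm;

    return $text;
}

my $data = join "\n",
    '21997|||70049,,20170428154818,20170527235959|||',
    '21997|||70070,,20170428154739,20170527235959|||',
    '21998|||70049,,20170428154818,20170527235959|||',
    '21998|||70070,,20170428154739,20170527235959|||',
    '21998|||70071,,20170428154739,20170527235959|||',
    '';

print merge_with_regex($data);
```

    Like the original, this only merges records that sit on directly adjacent lines, so a blank line between two records with the same ID would keep them apart.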
