http://www.perlmonks.org?node_id=1017925

MacLing has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

could you give me a Perl script to extract specific strings from data like the sample below?

Source file : input.txt

06.02.2013 12.24.01:909 5807225321 INFO {EXT:httpadapter:17:14:}[0]RULEZ HTTPADAPTER,msisdn:637584930382,ud:Pan,trxtime:20130206122401 ResponseDeltaTime:31 ms ResponseCode:200 ResponseBody:OK

output1.txt

06.02.2013,637584930382,Pan,20130206,200

output2.txt

06-02-2013,637584930382,Pan,2013-02-06,200

Replies are listed 'Best First'.
Re: Perl Extract specific string
by davido (Cardinal) on Feb 09, 2013 at 05:02 UTC

    We're here to help, tutor, guide, teach, pat each other on the back, brag, show off, socialize, learn, ponder, share, collaborate, argue, and distract ourselves from our real $work. Not included in that list is write free programs on demand.

    Maybe you already know this and just didn't know how to ask or what to ask for, so let's try again: What have you tried, and what part are you having trouble with? How can we help you in this learning process?


    Dave

Re: Perl Extract specific string
by CountZero (Bishop) on Feb 09, 2013 at 08:22 UTC
    Before we answer we need some more info.

    • Are these three lines of data actually ONE line or indeed THREE lines? Tip: enclose your data in <code> ... </code> tags. That makes it easier to see the actual format and also things like square brackets and such.
    • Can you show some more lines of your data? Especially when you have to write perhaps a regular expression to extract the data it is always good to see some more data, to check how "regular" your data structure really is.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics

      Hi CountZero,

      It is one line, and the data structure is always the same. There is a lot of data in each file.

      Anyway, thanks for your reply; I will test it. Please give me a script that can read from the input file and create the output file.

        Perl is such a powerful language, you only need a few lines of code:
        use Modern::Perl;

        while (<DATA>) {
            my ( $date, $msisdn, $ud, $trxtime, $response ) =
              map { /:([^:]+)/ ? $1 : $_ } ( split /[\s+,]/ )[ 0, 6, 7, 8, 11 ];
            say join ',', $date, $msisdn, $ud, substr( $trxtime, 0, 8 ), $response;
        }

        __DATA__
        06.02.2013 12.24.01:909 5807225321 INFO {EXT:httpadapter:17:14:}[0]RULEZ [HTTPADAPTER],msisdn:637584930382,ud:Pan,trxtime:20130206122401 ResponseDeltaTime:31 ms ResponseCode:200 ResponseBody:OK
        Output:
        06.02.2013,637584930382,Pan,20130206,200

        CountZero

Re: Perl Extract specific string
by Anonymous Monk on Feb 09, 2013 at 09:02 UTC

    I agree with the great minds who have given their wisdom on this. However, since this is your first time in the house, I will cut you some slack on the house rules, which you can read for yourself here: How do I post a question effectively?

    I can think of three ways of getting to your desired output:

    • see Text::CSV or Text::CSV_XS, since you presented a comma-separated string
    • use split, an if/elsif chain, and regexes, like this:
      #!/usr/bin/perl
      use warnings;
      use strict;

      my $str = <<'_STR_';
      06.02.2013 12.24.01:909 5807225321 INFO {EXT:httpadapter:17:14:}[0]RULEZ HTTPADAPTER,msisdn:637584930382,ud:Pan,trxtime:20130206122401 ResponseDeltaTime:31 ms ResponseCode:200 ResponseBody:OK
      _STR_

      my $wanted_data;
      for ( split /,/, $str ) {
          if (/(\d{2}\.\d{2}\.\d{4})/) {
              my $format_date = $1;
              $format_date =~ s/\./-/g;
              $wanted_data .= $format_date;
          }
          elsif (/msisdn:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          elsif (/ud:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          elsif (/trxtime:(\d{8}).+Code:(\d+)/sm) {
              my ( $format_date, $code ) = ( $1, $2 );
              $format_date =~ s/(\d{4})(\d{2})(\d{2})/$1-$2-$3/;
              $wanted_data .= ',' . $format_date . ',' . $code;
          }
          else {
              print "data given is not true";
          }
      }
      print $wanted_data, $/;
    • Or you can use "dispatch table" instead of the ifs and elsifs like this:
      #!/usr/bin/perl
      use 5.014;    ## it wouldn't work for lower version
      use Readonly;
      Readonly my $comma => ",";

      my $str = <<'_STR_';
      06.02.2013 12.24.01:909 5807225321 INFO {EXT:httpadapter:17:14:}[0]RULEZ HTTPADAPTER,msisdn:637584930382,ud:Pan,trxtime:20130206122401 ResponseDeltaTime:31 ms ResponseCode:200 ResponseBody:OK
      _STR_

      my %data_filter = (
          '1_first_date_format' => sub {
              if ( $_[0] =~ /(\d{2}\.\d{2}\.\d{4})/ ) {
                  my $format_date = $1;
                  $format_date =~ s/\./-/g;
                  return ( $format_date, $comma );
              }
          },
          '2_msisdn' => sub {
              if ( $_[0] =~ /msisdn:(.+)/ ) {
                  return ( $1, $comma );
              }
          },
          '3_ud' => sub {
              if ( $_[0] =~ /ud:(.+)/ ) {
                  return ( $1, $comma );
              }
          },
          '4_trxtime' => sub {
              if ( $_[0] =~ /trxtime:(\d{8}).+Code:(\d+)/sm ) {
                  my ( $format_date, $code ) = ( $1, $2 );
                  $format_date =~ s/(\d{4})(\d{2})(\d{2})/$1-$2-$3/;
                  return ( $format_date, $comma, $code );
              }
          },
      );

      for my $key ( sort keys %data_filter ) {
          for ( split /,/, $str ) {
              print $data_filter{$key}->($_);
          }
      }
    However, pay attention to this point, made by CountZero and others before: "Can you show some more lines of your data? Especially when you have to write perhaps a regular expression to extract the data it is always good to see some more data, to check how "regular" your data structure really is."
    Hope this helps in some way.
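
On the "read from input file and create the output file" part of the question, here is a minimal sketch. It assumes every line has the same layout as the single sample posted, which has not been confirmed against more data. The sub names format_line and process_file are made up for illustration, and the three file names are expected on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn one log line into the two requested CSV rows.
# Returns an empty list when the line does not look like the posted sample.
sub format_line {
    my ($line) = @_;
    my ( $d, $m, $y, $msisdn, $ud, $ty, $tm, $td, $code ) = $line =~ m{
        ^(\d{2})\.(\d{2})\.(\d{4})\s      # leading date, DD.MM.YYYY
        .*? \bmsisdn:(\d+)                # subscriber number
        ,ud:([^,]*)                       # ud value, up to the next comma
        ,trxtime:(\d{4})(\d{2})(\d{2})    # keep only YYYYMMDD of trxtime
        .*? \bResponseCode:(\d+)          # response code
    }x or return;

    my $row1 = join ',', "$d.$m.$y", $msisdn, $ud, "$ty$tm$td",   $code;
    my $row2 = join ',', "$d-$m-$y", $msisdn, $ud, "$ty-$tm-$td", $code;
    return ( $row1, $row2 );
}

# Read $in line by line and write one row per matching line to each output.
sub process_file {
    my ( $in, $out1, $out2 ) = @_;
    open my $in_fh,   '<', $in   or die "Cannot read $in: $!";
    open my $out1_fh, '>', $out1 or die "Cannot write $out1: $!";
    open my $out2_fh, '>', $out2 or die "Cannot write $out2: $!";

    while ( my $line = <$in_fh> ) {
        # Lines that do not match the assumed layout are skipped silently.
        my ( $row1, $row2 ) = format_line($line) or next;
        print {$out1_fh} "$row1\n";
        print {$out2_fh} "$row2\n";
    }

    close $out1_fh or die "Cannot close $out1: $!";
    close $out2_fh or die "Cannot close $out2: $!";
    close $in_fh;
    return;
}

# Only touch files when all three names are given on the command line.
process_file(@ARGV) if @ARGV == 3;
```

Saved under any name (extract.pl here is just an example), it would be run as: perl extract.pl input.txt output1.txt output2.txt. Matching the whole line with one regex, rather than splitting on commas first, keeps the field order explicit and makes it easy to skip lines that do not fit.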

      Thanks CountZero !!!

      Please help me again.

      #!/usr/bin/perl
      use warnings;
      use strict;

      my $str = <<'_STR_';
      09.05.2014 09.49.52:359 RID:routerNode1@app1:1662306081,msisdn:787887225696,sid:93889095007001,tid:1405090902095648846024000,status:2,time:20140509094952,reason:DELIVRD,refund:null,status2:null
      _STR_

      my $wanted_data;
      for ( split /,/, $str ) {
          if (/(\d{2}\.\d{2}\.\d{4}' '\d{2}\.\d{2}\.\d{2}).+RID:(\d+)/sm) {
              my ( $format_date, $rid ) = ( $1, $2 );
              $format_date =~ s/(\d{4})(\d{2})(\d{2})/$1-$2-$3/;
              $wanted_data .= $format_date . ',' . $rid;
          }
          elsif (/msisdn:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          elsif (/sid:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          elsif (/tid:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          elsif (/status:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          elsif (/time:(.+)/) {
              $wanted_data .= ',' . $1;
          }
          else {
              print "data given is not true";
          }
      }
      print $wanted_data, $/;

      The wanted_data should look like this:

      09-05-2014,09-49-52,routerNode1@app1:1662306081,787887225696,93889095007001,1405090902095648846024000,2,20140509094952

      When I run it, I get this instead:

      data given is not truedata given is not truedata given is not truedata given is not true,787887225696,93889095007001,1405090902095648846024000,2,20140509094952

      Please also advise how to read the input file and write to the output file.

      Thank you.
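
The four "data given is not true" messages in the attempt above have a traceable cause. split /,/ hands the regexes one comma-separated chunk at a time. The first chunk never matches, because the pattern contains literal quote characters around the space and RID:(\d+) cannot match a value that starts with routerNode1@app1. The reason, refund and status2 chunks have no branch at all, so they also fall through to the else. A minimal sketch that matches the whole line with one anchored regex instead, assuming every line follows the single sample shown (format_delivery_line is a made-up name):

```perl
use strict;
use warnings;

# Hypothetical helper for the second log layout; returns undef on no match.
sub format_delivery_line {
    my ($line) = @_;
    my @f = $line =~ m{
        ^(\d{2})\.(\d{2})\.(\d{4})\s      # date, DD.MM.YYYY
        (\d{2})\.(\d{2})\.(\d{2}):\d+\s   # time, hh.mm.ss, milliseconds dropped
        RID:([^,]+)                       # router id, everything up to the comma
        ,msisdn:(\d+)
        ,sid:(\d+)
        ,tid:(\d+)
        ,status:(\d+)
        ,time:(\d+)
    }x or return;

    # The first six captures are the date and time parts; the rest pass through.
    my ( $d, $m, $y, $hh, $mm, $ss, @rest ) = @f;
    return join ',', "$d-$m-$y", "$hh-$mm-$ss", @rest;
}
```

To process a file, the sub can be called once per line inside a while loop over a filehandle opened with the three-argument open, printing each defined result to an output filehandle.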

Re: Perl Extract specific string
by ansh batra (Friar) on Feb 09, 2013 at 08:01 UTC

    could you give me ...
    You should avoid this kind of phrasing in any forum.
    Follow the steps given by davido.
    If you have not tried anything yet, try it using regular expressions.
    Refer to the Tutorials section to learn about regular expressions.

Re: Perl Extract specific string
by Anonymous Monk on Feb 09, 2013 at 09:28 UTC