http://www.perlmonks.org?node_id=1053523

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I want to extract the file name from the output of ps -ef|grep java_app.

I am getting the output below, which I have stored in the $line variable.

$line = root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc

Now I need to parse out "java_app_1Rule_java_app_2Rule" (the name of the application).

I tried something like  my $id = $line =~m { /\/(.+)java/ }gx;  but I am not getting any output.

What am I missing here? Please help.

Replies are listed 'Best First'.
Re: extract file name from path
by RichardK (Parson) on Sep 11, 2013 at 17:23 UTC

    ps will let you return just command names, using 'ps -eo command', if that's all you want. (I guess you should check with the ps man page on your platform, just in case).

    Now that you've got less junk to skip, just use splitpath in File::Spec to get the app name, which handles all those irritating special cases ;)
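
    A minimal sketch of that suggestion. The sample string is built from the path in the question, not from real ps output, and the code assumes the path contains no spaces. File::Spec->splitpath returns the volume, the directories, and the file name; the trailing .java is then removed with a substitution.

```perl
use strict;
use warnings;
use File::Spec;

# Assumed sample: one line as 'ps -eo command' might print it
my $command = '/root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc';

# The first whitespace-separated token is taken to be the program path
my ($path) = split ' ', $command;

# splitpath returns (volume, directories, file); only the file is needed
my (undef, undef, $file) = File::Spec->splitpath($path);

# Copy the file name and strip the .java extension from the copy
(my $app = $file) =~ s/\.java\z//;

print "$app\n";    # java_app_1Rule_java_app_2Rule
```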

Re: extract file name from path
by ww (Archbishop) on Sep 11, 2013 at 16:56 UTC
    Your regex attempt looks as though you'll find careful reading of perldoc perlre and perldoc perlretut and/or the regex tuts here useful. "Mastering Regular Expressions" would be a good followup; "Regular Expressions Pocket Reference" would often be helpful.

    That said, one fairly direct route would be a lookahead -- which tells the regex engine you want to accept matches until the match would be what's in the lookahead... in this case, app/:

    $line =~ m|.+(?=app/)(.*)\s--javaproc|

    In other words, match 1 or more of anything (dot-plus) until there's a match on app/ and then capture up until the space before the --javaproc. Note that there's no need to escape the slashes in the path when using alternate regex delimiters.

    Update: eliminating the ".java" (per OP's spec) is left so there's some learning exercise left here.
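
    A small sketch, run against the sample line from the question, of what the pattern above captures. A lookahead only asserts that app/ follows and does not consume it, so the capture group starts at app/. The second pattern is one possible variant, not the poster's code: it consumes app/ and leaves .java outside the parentheses.

```perl
use strict;
use warnings;

my $line = 'root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc';

# Pattern as posted: the lookahead asserts "app/" but does not consume it,
# so the capture begins with "app/"
my ($with_lookahead) = $line =~ m|.+(?=app/)(.*)\s--javaproc|;

# Variant (an assumption about the intent): consume "app/" and keep
# ".java" outside the capture group
my ($consumed) = $line =~ m|.+app/(.*)\.java\s--javaproc|;

print "$with_lookahead\n";    # app/java_app_1Rule_java_app_2Rule.java
print "$consumed\n";          # java_app_1Rule_java_app_2Rule
```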

Re: extract file name from path
by hdb (Monsignor) on Sep 11, 2013 at 17:15 UTC

    Simple, just rule out slashes from the filename:

    use strict;
    use warnings;
    my $line = "root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc";
    print "$1\n" if $line =~ m|/([^/]+)\.java\s|;

Re: extract file name from path
by Laurent_R (Canon) on Sep 11, 2013 at 16:52 UTC

    What about this:

    my $id = $1 if $line =~ /(\w+)\.java\s/;
      Nope, that did not help.

        I posted it without being able to test. Now I am back home and can test, and it works for me as shown in this session under the Perl debugger:

        DB<1> $line = 'root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc';
        DB<2> $id = $1 if $line =~ /(\w+)\.java\s/;
        DB<3> print $id;
        java_app_1Rule_java_app_2Rule

        And, BTW, I don't see why I received a downvote for a solution that works.
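
        One possible reason an attempt like the original  my $id = $line =~ m{...}gx  shows nothing useful is context: in scalar context a match returns true or false, while in list context it returns the capture groups. The sketch below contrasts the two using the same pattern as above. It also avoids combining my with a statement modifier (my $id = $1 if ...), which perlsyn documents as having undefined behaviour.

```perl
use strict;
use warnings;

my $line = 'root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc';

# Scalar context: the match returns a true/false value, not the captured text
my $matched = $line =~ /(\w+)\.java\s/;

# List context (note the parentheses around $id): the match returns the captures
my ($id) = $line =~ /(\w+)\.java\s/;

print "matched: $matched\n";    # matched: 1
print "id: $id\n";              # id: java_app_1Rule_java_app_2Rule
```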

Re: extract file name from path
by keszler (Priest) on Sep 11, 2013 at 16:50 UTC
    my $line = "root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc";
    $line =~ m{.*/([^ ]+)};   # update - fixed typo of '/' to close regex
    my $id = $1;
    perlretut
Re: extract file name from path
by TJPride (Pilgrim) on Sep 11, 2013 at 19:19 UTC
    We can probably assume the pattern will always start with a / (since that's part of the path) and end with some sort of .ext:
    my $line = 'root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc';
    if ($line =~ m|/([^/]*?)\.[a-z0-9]+|) {
        print "$1\n";
    }

      I think that you need to add the underscore ('_') to the character class in your regex.

      Update: I read too quickly. Reading your regex again, no, there is no need to add the underscore, since this character class is aimed at matching the final "java" extension.
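
    The same "starts with a / and ends with some .ext" assumption can also be expressed without a hand-written path regex. This is a different technique from the one posted above: a sketch using fileparse from the core File::Basename module, assuming the first whitespace-separated field that begins with / is the path and that the path contains no spaces.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

my $line = 'root 30145 1 0 Jan30 ? 00:09:01 /root/java/app/java_app_1Rule_java_app_2Rule.java --javaproc';

# Assumption: the first field starting with "/" is the path of interest
my ($path) = grep { m{^/} } split ' ', $line;

# fileparse returns (name, directory, suffix); the pattern matches any
# final ".ext", so the extension does not have to be "java"
my ($name, $dir, $ext) = fileparse($path, qr/\.[^.]*/);

print "$name\n";    # java_app_1Rule_java_app_2Rule
```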