
REGEX omit dashes - simple but ...

by wrkrbeee (Scribe)
on Apr 04, 2016 at 16:57 UTC ( [id://1159510] : perlquestion )

wrkrbeee has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl Monks, have a simple regex issue where I wish to omit dashes from the following numeric variable: 0001144204-09-017358 . Simple, but clearly I lack the skills to get 'er done. Relevant code appears below. As is, the code extracts everything prior to the first dash. I need the entire expression (i.e., all digits in continuous fashion without the dashes, 000114420409017358 ). This variable is numeric. I'm embarrassed to say the least. Thanks for tolerating me!

my $access_num = -99;
# several other lines between here, omitted for brevity
if ($line =~ m/^\s*ACCESSION\s*NUMBER:\s*(\d*)/m) {
    $access_num = $1;
}

Replies are listed 'Best First'.
Re: REGEX omit dashes - simple but ...
by toolic (Bishop) on Apr 04, 2016 at 17:02 UTC
    One way is to use tr
    use warnings;
    use strict;

    my $num = '0001144204-09-017358';
    print "$num\n";
    $num =~ tr/-//d;
    print "$num\n";

    __END__
    0001144204-09-017358
    000114420409017358
      Thanks toolic! Are you defining the variable as a string rather than a numeric? Thanks!
        Not that it really matters, but 0001144204-09-017358 is not a number but a string.
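        To illustrate the string-versus-number point, here is a minimal sketch (the variable names are made up for this example, not taken from the original code). It assumes you want to keep the leading zeros, which only survive while the value is treated as a string:

```perl
use strict;
use warnings;

my $as_string = '0001144204-09-017358';

# Copy the string, then delete the dashes from the copy only.
(my $digits = $as_string) =~ tr/-//d;

# Adding 0 forces numeric context; the leading zeros do not survive.
my $as_number = $digits + 0;

print "$digits\n";       # still a string, leading zeros intact
print "$as_number\n";    # numeric value, leading zeros dropped
```

        On a typical 64-bit perl this should print 000114420409017358 followed by 114420409017358, so if the accession number is used as an identifier it is probably safer to leave it as a string.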


        "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        My blog: Imperial Deltronics
Re: REGEX omit dashes - simple but ...
by AnomalousMonk (Archbishop) on Apr 04, 2016 at 17:32 UTC
    ... the code extracts everything prior to the first dash. I need the entire expression ...

    Evidently you're still stuck on the "extraction" part; if so, something like this might work:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $line = qq{foo \n ACCESSION NUMBER: 0001144204-09-017358 bar};
     print qq{[[$line]]};
     ;;
     my $rx_acc_num = qr{ \d+ (?: - \d+)* }xms;
     ;;
     my $acc_num;
     if ($line =~ m{ ^ \s* ACCESSION \s* NUMBER: \s* ($rx_acc_num) }xms) {
         $acc_num = $1;
         $acc_num =~ tr/-//d;
     }
     print qq{'$acc_num'};
    "
    [[foo
     ACCESSION NUMBER: 0001144204-09-017358 bar]]
    '000114420409017358'

    Update: Please see perlre, perlretut, and perlrequick.
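    For anyone who prefers a script to a one-liner, the same two steps (capture the digits and dashes together, then delete the dashes with tr) can be sketched as a small subroutine. The name extract_accession_number and the sample line are hypothetical, and the character class [\d-]+ is a looser stand-in for the \d+ (?: - \d+)* pattern above, so treat this as one possible approach rather than the definitive one:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the accession number
# with its dashes removed, or nothing if the line has no such field.
sub extract_accession_number {
    my ($line) = @_;

    # [\d-]+ captures digits and dashes together, so the match no longer
    # stops at the first dash the way (\d*) does in the original code.
    return unless $line =~ m/^\s*ACCESSION\s*NUMBER:\s*([\d-]+)/m;

    my $access_num = $1;
    $access_num =~ tr/-//d;    # delete every dash from the captured text
    return $access_num;
}

# Sample input, made up for illustration.
my $line = "ACCESSION NUMBER:\t\t0001144204-09-017358";
print extract_accession_number($line), "\n";
```

    The original regex captured only (\d*), which is why it stopped at the first dash; widening the capture and cleaning up afterwards keeps the match itself simple.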

    Give a man a fish:  <%-{-{-{-<

      Sorry for the OT, but how do you enter multi-line commands at the command prompt under Windows? I have tried many times; it should work with a caret symbol at the end of each line, but in my case (Windows 10) it does not. Thank you!

        The app I use is the second thing in my scratchpad, originally addressed to Athanasius. Please pay attention to the "shortcomings" section. The whole point originally was to quickly copy some brief code to the Windows clipboard, maybe edit it there with a clipboard editor, then paste it into a command-line window and see if it runs. I think my approach today would be different, but if you feel you could use it or any part of it, please feel free. (Update: The app runs on Windows 7 and I think I had it running on XP originally. What happens on 8 or 10 is anybody's guess.)

        Give a man a fish:  <%-{-{-{-<