PerlMonks  

Help with a regex

by mavericknik (Sexton)
on Jul 24, 2015 at 08:02 UTC (#1136132=perlquestion)

mavericknik has asked for the wisdom of the Perl Monks concerning the following question:

Hello! I'm pretty new to Perl and trying to write a regex to match a sequence. Basically, I have a number of cells and, for each cell, a number of pins. The pins are listed as:
 Pin: U343.IN1 in
 Pin: U713.INP out
 Pin: U714.QN
The ones with "out" at the end, or with nothing at the end, are the output pins that I want to match. So I got to:
/^[ ]+Pin:[ ]*([^\.]*)\.([^ ]*)$/ --> Matches " Pin: U714.QN" $1 = U714 $2 = QN
and
/^[ ]+Pin:[ ]*([^\.]*)\.([^ ]*)[ ]out/ --> Matches " Pin: U713.INP out" $1 = U713 $2 = INP
Is there a way to combine these two so that it matches both? I tried using "|" to OR them:
^[ ]+Pin:[ ]*([^\.]*)\.([^ ]*)$|^[ ]+Pin:[ ]*([^\.]*)\.([^ ]*)[ ]out
But if the second part of the expression matches, my captures end up in $3 and $4. Basically, I want to match:
Pin: U713.INP"endofstring or out" With $1 = U713 $2 = INP
I'm sure there's some simple way to do this and I'm overcomplicating it, but I can't figure it out for the life of me. A nudge in the right direction would be greatly appreciated. Thanks!
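To make the numbering problem concrete, here is a minimal sketch that runs the "|"-joined pattern against the two output-pin lines and reports which groups were filled. The helper name `captures_for` and the variables `$end_form` and `$out_form` are made up for illustration; the two sub-patterns are copied from the question.

```perl
use strict;
use warnings;

# The two patterns from the question, kept separate for readability.
my $end_form = qr/^[ ]+Pin:[ ]*([^\.]*)\.([^ ]*)$/;
my $out_form = qr/^[ ]+Pin:[ ]*([^\.]*)\.([^ ]*)[ ]out/;

# Joined with "|": the groups are numbered 1..4 from left to right,
# no matter which alternative actually matches.
my $combined = qr/$end_form|$out_form/;

# Return all four capture groups for a line; groups that belong to the
# alternative that did not match come back as undef.
sub captures_for {
    my ($line) = @_;
    my @groups = $line =~ $combined;
    return @groups;
}

for my $line (" Pin: U714.QN", " Pin: U713.INP out") {
    my @shown = map { defined $_ ? $_ : 'undef' } captures_for($line);
    print "'$line' => @shown\n";
}
```

For the line ending in "out", the first two groups stay undefined and the values land in groups 3 and 4, which is the behaviour described above.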

Replies are listed 'Best First'.
Re: Help with a regex
by Athanasius (Archbishop) on Jul 24, 2015 at 08:16 UTC

    Hello mavericknik,

    I think you’re looking for something like this:

    #! perl
    use strict;
    use warnings;

    while (<DATA>)
    {
        print "$1.$2\n" if m{ ^ \s* Pin: \s* ([^.]*?) \. (\S*) (?: \s+ out +)? $ }x;
    }

    __DATA__
    Pin: U343.IN1 in
    Pin: U713.INP out
    Pin: U714.QN

    Output:

    18:14 >perl 1318_SoPW.pl
    U713.INP
    U714.QN
    18:14 >

    Update: Some notes:

    • The /x modifier makes the regex easier to read.
    • The character classes \s and \S represent whitespace and non-whitespace characters, respectively.
    • There is no need to escape a . inside a character class.
    • The *? quantifier is non-greedy. Because the negated class [^.] cannot match a . character, the literal \. matches the first . in the string even if there is a second one, so the greedy * would behave the same way here.
    • The (?: ...)? construct matches the contents optionally, and without capturing.
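The last note can be checked with a small sketch. It assumes a lightly simplified copy of the pattern (the optional suffix is written as plain `out`), and the names `output_pin` and `count_groups` are hypothetical helpers, not part of the original reply. The second pattern differs only in that the optional suffix is captured, which adds a third group.

```perl
use strict;
use warnings;

# Optional " out" suffix in a non-capturing group: two capture groups in total.
my $non_capturing = qr{ ^ \s* Pin: \s* ([^.]*?) \. (\S*) (?: \s+ out )? $ }x;

# Same pattern with a capturing group for the suffix: three capture groups.
my $capturing     = qr{ ^ \s* Pin: \s* ([^.]*?) \. (\S*) (   \s+ out )? $ }x;

# Return "CELL.PIN" for an output-pin line, or undef for any other line.
sub output_pin {
    my ($line) = @_;
    return $line =~ $non_capturing ? "$1.$2" : undef;
}

# Number of capture groups returned by a successful match (0 on failure).
sub count_groups {
    my ($re, $line) = @_;
    my @groups = $line =~ $re;
    return scalar @groups;
}

for my $line (" Pin: U343.IN1 in", " Pin: U713.INP out", " Pin: U714.QN") {
    my $pin = output_pin($line);
    print defined $pin ? "$pin\n" : "(skipped)\n";
}
```

With the non-capturing form, $1 and $2 are the only groups to keep track of, whether or not the "out" suffix is present.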

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      That is perfect. Thank you very much!
Re: Help with a regex
by choroba (Cardinal) on Jul 24, 2015 at 09:07 UTC
    To combine capturing regular expressions while still numbering their captures from $1 in every alternative, you can use the "branch reset" pattern (?|...) (Perl version 5.10+); see Extended Patterns in perlre.
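As a minimal sketch of branch reset applied to the two patterns from the question (the helper name `cell_and_pin` is made up for illustration): inside `(?|...)` each alternative restarts its group numbering, so both branches fill $1 and $2.

```perl
use strict;
use warnings;
use 5.010;    # branch reset (?|...) needs Perl 5.10 or later

my $pin_re = qr{
    ^ [ ]+ Pin: [ ]*
    (?|
        ([^.]*) \. ([^ ]*) $          # nothing after the pin name
      |
        ([^.]*) \. ([^ ]*) [ ] out    # pin name followed by " out"
    )
}x;

# Return (cell, pin) for an output-pin line, or an empty list otherwise.
sub cell_and_pin {
    my ($line) = @_;
    return $line =~ $pin_re ? ($1, $2) : ();
}

for my $line (" Pin: U343.IN1 in", " Pin: U713.INP out", " Pin: U714.QN") {
    my @pair = cell_and_pin($line);
    print @pair ? "$pair[0].$pair[1]\n" : "(no match)\n";
}
```

The shared prefix is factored out in front of the group here, but the two original patterns could equally be placed whole inside `(?|...)`.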
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Help with a regex
by Laurent_R (Canon) on Jul 24, 2015 at 09:02 UTC
    I dunno if this fits your case, but in such situations I often prefer to start by discarding the lines that I do not want, using for example two regexes:
    while (<DATA>) {
        next if /in\s*$/;
        print "$1.$2\n" if /^\s*Pin:\s+(\w+)\.(\w+)/;
    }
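One thing to watch with the filter-first approach: `/in\s*$/` also rejects an output pin whose name happens to end in "in". The sketch below is a variation, not the original code: it requires whitespace before the trailing "in", and wraps the loop in a hypothetical `output_pins` helper. The line `Pin: U900.cin` is invented test data to show the difference.

```perl
use strict;
use warnings;

# Return "CELL.PIN" strings for every line that is not flagged as an input.
sub output_pins {
    my (@lines) = @_;
    my @found;
    for my $line (@lines) {
        # Skip input pins: a separate trailing word "in", not a name ending in "in".
        next if $line =~ /\sin\s*$/;
        push @found, "$1.$2" if $line =~ /^\s*Pin:\s+(\w+)\.(\w+)/;
    }
    return @found;
}

my @lines = (
    " Pin: U343.IN1 in",
    " Pin: U713.INP out",
    " Pin: U714.QN",
    " Pin: U900.cin",      # invented example: output pin whose name ends in "in"
);
print "$_\n" for output_pins(@lines);
```

With the looser `/in\s*$/` test, the last line would be dropped along with the real input pin.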

Node Type: perlquestion [id://1136132]
Approved by Discipulus