PerlMonks  

broken regex

by elwood (Initiate)
on Sep 15, 2006 at 12:35 UTC ( [id://573107]=perlquestion )

elwood has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to delete words in $line that are in the @words array, but it appears the regex isn't working. The regex is supposed to check each individual word and, if it exists, delete it from $line.
foreach (@words) { $line =~ s/([%\d])$_([%\d])//g; print "$line\n"; }
Does anyone have any ideas? Thanks

Edit davorg: Added code tags

Replies are listed 'Best First'.
Re: broken regex
by davorg (Chancellor) on Sep 15, 2006 at 12:43 UTC

    (Please read Writeup Formatting Tips and format your posts so they are easier to read)

    It's hard (maybe impossible) to know what the correct regex is without knowing the format of the data that you are dealing with.

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

      The data in $line is a simple sentence.
      The @words array is a list of individual words to remove, one per entry. For example, $line might be "the cat sat on the mat"
      and @words might contain
      "on"
      "the"

        Ok. Now we're getting somewhere. In <code> tags, your code looks like this:

        foreach (@words) { $line =~ s/([%\d])$_([%\d])//g; print "$line\n"; }

        So the regex is saying this:

        Look for either a percent sign or a digit (which will be captured in $1), followed by the string in $_, followed by another percent sign or a digit (which will be captured in $2).

        And if all that is found then it's replaced by an empty string.

        So as your data doesn't appear to contain any digits or percent signs, then none of that is ever going to match. I think that what you actually wanted was far simpler.
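
        To see this for yourself, here is a small sketch that runs the posted pattern against the sample sentence and against a made-up string that does contain digits around the word. The string "cat 1on2 mat" is invented purely for illustration; it is not from the original data.

```perl
use strict;
use warnings;

# Sample sentence and word taken from the example earlier in the thread.
my $line = "the cat sat on the mat";
my $word = "on";

# The posted pattern: a percent sign or digit, then the word, then another
# percent sign or digit. The plain sentence has neither, so this fails.
my $plain_matches = ($line =~ /([%\d])$word([%\d])/) ? 1 : 0;

# A made-up string (an assumption, for illustration only) that does contain
# what the pattern asks for.
my $odd = "cat 1on2 mat";
my $odd_matches = ($odd =~ /([%\d])$word([%\d])/) ? 1 : 0;
my ($before, $after) = ($1, $2);    # the two captured characters

print "plain sentence matches: $plain_matches\n";
print "odd string matches: $odd_matches (\$1='$before', \$2='$after')\n";
```

        The first match fails and the second succeeds, capturing "1" and "2", which is the opposite of what you want for ordinary sentences.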

        foreach (@words) { $line =~ s/\b$_\b//g; print "$line\n"; }

        And I think you may want to move the print statement out of the loop - tho' it might be there for debugging purposes.

        See perlretut for a good introduction to regexes.
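
        Putting the pieces together, a minimal sketch using the sample data from this thread might look like the following. The \Q...\E quoting and the whitespace tidy-up at the end are additions of mine, assuming the words may contain regex metacharacters and that you don't want the leftover double spaces; drop them if that doesn't apply.

```perl
use strict;
use warnings;

# Sample data from the thread.
my $line  = "the cat sat on the mat";
my @words = ("on", "the");

foreach my $word (@words) {
    # \b anchors the match at word boundaries, so only whole words go.
    # \Q...\E quotes any regex metacharacters in the word (an added
    # precaution, not part of the original suggestion).
    $line =~ s/\b\Q$word\E\b//g;
}

# Optional tidy-up (an assumption about the desired output): collapse the
# runs of spaces left behind, then trim both ends.
$line =~ s/\s+/ /g;
$line =~ s/^\s+|\s+$//g;

# The print now sits outside the loop, so the result is shown once.
print "$line\n";    # prints "cat sat mat"
```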

        --
        <http://dave.org.uk>

        "The first rule of Perl club is you do not talk about Perl club."
        -- Chip Salzenberg

        Then you probably want your regex to be something more like s/\b$_\b//g which will match word boundaries. Just to make sure that there's nothing more that you need to do than is coming across in your message, what were you aiming to do with the %\d?

        Hays

Re: broken regex
by robartes (Priest) on Sep 15, 2006 at 12:48 UTC
    As the man said, we need your data to be able to tell something about this. From looking at your regex (mangled by the fact that it's not in code tags), it seems like it is not looking for a word, but for a word preceded and followed by a % or a digit. Is that what you intend to do?

    CU
    Robartes-

      sorry about the bad formatting, I'll try again
        foreach (@words) {
          $line =~ s/(%\d)$_(%\d)//g;
          print "$line\n";
        }
      

      It was my understanding that (%\d) detected a word boundary (space, beginning of line, end of line)
        That's why we're here. \b is the word boundary metacharacter. And thanks for the code tags! :)
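
        A quick sketch of the difference, in case it helps. The test strings here are made up for illustration. (%\d) is a capture group that matches a literal percent sign followed by one digit, while \b is a zero-width assertion that matches between a word character and a non-word character (or the start or end of the string).

```perl
use strict;
use warnings;

# (%\d) matches the two literal characters "%" and a digit...
my $literal = ("%7" =~ /^(%\d)$/) ? 1 : 0;

# ...and has nothing to do with spaces or line ends.
my $space = ("on the mat" =~ /(%\d)/) ? 1 : 0;

# \b matches at the edges of a word without consuming any characters.
my $whole    = ("on the mat"    =~ /\bthe\b/)  ? 1 : 0;   # "the" as a whole word
my $inside   = ("on theme park" =~ /\bthe\b/)  ? 1 : 0;   # "the" inside "theme"
my $at_start = ("the mat"       =~ /^\bthe\b/) ? 1 : 0;   # boundary at start of string
my $at_end   = ("on the"        =~ /\bthe\b$/) ? 1 : 0;   # boundary at end of string

print "literal=$literal space=$space whole=$whole ",
      "inside=$inside at_start=$at_start at_end=$at_end\n";
```

        Only the "theme" case fails for \b, because there is no boundary between the "e" and the "m".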

        Hays

        sorry about the bad formatting, I'll try again

        On PerlMonks, put <c>..</c> tags around your code. It's like <pre>..</pre>, but it handles escaping special characters for you (such as &, <, >, [ and ]). It also wraps long lines while offering a link to download the unwrapped version.

Node Type: perlquestion [id://573107]
Approved by robartes