Re: extracting words with certain characters

by Ratazong (Monsignor)
on Dec 04, 2012 at 10:47 UTC

in reply to extracting words with certain characters

The following code may get you started:

my $words = "y____x_z dddetr x_y erre yyy_";
while ($words =~ /(\S*_\S*)/g) {
    print "$1\n";
}
It looks for any word containing an underscore character and prints it. Thanks to the loop and the /g modifier, it finds all words containing an underscore, not just the first one.

I intentionally wrote "get you started", as the code makes some basic assumptions, e.g.:

  • the "instance names" are separated by whitespace (and not by commas ...)
  • more than one underscore is OK, and the underscores may follow each other
  • an instance-name consisting only of underscores is fine
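
If those assumptions do not hold for your data, here is a rough sketch of one way to tighten them. It assumes names may also be separated by commas and that a name made only of underscores should be skipped. The helper name names_with_underscore and the sample input are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: split on runs of whitespace and/or commas, then keep
# only the names that contain an underscore AND at least one character
# that is not an underscore.
sub names_with_underscore {
    my ($text) = @_;
    return grep { /_/ && /[^_]/ } split /[\s,]+/, $text;
}

# Made-up input mixing commas and spaces, plus an underscore-only name.
my @found = names_with_underscore("y____x_z, dddetr x_y,erre ___ yyy_");
print "$_\n" for @found;    # prints y____x_z, x_y and yyy_ (one per line)
```

Whether "___" should count as a name is your call; drop the /[^_]/ test to keep such names.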
This webpage may help you understand the regex.

HTH, Rata

Re^2: extracting words with certain characters
on Dec 04, 2012 at 11:00 UTC

    Note: this will match "words" containing anything that is not whitespace. This may or may not be what you want. For example, should this be two different words?


    If so, you could use a regex that matches a more traditional definition of a word:


    \w matches "word characters": alphanumeric characters plus underscores.
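
    As a rough sketch of the difference, assuming a comma-joined input (the sample string is made up for illustration), the two patterns behave like this:

```perl
use strict;
use warnings;

# Hypothetical input: two names joined by a comma with no whitespace,
# followed by a word without an underscore.
my $text = "foo_bar,baz_qux plain";

# \S matches any non-whitespace character, so the comma is swallowed.
my @non_space = $text =~ /(\S*_\S*)/g;    # ("foo_bar,baz_qux")

# \w matches only letters, digits and underscores, so the comma ends a word.
my @word_only = $text =~ /(\w*_\w*)/g;    # ("foo_bar", "baz_qux")

print "non-space: $_\n" for @non_space;
print "word-only: $_\n" for @word_only;
```

    In list context, a match with /g returns all captures at once, so no while loop is needed here.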

    When's the last time you used duct tape on a duct? --Larry Wall

