Re: extract text between slashes

Your current attempt is good, but the match is greedy: it looks for the longest possible match, not the shortest. Adding a ? after the .+ would work fine. To understand "greediness," check the perldocs for regular expressions: perlre

m{/(.*?)/}

A second issue arises if the string has empty slots between slashes, such as "%///US1252691001". You probably want to be able to return an empty result in this case, so I changed your use of .+ (one or more characters) to .* (zero or more). Otherwise, you might get a match back of "/" for strings like my example.
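To make the two points above concrete, here is a small sketch comparing the greedy .+, the non-greedy .+? and the non-greedy .*? on the sample string from the question and on the empty-slot string. The helper name first_between is made up for this illustration; it is not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return whatever the pattern
# captures in $1, or undef if the pattern does not match at all.
sub first_between {
    my ($str, $re) = @_;
    return $str =~ $re ? $1 : undef;
}

my @patterns = (
    [ '.+',  qr{/(.+)/}  ],   # greedy, one or more characters
    [ '.+?', qr{/(.+?)/} ],   # non-greedy, one or more characters
    [ '.*?', qr{/(.*?)/} ],   # non-greedy, zero or more characters
);

for my $str ('%/%/ISIN/US1252691001', '%///US1252691001') {
    for my $pair (@patterns) {
        my ($label, $re) = @$pair;
        my $got = first_between($str, $re);
        printf "%-22s %-3s => %s\n", $str, $label,
            defined $got ? "'$got'" : 'no match';
    }
}
```

On the first string the greedy pattern captures '%/ISIN' while both non-greedy patterns capture '%'. On the second string only .*? returns the empty slot; the two .+ variants are forced to swallow a slash and return '/'.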

Update: As others mentioned but I didn't parse correctly, getting the THIRD field (e.g., "~/~/THIS/~") takes a little more work. Instead of a bunch of complicated lookaheads and lookbehinds, or switching to split(), I would just match past the preceding field and capture the next one. This has the advantage that the pattern is easy to change to capture the other fields if the requirements change.

m{/.*?/(.*?)/}
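A quick sketch of that pattern in use, assuming the input looks like the examples in this thread. The wrapper name third_field is only for illustration. The last test string shows one limitation worth knowing: the pattern needs a slash after the wanted field, so a string that ends at the third field does not match.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern above: match past "/second/"
# and capture what follows, up to the next slash.
sub third_field {
    my ($str) = @_;
    return $str =~ m{/.*?/(.*?)/} ? $1 : undef;
}

for my $str ('~/~/THIS/~', '%/%/ISIN/US1252691001', '%/%/ISIN') {
    my $field = third_field($str);
    print "$str => ", (defined $field ? "'$field'" : 'no match'), "\n";
}
```

The first two strings give 'THIS' and 'ISIN'; the third gives no match because nothing closes the field.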

--
[ e d @ h a l l e y . c c ]


Replies are listed 'Best First'.
Re^2: extract text between slashes by johngg (Canon) on Oct 31, 2007 at 16:44 UTC
"Adding a ? after the .+ would work fine"

I might be missing something, but I don't think that will work as desired. It will return the first item between slashes, which is `%`, not `ISIN`. I think split might be better here. Something like

    my $str = '%/%/ISIN/US1252691001';
    my @elems = split m{/}, $str;
    my $isin = $elems[2];

Cheers,

JohnGG

Update: To do this without `split` you need a more complex regex using zero-width look-around assertions: an alternation of two look-behinds, and a look-ahead containing an alternation.

    my @elems = $str =~ m{(?(?<=\A)|(?<=/))(.*?)(?=/|\z)}g;
Re^2: extract text between slashes by EvanK (Chaplain) on Oct 31, 2007 at 17:00 UTC
Also, keep in mind that he wants the contents of the second pair of slashes. Assuming that the first one with the percent sign is static, `m{/\%/(.*?)/}` might work. Otherwise, he could grab all matches and filter out the wrong ones, or split the whole string beforehand:

    # method 1
    @matches = $string =~ m{/(.*?)/}g;
    # method 2
    @matches = split m{/}, $string;
    # print the one you want
    print $matches[1];

__________
Systems development is like banging your head against a wall... It's usually very painful, but if you're persistent, you'll get through it.
Re^3: extract text between slashes by johngg (Canon) on Oct 31, 2007 at 19:50 UTC
Unfortunately, your method 1 isn't going to do the trick, because the regex is going to consume `%/%/` when doing the first match. The next attempted match is left with `ISIN/US1252691001` to work with, so it fails.

    $ perl -le '
    > $string = q{%/%/ISIN/US1252691001};
    > @matches = $string =~ m{/(.*?)/}g;
    > print for @matches;'
    %
    $

Cheers,

JohnGG
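For what it's worth, one possible way to rescue method 1 is to stop consuming the closing slash, so it is still available as the opening slash of the next match. The sketch below does that with a look-ahead; the sub names are invented for the comparison, and it assumes the fields themselves never contain a slash.

```perl
use strict;
use warnings;

# Method 1 as posted: the closing slash is consumed, so the next
# match attempt starts after it.
sub consuming_matches {
    my ($str) = @_;
    my @found = $str =~ m{/(.*?)/}g;
    return @found;
}

# Variant (an assumption, not from the thread): assert the closing slash
# with a look-ahead so it can also open the next match.
sub lookahead_matches {
    my ($str) = @_;
    my @found = $str =~ m{/(.*?)(?=/)}g;
    return @found;
}

my $string = q{%/%/ISIN/US1252691001};
print "consuming: @{[ consuming_matches($string) ]}\n";
print "lookahead: @{[ lookahead_matches($string) ]}\n";
```

With the look-ahead the list comes back as `%` and `ISIN`, so `$matches[1]` would be the wanted field. The leading and trailing fields are still not returned, since neither sits between two slashes.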
Re^2: extract text between slashes by RaduH (Scribe) on Oct 31, 2007 at 17:21 UTC
I think we don't know enough about what he's looking for. It was said that he's looking for the text between the second pair of slashes. What if he is looking for the string between the last %/ and the very next / ? I think the input string is not described well enough in the original question. For all I know, he could be looking for %/%/ as a fixed token and want to suck out all of the following characters up to the next /, but this assumes all his input strings begin with %/%/ followed by what he needs to extract, which may not be a correct assumption.

