PerlMonks  

Regex substitution problem

by mahira (Acolyte)
on Feb 14, 2009 at 12:02 UTC

mahira has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have a problem with a substitution.

The text is:

From: aaa@bb.netxxxMessage-Id: xxx<200902131530.n1DFUZ5l001303@aaa.bb.net>xxxTo: xxx@yyy.netxxxSubject: Size atanan xxx numarali ticket hakkindaxxxContent-Type: text/plain; charset=ISO-8859-9xxxContent-Transfer-Encoding: 7bit

The substitution regex is:

$to_email =~ s/.+To:\s(.*)xxx/\1/;

The result is:

xxx@yyy.netxxxSubject: Size atanan xxx numarali ticket hakkindaxxxContent-Type: text/plain; charset=ISO-8859-9xxxContent-Transfer-Encoding: 7bit

The result should be:

xxx@yyy.net

Please help... Thanks in advance.

Replies are listed 'Best First'.
Re: Regex substitution problem
by almut (Canon) on Feb 14, 2009 at 13:05 UTC
    $to_email =~ s/.+To:\s(.+?)xxx.*/\1/;

    In addition to the non-greediness change suggested by jethro, you'd also need a trailing .* to match the remainder of the line, because only what matched will be substituted.

    Also, you'd need to take care that the 'xxx' you use to locate the end of the mail address is not itself part of the mail address. (I'm not sure how relevant that potential problem is here, so as a quick fix I simply changed .*? into .+?: the .+ consumes one of the x's, so the first occurrence of 'xxx' is no longer a candidate for the terminating 'xxx'.)
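
The substitution above can be sketched as a small standalone script. It assumes the text is a single line with no embedded newlines, reuses the variable name $to_email from the original post, and writes the replacement as $1 rather than \1 so that it runs cleanly under warnings:

```perl
use strict;
use warnings;

# Sample text from the original post ('xxx' stands in for the line endings).
my $to_email = 'From: aaa@bb.netxxxMessage-Id: xxx<200902131530.n1DFUZ5l001303@aaa.bb.net>xxxTo: xxx@yyy.netxxxSubject: Size atanan xxx numarali ticket hakkindaxxxContent-Type: text/plain; charset=ISO-8859-9xxxContent-Transfer-Encoding: 7bit';

# .+To:\s  everything up to and including "To: "
# (.+?)    capture as little as possible, but at least one character
# xxx.*    the terminating 'xxx' plus the rest of the line, so that the
#          remainder is part of the match and gets replaced as well
$to_email =~ s/.+To:\s(.+?)xxx.*/$1/;

print "$to_email\n";    # prints: xxx@yyy.net
```

Because the pattern now covers the whole string, the replacement leaves only the captured address behind.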

      This works! Thank you, and thanks to everyone who helped.

      PS: The xxx is just a placeholder; the real mail address will be different. It was my mistake to replace the line endings with xxx and then use the same placeholder in the e-mail address :)

Re: Regex substitution problem
by jethro (Monsignor) on Feb 14, 2009 at 12:51 UTC
    $to_email =~ s/.+To:\s(.*?)xxx/\1/;

    will do what you want. The * is greedy, which means it will take as much as it can. The '?' changes the * to a non-greedy version.
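
To see the difference in isolation, here is a minimal sketch that captures with both quantifiers. The sample string and the variable names are hypothetical, chosen only to keep the example short:

```perl
use strict;
use warnings;

# Shortened, hypothetical sample; 'xxx' marks where a line ending used to be.
my $text = 'To: user@example.netxxxSubject: helloxxxContent-Type: text/plain';

# Greedy: (.*) runs to the end of the string, then backs off only as far
# as the LAST 'xxx'.
my ($greedy) = $text =~ /To:\s(.*)xxx/;

# Non-greedy: (.*?) grows one character at a time and stops at the FIRST 'xxx'.
my ($lazy) = $text =~ /To:\s(.*?)xxx/;

print "greedy: $greedy\n";    # greedy: user@example.netxxxSubject: hello
print "lazy:   $lazy\n";      # lazy:   user@example.net
```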

      Thanks, but the result changes only slightly:

      xxx@yyy.netSubject: Size atanan xxx numarali ticket hakkindaxxxContent-Type: text/plain; charset=ISO-8859-9xxxContent-Transfer-Encoding: 7bit

      (the xxx near the e-mail address disappeared)

Re: Regex substitution problem
by ww (Archbishop) on Feb 14, 2009 at 13:20 UTC

    TIMTOWTDI (unnecessarily verbose, but perhaps easy to follow and illustrating some possibilities with lookaheads):

    #! /usr/bin/perl
    use strict;
    use warnings;

    my $text = 'From: aaa@bb.netxxxMessage-Id: xxx<200902131530.n1DFUZ5l001303@aaa.bb.net>xxxTo: xxx@yyy.netxxxSubject: Size atanan xxx numarali ticket hakkindaxxxContent-Type: text/plain; charset=ISO-8859-9xxxContent-Transfer-Encoding: 7bit';

    if ( $text =~ /(?=To:\s)(.+)(?=[^x]{3})/ ) {
        $text =~ /.+To:\s(.*?)(?=xxxSubject)/;
        my $to_email = $1;
        print $to_email . "\n";
    }
    else {
        print "no match\n";
    }

    Output:

    xxx@yyy.net

    as specified.

    Your OP would have been easier to parse had you included data and code inside <c>...</c> tags as done above.

Re: Regex substitution problem
by AnomalousMonk (Archbishop) on Feb 14, 2009 at 13:26 UTC
    In the replacement field of a substitution, the capture variable  $1 and not the backreference  \1 should be used.  use warnings; would have suggested this.
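
As a rough illustration of the distinction (the sample strings and variable names here are hypothetical): \1 belongs inside the pattern, where it refers back to what group 1 has already matched, while $1 is the variable to use in the replacement part.

```perl
use strict;
use warnings;

my $line = 'To: user@example.net';    # hypothetical sample

# Inside the pattern, \1 is a backreference: here it finds a repeated word.
my $repeated = 'hello hello world' =~ /\b(\w+)\s\1\b/ ? $1 : '';

# In the replacement part, refer to the captured text as $1.
# The (my $copy = ...) =~ s/// idiom leaves $line itself untouched.
(my $address = $line) =~ s/^To:\s(.*)$/$1/;

print "$repeated\n";    # prints: hello
print "$address\n";     # prints: user@example.net
```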

    Also, it is not clear if the text you give in your post is multi-line, i.e., if it has embedded newlines, or if it is a single string (which seems most likely). In the former case, use a /s regex modifier (see perlre section Modifiers) which allows the  . (dot) metacharacter to match newlines.
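
A small sketch of the multi-line case, assuming a hypothetical header block with real newlines instead of the 'xxx' placeholders:

```perl
use strict;
use warnings;

# Hypothetical header block with embedded newlines.
my $headers = "From: a\@example.net\nTo: user\@example.net\nSubject: hello\n";

# Without /s, '.' does not match "\n", so .+ cannot get from the
# From: line to the To: line and the match fails.
my $without_s = $headers =~ /^From.+To:\s(\S+)/ ? $1 : 'no match';

# With /s, '.' also matches "\n", so the same pattern succeeds.
my $with_s = $headers =~ /^From.+To:\s(\S+)/s ? $1 : 'no match';

print "without /s: $without_s\n";    # without /s: no match
print "with /s:    $with_s\n";       # with /s:    user@example.net
```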

Re: Regex substitution problem
by Anonymous Monk on Feb 15, 2009 at 03:13 UTC
