PerlMonks  

Parsing a Word placed between special characters

by pillaipraveen (Initiate)
on May 01, 2013 at 18:11 UTC (#1031624=perlquestion)
pillaipraveen has asked for the wisdom of the Perl Monks concerning the following question:

Thanks to everyone who responded. The query was in fact obscure since I didn't use the <code> tag. Please find the query in a proper format below.

I have a file called names.txt with the following content
john[JM]mcgroddy
stephen[SG]gomsey
yuri[YA]alchenko

I was trying to extract the initials placed between the special characters [ and ] and write them to a new file, initials.txt, as:

JM
SG
YA

I hope that I have clarified the question. Thanks once again to those who guided me to the correct regex. Regards, PS.

Replies are listed 'Best First'.
Re: Parsing a Word placed between special characters
by kennethk (Abbot) on May 01, 2013 at 18:19 UTC

    Welcome to the Monastery. In the future, please wrap input in <code> tags so it doesn't get mangled. Note how chunks of your post got linkified.

    The basics of your question are documented in Extracting matches in perlretut. The only trick, assuming you mean to be using [ and ] as delimiters, is that these characters have special meaning in regular expressions, and thus must be escaped. Your regular expression might look something like /\[(.*?)\]/.
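    As a minimal sketch of that suggestion, the snippet below applies the escaped-bracket pattern to the sample lines from the question. The helper name between_brackets is hypothetical and only serves the illustration; the non-greedy .*? stops at the first closing bracket.

```perl
use strict;
use warnings;

# Sample lines taken from the question.
my @lines = ( 'john[JM]mcgroddy', 'stephen[SG]gomsey', 'yuri[YA]alchenko' );

# Hypothetical helper: return the text between the first pair of
# square brackets, or undef when the line has no such pair.
sub between_brackets {
    my ($line) = @_;
    return $line =~ /\[(.*?)\]/ ? $1 : undef;
}

for my $line (@lines) {
    my $initials = between_brackets($line);
    print "$initials\n" if defined $initials;
}
```

    Checking the match before using $1 matters: after a failed match, $1 still holds whatever an earlier successful match captured.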


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Again assuming the explored string is in the $_ special variable, the words between square brackets can be retrieved as follows:

      $name = $1 if /\w+\[(\w+)\]\w+/;
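      A small sketch of how that one-liner could be used over several lines, assuming each line is aliased to $_ by a for loop. The sub name extract_names is made up for the example. Because the pattern asks for word characters on both sides of the brackets, a bare [XX] is skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the bracketed word from each argument.
sub extract_names {
    my @found;
    for (@_) {    # each argument is aliased to $_ in turn
        push @found, $1 if /\w+\[(\w+)\]\w+/;
    }
    return @found;
}

# Only the first string has word characters around the brackets.
print "$_\n" for extract_names( 'john[JM]mcgroddy', '[XX]', 'no brackets here' );
```

      With these three inputs, only JM is printed.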

Re: Parsing a Word placed between special characters
by davido (Archbishop) on May 01, 2013 at 18:37 UTC

    I couldn't tell from the formatting in your post whether your target strings look like abc\[def\]ghi, or if the backslashes were just your attempt to escape the bracket within the post. I'm going to assume that the backslash isn't intended to be part of the actual target strings, but if it is, you will just have to modify the regex by adding "\\" where appropriate.

    my @strings = (
        'test[abc]test',
        'test[cde]test',
        'test[ack]test',
    );

    my $regex = qr/
        \b\[     # Require a word boundary and match an open bracket.
        (\w+)    # Capture one or more "word" characters.
        \]\b     # Match a close bracket and require a word boundary.
    /x;

    foreach my $string ( @strings ) {
        if( my( $name ) = $string =~ $regex ) {
            print "$name\n";
        }
    }

    Spend a few minutes with perlrequick and perlretut.


    Dave

Re: Parsing a Word placed between special characters
by toolic (Bishop) on May 01, 2013 at 18:22 UTC
Re: Parsing a Word placed between special characters
by Laurent_R (Abbot) on May 01, 2013 at 18:52 UTC

    I guess the OP must have changed the original post, since I do not see any square brackets in it right now.

    The line now looks like this:

    test\abc\test

    As it is now, I would suggest the following (assuming the explored string is in the $_ special variable):

    $name = $1 if /test\\(\w+)\\test/;

    But I am not sure that the description of the original string is really adequate.

      If someone posts \[abc\] without <code> tags, everyone will see \abc\. What's unclear to me is if the backslashes are significant or not.


      Dave

        I agree, the input is unclear.
Re: Parsing a Word placed between special characters
by hdb (Prior) on May 02, 2013 at 13:35 UTC

    Having the advantage of the well-formatted question, I favor split over regex:

    use strict;
    use warnings;

    while (<DATA>) {
        print +( split /\]|\[/ )[1], "\n";
    }

    __DATA__
    john[JM]mcgroddy
    stephen[SG]gomsey
    yuri[YA]alchenko
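    The question asked for the result to go from names.txt into initials.txt, so here is a hedged sketch of the same split idea with file handles instead of DATA. The sub name write_initials is an assumption made for the example; the two file names come from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the text between [ and ] on each line of
# $in_path to $out_path, one result per line. Lines that yield no
# second field are skipped.
sub write_initials {
    my ( $in_path, $out_path ) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while ( my $line = <$in> ) {
        chomp $line;
        # Splitting on either bracket puts the bracketed text in field 1.
        my $initials = ( split /\]|\[/, $line )[1];
        print {$out} "$initials\n" if defined $initials && length $initials;
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}

# File names as given in the question; do nothing if the input is absent.
write_initials( 'names.txt', 'initials.txt' ) if -e 'names.txt';
```

    Note that split does not check the order of the brackets, so a line such as a]b would also produce a second field. If the input may contain such lines, a capturing regex is the stricter choice.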
