PerlMonks  

Substitution don't work with a special inputfile

by OldChamp (Acolyte)
on Aug 24, 2015 at 12:36 UTC (#1139672=perlquestion)

OldChamp has asked for the wisdom of the Perl Monks concerning the following question:

The following code works with a test input file, but not with the real input file:

# Usage: perl removeEK1.pl TestEK.txt > Out.txt
use strict;
use warnings;

my $regex = '\{\[%tqu.*]}';
my $subst = '';

while (<>) {
    my $line = $_;
    $line =~ s/$regex/$subst/gi;
    print $line;
}

That worked fine with the following file TestEK.txt:

[Event "?"]
[Site "?"]
[Date "1985.??.??"]
[Round "?"]
[White "Neuenschwander, Beat"]
[Black "?"]
[Result "1-0"]
[Annotator "Solution"]
[SetUp "1"]
[FEN "8/5ppk/8/3p2KP/3P2P1/8/8/8 w - - 0 1"]
[PlyCount "17"]
[Source "ChessCafe/CB"]
[SourceDate "2003.10.29"]
BlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBla
1. Ka6 ({Of course not} 1. b6 $2 Kb7 $11) 1... Kb8 (1... f4 2. b6 $18) {[%tqu "What is White's next move?","","",g3,"",0,b6,"misses the win:",0]} 2. g3 $1 13. g6 c3
BlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBla
1. Ka6 ({Of course not} 1. b6 $2 Kb7 $11) 1... Kb8 (1... f4 2. b6 $18) {[%tqu "What is White's next move?","","",g3,"",0,b6,"misses the win:",0]} 2. g3 $1 13. g6 c31. Ka6 ({Of course not} 1. b6 $2 Kb7 $11) 1... Kb8 (1... f4 2. b6 $18) {[%tqu "What is White's next move?","","",g3,"",0,b6,"misses the win:",0]} 2. g3 $1 13. g6 c31. Ka6 ({Of course not} 1. b6 $2 Kb7 $11) 1... Kb8 (1... f4 2. b6 $18) {[%tqu "What is White's next move?","","",g3,"",0,b6,"misses the win:",0]} 2. g3 $1 13. g6 c3
1. Ka6 ({Of course not} 1. b6 $2 Kb7 $11) 1... Kb8 (1... f4 2. b6 $18) {[%tqu "What is White's next move?","","",g3,"",0,b6,"misses the win:",0]} 2. g3 $1 13. g6 c3
BlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBla
BlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBla
BlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBla

I got the output I wanted, but when I tested it with a part of my real inputfile, it failed.

.......
[Event "?"]
[Site "?"]
[Date "1933.??.??"]
[Round "?"]
[White "Grigoriev, Nikolay"]
[Black "?"]
[Result "*"]
[Annotator "Solution"]
[SetUp "1"]
[FEN "k7/2p5/8/KP3p2/8/8/6P1/8 w - - 0 1"]
[PlyCount "13"]
[Source "ChessCafe/CB"]
[SourceDate "2003.10.29"]
1. Ka6 ({Of course not} 1. b6 $2 Kb7 $11) 1... Kb8 (1... f4 2. b6 $18) {[%tqu "Wha +t is White's next move?","","",g3,"",0,b6,"misses the win:",0]} 2. g3 $1 ({ +The hasty } 2. b6 $2 {misses the win:} Kc8 $1 {with the idea 3...cxb6.} 3. b7+ K +b8 4. g3 c5 5. Kb5 Kxb7 6. Kxc5 Kc7 7. Kd5 f4 $1 8. gxf4 Kd7 $11 {Black saves t +he game by seizing the opposition.}) 2... Ka8 ({Another defensive method also +does not help} 2... Kc8 3. Ka7 Kd8 4. Kb8 $1 {(an opposition!)} Kd7 5. Kb7 Kd8 +(5... Kd6 6. Kc8 $18) 6. Kc6 {(an outflanking!)} Kc8 7. Kd5 Kb7 8. Ke5 Kb6 9. Kx +f5 Kxb5 10. g4 c5 11. g5 c4 12. Ke4 $1 {(we shall see this method - an enticem +ent of the hostile king under a check - more than once in this book)} Kb4 13. + g6 c3 14. Kd3 $1 Kb3 15. g7 c2 16. g8=Q+) {[%tqu "What is White's next move? +","","", b6,"",0]} 3. b6 Kb8 { } 4. Kb5 $1 (4. b7 $2 c5 5. Kb5 Kxb7 $11) 4... Kb7 5. bxc7 Kxc7 {[%tqu + "What is White's next move?", "","",Kc5,"",0]} 6. Kc5 Kd7 {[%tqu "What is White's next move?","","", +Kd5, "This time White has seized the opposition, therefore the pawn sacrifi +ce 7... f4 is senseless.",0]} 7. Kd5 $18 {This time White has seized the oppos +ition, therefore the pawn sacrifice 7...f4 is senseless.} *
[Event "?"]
[Site "?"]
.......

My output file was now identical to the input file, so the search text was obviously not found and therefore not removed! I'm not able to spot what is going wrong. Perhaps someone has an idea? Help would be very much appreciated!

Replies are listed 'Best First'.
Re: Substitution don't work with a special inputfile
by Eily (Monsignor) on Aug 24, 2015 at 12:51 UTC

    Your second file seems to have line breaks ("\n") in the middle of the {[%tqu...]} strings, and the dot does not match this character unless you use the /s modifier. Try s/$regex/$subst/sgi; instead

    Edit: except .* with /s will be far too greedy. [^]]* or .*? would work better
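To make both points concrete, here is a minimal sketch on a made-up string (the sample text and the variable names are assumptions for illustration, not taken from the real file). It compares the original pattern without /s, the same pattern with /s, and the two suggested alternatives:

```perl
use strict;
use warnings;

# Hypothetical sample: the first {[%tqu ...]} block is broken across two
# lines, the second one sits on a single line.
my $text = qq({[%tqu "What is\nWhite's next move?",0]} 2. g3 {[%tqu "x",0]} 3. b6\n);

# Without /s the dot stops at "\n", so the first block never matches;
# only the single-line second block is removed.
(my $no_s = $text) =~ s/\{\[%tqu.*\]\}//g;

# With /s the greedy .* runs on to the LAST "]}" and also eats "2. g3".
(my $greedy = $text) =~ s/\{\[%tqu.*\]\}//gs;

# Non-greedy .*? stops at the first "]}" after each opening "{[%tqu".
(my $lazy = $text) =~ s/\{\[%tqu.*?\]\}//gs;

# A negated character class matches newlines too, so it needs no /s.
(my $class = $text) =~ s/\{\[%tqu[^\]]*\]\}//g;

print "no /s : $no_s";
print "greedy: $greedy";
print "lazy  : $lazy";
print "class : $class";
```

Note that the [^\]]* variant assumes no literal ] ever appears inside a block; if that can happen in the real data, .*? with /s is the safer choice.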

      Hi Eily, thank you for your help. I have modified the code as you suggested, but it still doesn't work.

        Yes, I forgot that you were reading line by line, so even after my modification this won't work. But as ww already told you, you have the answer to that question in your previous topic on the subject
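Why reading line by line defeats the pattern can be sketched as follows. The two-line sample and the in-memory filehandles are assumptions chosen so the example runs without a real file. Each read returns only half of the block, so the substitution never sees a complete {[%tqu ...]}; reading the whole input as one string (by undefining $/) avoids that:

```perl
use strict;
use warnings;

# Hypothetical input: the block opens on line 1 and closes on line 2.
my $data = qq(1... Kb8 {[%tqu "What is\nWhite's next move?",0]} 2. g3\n);

# Line by line: no single line contains both "{[%tqu" and "]}".
open my $by_line, '<', \$data or die "Cannot open in-memory file: $!";
my $line_result = '';
while (my $line = <$by_line>) {
    $line =~ s/\{\[%tqu.*?\]\}//gs;
    $line_result .= $line;
}
close $by_line;

# Slurp mode: with $/ undefined, one read returns the entire input.
open my $whole, '<', \$data or die "Cannot open in-memory file: $!";
my $slurp_result = do { local $/; <$whole> };
close $whole;
$slurp_result =~ s/\{\[%tqu.*?\]\}//gs;

print "line by line unchanged: ", ($line_result eq $data ? "yes" : "no"), "\n";
print "slurped: $slurp_result";
```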

Re: Substitution don't work with a special inputfile
by ww (Archbishop) on Aug 24, 2015 at 15:40 UTC

    Compare to your original post of this content Re^2: Substitution don't work ... and the answer there.

    Or, put another way, make sure you read, heed and understand answers before re-posting essentially the same question. Downvoted.

    Also, when following up on a prior thread, think carefully about whether the remaining question in your mind deserves a new thread (yes, sometimes it does, when it is buried deeply, say Re: 4 or deeper), and if it does, please cross-reference the original thread.

Re: Substitution don't work with a special inputfile
by Laurent_R (Canon) on Aug 24, 2015 at 14:22 UTC
    Do you want to remove everything in your input lines that comes after [%tqu.*] (including [%tqu)? Please tell us your desired result or show us part of the sample output that fits your needs.

    But if you want to remove everything from [%tqu to the end of the line, that will not work the same way with your second data sample that has additional line breaks, because data removed will go until the next line break.

      Hi Laurent, thank you for your help. I wish to delete everything between the two starting brackets and the two ending brackets, including the brackets themselves; that is, every string with this pattern

      {[%tqu "What is White's next move?","","",g3,"",0,b6,"misses the win:",0]}

      should be deleted. I have modified the code like Eily suggested and I have added the /s modifier to catch the line breaks, but it still doesn't work, the outputfile is the same as the inputfile. My real data file is far bigger than the second example file, but I think if the code works with this inputfile it will work with the original file too.

        Not sure if this is what you want as it also removes the newlines and re-inserts them before the 1. 2. etc

        #!perl
        # Usage: perl removeEK1.pl TestEK.txt > Out.txt
        use strict;
        use warnings;

        my $regex = '\{\[%tqu[^\]]*\]\}';
        my $line = do { local $/; <>; };   # slurp the whole input
        $line =~ s/\n/ /g;                 # join all lines
        $line =~ s/$regex//gi;             # remove the tqu blocks
        $line =~ s/ (\d+\. )/\n$1/g;       # new line before each move number
        print $line;
        poj
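Since the real file is said to be far bigger, a streaming alternative that also leaves the existing line breaks untouched might be worth considering. The following is only a sketch under assumptions: the helper name strip_tqu, the sample text, and the {[%other x]} block are hypothetical. It sets the input record separator $/ to "]}" so that each read ends exactly where a block can end, then removes a tqu block from the tail of the record:

```perl
use strict;
use warnings;

# Copy $in to $out, dropping every {[%tqu ...]} block.
# Reads one "]}"-terminated record at a time instead of the whole file.
sub strip_tqu {
    my ($in, $out) = @_;
    local $/ = ']}';    # a record now ends at each "]}"
    while (my $record = <$in>) {
        # Remove a tqu block that closes this record; leave the rest alone.
        $record =~ s/\{\[%tqu[^\]]*\]\}\z//;
        print {$out} $record;
    }
    return;
}

# Demonstration on a hypothetical in-memory sample instead of a real file.
my $sample = qq(1... Kb8 {[%tqu "What is\nWhite's next move?",0]} 2. g3\n{[%other x]} 3. b6\n);
open my $in, '<', \$sample or die "Cannot read sample: $!";
my $result = '';
open my $out, '>', \$result or die "Cannot write result: $!";
strip_tqu($in, $out);
close $in;
close $out;
print $result;
```

For real files the same helper could be called as strip_tqu(\*STDIN, \*STDOUT) with the input redirected on the command line. Like the [^\]]* pattern above, it assumes no ] occurs inside a tqu block.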
