PerlMonks  

Regex from a file

by gon770 (Novice)
on Sep 04, 2012 at 22:43 UTC (#991697=perlquestion)
gon770 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am trying to write a Perl script that will open two text files: A.txt for news articles and B.txt for curse words.

B.txt will have a bunch of curse words, one per line (separated by \n).

The script will read A.txt and B.txt, then search A.txt for any word listed in B.txt. If a word from B.txt is found, it will be replaced with XXXXX.

So far, I have:

use strict;
use warnings;
local $/;    # slurp mode: read the whole file in one go
open FH, '<', 'article.txt' or die $!;
my $string = <FH>;
close FH;

$string =~ s/bitch/XXXXX/g;
$string =~ s/shit/XXXXX/g;
$string =~ s/slut/XXXXX/g;
$string =~ s/fuck/XXXXX/g;
...

open FILE, ">filtered.txt" or die $!;
print FILE $string;
close FILE;

#EOF

Can someone enlighten me on how I can keep those curse words in a separate text file, rather than listing them all in regular expressions in the script? (The words are separated by newlines (\n).)


Thank you so much

Re: Regex from a file
by choroba (Abbot) on Sep 04, 2012 at 22:54 UTC
    You can build the regex from the lines read from the file:
    #!/usr/bin/perl
    use warnings;
    use strict;

    open my $REG, '<', 'curse.txt' or die $!;
    (my $regex = join q(), <$REG>) =~ s/\n/|/g;
    $regex =~ s/\|$//;

    open my $FLT, '>', 'filtered.txt' or die $!;
    open my $TXT, '<', 'article.txt' or die $!;
    while (<$TXT>) {
        s/$regex/XXX/ig;
        print {$FLT} $_;
    }
    close $FLT or die $!;
    Note: Make yourself familiar with the Scunthorpe problem: a plain substring match also censors innocent words that merely contain a listed word.
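
    One way to reduce that risk is to anchor the pattern at word boundaries and escape each word before joining. The following is only a minimal sketch, not the code from this reply: build_filter and censor are hypothetical helper names, and the word list is a mild stand-in for the lines of curse.txt. It assumes every listed word starts and ends with a word character, since \b behaves differently otherwise.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Build one case-insensitive regex from a list of words (hypothetical helper).
sub build_filter {
    my @words = @_;
    chomp @words;
    # Drop blank lines: an empty alternative would match at every position.
    @words = grep { length } @words;
    return unless @words;
    # quotemeta escapes regex metacharacters so each word matches literally.
    my $alternation = join '|', map { quotemeta($_) } @words;
    # \b on both sides keeps the match from firing inside longer words.
    return qr/\b(?:$alternation)\b/i;
}

# Replace every whole-word match with XXXXX (hypothetical helper).
sub censor {
    my ($text, $regex) = @_;
    return $text unless defined $regex;
    $text =~ s/$regex/XXXXX/g;
    return $text;
}

# Stand-in for the lines read from curse.txt, including a stray blank line.
my $regex = build_filter("darn\n", "heck\n", "\n");
print censor("Heck, the darn thing broke in Darnell's garage.\n", $regex);
# prints: XXXXX, the XXXXX thing broke in Darnell's garage.
```

    With the files from this thread, the list would instead come from the word file, for example build_filter(<$REG>) after opening curse.txt. Word boundaries do not catch deliberate misspellings, so this narrows the false positives rather than solving the filtering problem.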
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Regex from a file
by davido (Archbishop) on Sep 05, 2012 at 00:00 UTC

    You might suspect that such a tool has already been created; certainly you're not the first Perl user to have the need.

    Assuming you're actually trying to solve a problem rather than just have a little fun in a public forum, you should look at Regexp::Common::profanity, Regexp::Common::profanity_us, or Regexp::Profanity::US. Those solutions are further along the development path than yours.


    Dave
