Re^5: Reg Exp to handle variations in the matched pattern

I think this does what you want, except I've put in "<stuff>" where you had "$$". When I use Perl to tag text, I tend to put in HTML- or XML-like tags and then use an XML or HTML parser to extract a data structure to stick into a database or whatever.

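open(my $in, '<', 'sdnew02.txt') or die "Cannot open sdnew02.txt: $!";

while (<$in>) {
    s/(\s-)$/$1<stuff>/;   # append the marker after a trailing " -"
    s/(:)$/$1<stuff>/;     # append the marker after a trailing ":"
    print;
}
close($in);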

You seem to have gotten hung up on worrying about the returns or newlines, when you should have recognized that you needed the end-of-line anchor. The $ anchor matches at the very end of the string or just before a trailing newline, so you don't have to match the newline yourself. If you want to make the replacement more robust, you could allow arbitrary amounts of whitespace before and after the "-" or ":", as long as it comes before the $ anchor.
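Here is a rough sketch of what that more tolerant match might look like. The sub name tag_line_end and the sample strings are made up for illustration, and I'm assuming trailing blanks can simply be dropped. It uses [ \t]* rather than \s* after the punctuation so the match doesn't swallow the newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): append "<stuff>" after a
# trailing " -" or ":" even when spaces or tabs follow the punctuation.
sub tag_line_end {
    my ($line) = @_;
    # (?:\s-|:)  a hyphen preceded by whitespace, or a colon
    # [ \t]*     optional trailing blanks, dropped by the replacement
    # $          end of line (matches just before the trailing newline)
    $line =~ s/((?:\s-|:))[ \t]*$/$1<stuff>/;
    return $line;
}

print tag_line_end("Chapter one - \n");   # trailing space after the hyphen
print tag_line_end("Heading:\t\n");       # trailing tab after the colon
print tag_line_end("no change here\n");   # nothing to tag
```

If your file has Windows line endings you'd probably want to strip the "\r" first, since [ \t]* won't match it.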

From what you describe, Perl would probably do all the text munging you need. Databases are great for randomly accessing data based on whatever relationships you want to select on, but Perl is hard to beat for dismantling text. Most of what I use Perl for is taking apart text and sticking it into databases for other purposes. Friedl's book "Mastering Regular Expressions" is still a great place to start. There are probably free tutorials floating around the web, but MRE gives clear explanations and gets you up to speed fast.

Re^6: Reg Exp to handle variations in the matched pattern by markjrouse (Initiate) on Feb 23, 2012 at 10:48 UTC
Thanks for this. This is a great help. Do you happen to have an example of code that you would use to tag a text file? I like the idea of tagging with HTML/XML-style tags, but I don't have time to build something, so maybe I'll use Perl to convert this text file to a delimited file and use a db to extract text.
Re^7: Reg Exp to handle variations in the matched pattern by bitingduck (Chaplain) on Feb 23, 2012 at 17:03 UTC
That pretty much was code to tag a text file:

open(my $in, '<', 'sdnew02.txt') or die "Cannot open sdnew02.txt: $!";

while (<$in>) {
    s/(\s-)$/<tag>$1<\/tag>/g;
    s/(:)$/<tag>$1<\/tag>/g;
    print;
}

All I've done is wrap tags around the matched text and add the global modifier. Replace the search regex and the tags with whatever you want to tag. It won't quite work across your line breaks, but you can start from there.
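To make the "replace the regex and tags with whatever you want" part concrete, here is a minimal sketch with the pattern and tag name passed in as arguments. The sub name wrap_in_tag, the variable $trailing_mark, and the sample lines are all invented for illustration; treat it as a starting point, not a finished tagger.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every match of $pattern in <$tag>...</$tag>.
sub wrap_in_tag {
    my ($line, $pattern, $tag) = @_;
    $line =~ s{($pattern)}{<$tag>$1</$tag>}g;
    return $line;
}

# Example pattern (an assumption): a trailing " -" or a trailing ":".
my $trailing_mark = qr/\s-$|:$/;

print wrap_in_tag("Section one -\n", $trailing_mark, 'tag');
print wrap_in_tag("Title:\n",        $trailing_mark, 'tag');
```

Swapping in a different qr// pattern or tag name is then a one-line change, and the output is regular enough for an XML or HTML parser to pick apart later.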

