PerlMonks  

How do I write a regex which allows meta-quoting?

by WHolcomb (Initiate)
on Apr 13, 2000 at 19:03 UTC

WHolcomb has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a regular expression to help parse a configuration file. I want users of the file to be able to specify any character in their directives, including the characters that are special to the file format (comment (#), equivalence (=), etc.). I have been trying to write an appropriate regex with no success. I am new at this and I have tried:
s/([^(?:([^\\]|\A)\\(\\{2})*\#)]*)(.*)/$1/
which is meant to represent a # not preceded by an odd number of \'s (since \\# is the \ character metaquoted, followed by a comment), but that didn't work because [^...] only negates single characters and not sequences of characters. I then tried the Perl 5.005 negative lookbehind (?<!), but it only allows fixed-width lookbehinds and I want to allow any number of \'s. Currently I am doing:
    split /\Q#\E/;
    $_ = $_[0];
    if(/\A\s*\Z/) { next; }
    $string = $_;
    for($i = 1; $i <= $#_; $i++) {
      $_ = $_[$i - 1];
      m/(.)((\\){2})*\Z/;
      if("$1" eq "\\") {
        $string .= "\#" . $_[$i];
      } else {
        last;
      }
    }
Can anyone suggest a regex to do all that work? I remember seeing one to correctly parse a C string somewhere which would deal with these same issues, but hard as I look I cannot find it.
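
For reference, the C-string idiom alluded to here is usually written along the following lines. This is a minimal sketch, not a quotation of any particular source; the helper name `c_string_body` is hypothetical and exists only for illustration. The key idea is that a backslash always consumes the character after it, so the regex never has to count backslashes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): return the body of the first
# double-quoted string in $text, or undef if there is none.
sub c_string_body {
    my ($text) = @_;
    # Opening quote, then any mix of ordinary characters ([^"\\]) and
    # backslash-plus-one-character pairs (\\.), then the closing quote.
    # Each backslash consumes the character after it, so \" never ends
    # the string but \\" does.
    return $text =~ /"((?:[^"\\]|\\.)*)"/s ? $1 : undef;
}

# The escaped quotes stay inside the captured body.
print c_string_body('x = "a \"b\" c";'), "\n";
```

The two alternatives inside the group are mutually exclusive, so the pattern scans left to right without ambiguous backtracking.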

Will

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: How do I write a regex which allows meta-quoting?
by WHolcomb (Initiate) on Apr 13, 2000 at 19:39 UTC
    Quite nearly there. All that is left is that things in brackets, like array subscripts, are made into links to other nodes. That ought to be fixable by replacing them with the HTML codes, which I don't know off the top of my head. Ahh, they are &#91; -> [ and &#93; -> ]

    To the monks who maintain this monastery, I might suggest that they have the node linking ignore []'s inside <pre>'s.
    s/(^[(?:([^\\|\A)\\(\\{2})*\#)]*)(.*)/$1/
    
    (?<!)
    
    $c = "\#";
    $m = "\\";
    
    while(<IN>) {
      chomp;
      split /\Q$c\E/;
      $_ = $_[0];
      next if(/\A\s*\Z/);
      $string = $_;
      for($i = 1; $i <= $#_; $i++) {
       $_ = $_[$i - 1];
       m/(.)((\Q$m\E){2})*\Z/;
       if("$1" eq "$m") {
         $string .= "$c" . $_[$i];
       } else {
         last;
       }
      }
    }
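
One way the loop above might be collapsed into a single regex is to scan from the start of the line and let every backslash consume the character that follows it. The scan then stops at the first bare # without needing a variable-width lookbehind. The following is a sketch under that assumption, not a tested replacement for the poster's parser; the helper names `strip_comment` and `unquote` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the part of $line before the first
# unescaped '#'. '\#' is a literal '#'; '\\#' is a literal '\'
# followed by the start of a comment.
sub strip_comment {
    my ($line) = @_;
    # Anchored at the start: consume ordinary characters or
    # backslash-plus-character pairs, stopping at the first bare '#'.
    # A lone backslash at the very end of the line is left unmatched
    # and is therefore dropped along with any comment.
    my ($kept) = $line =~ /\A((?:[^\\#]|\\.)*)/s;
    return $kept;
}

# Hypothetical helper: remove one level of backslash quoting.
sub unquote {
    my ($s) = @_;
    $s =~ s/\\(.)/$1/gs;
    return $s;
}

# Sample lines, invented for illustration only.
for my $line ('name = value # trailing comment',
              'colour = \#ff0000',
              '# nothing but a comment') {
    my $kept = strip_comment($line);
    next if $kept =~ /\A\s*\z/;    # skip blank and comment-only lines
    print unquote($kept), "\n";
}
```

The same pattern can be adapted to other special characters such as = by adding them to the negated class, since the backslash handling does not depend on which character is being protected.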
    
