PerlMonks  

Substituting with match containing newline and space characters

by MorayJ (Beadle)
on Mar 30, 2015 at 12:19 UTC ([id://1121812]=perlquestion)

MorayJ has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm trying to do a substitution in a file.

I am trying to match:

<input type="checkbox" ng-model="data.mydebts.loan[0]"

I then want to switch it out so it says

<input type="checkbox" ng-model="data.mydebts.loan[0]" label="loan"

I am trying the following code:

while ($my_file =~ /(input type=\"checkbox\".*?(\w+)\[0\])/sg) {
    $my_file =~ s/$1/$1 label="$2"/s;
}

But, although the match works when I print it out ($1 prints the line I want to match), it doesn't work when I use it in the substitution.

It still fails even if I swap the replacement for a literal string, like

s /$1/TEST/s;

I think this is because the match ($1) contains a newline and some spaces.

If that's the case, how do I tell it to use exactly what's in the match in the substitution (and why doesn't it do that anyway)?

Or have I probably made a mistake somewhere?

Thanks for your help

MorayJ
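A likely explanation, offered as a sketch and not a definitive diagnosis: when `$1` is interpolated into a pattern, its contents are compiled as a regex, not compared as literal text. Without /x, the newline and spaces match themselves, but `[0]` becomes a character class that matches the single character `0`, so the pattern no longer matches the text it was captured from. The sample string and the helper names `matches_as_regex` and `matches_literally` below are hypothetical, made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the file contents.
my $text = qq{<input type="checkbox"\n    ng-model="data.mydebts.loan[0]" />\n};

# Returns 1 if $needle, compiled as a regex, matches $haystack.
sub matches_as_regex {
    my ($haystack, $needle) = @_;
    return $haystack =~ /$needle/ ? 1 : 0;
}

# Same check, but \Q...\E escapes every regex metacharacter in $needle.
sub matches_literally {
    my ($haystack, $needle) = @_;
    return $haystack =~ /\Q$needle\E/ ? 1 : 0;
}

# Capture the same span the original question captures in $1.
my ($captured) = $text =~ /(input type="checkbox".*?\w+\[0\])/s;

print "as regex:  ", matches_as_regex($text, $captured), "\n";
print "literally: ", matches_literally($text, $captured), "\n";
```

The first check fails because the interpolated `[0]` looks for a `0` right after `loan`, where the text has `[`. The second succeeds because `\Q` quotes the brackets and dots.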

Replies are listed 'Best First'.
Re: Substituting with match containing newline and space characters
by choroba (Cardinal) on Mar 30, 2015 at 13:12 UTC
    You aren't trying to parse HTML with regular expressions, are you?

    Use a proper HTML handling module. For example, XML::XSH2, a wrapper around XML::LibXML:

    open :F html file.html ;
    for //input[@type="checkbox"] {
        $attr = @*[xsh:matches(.,'\[0\]$')] ;
        $name = xsh:match($attr, '.*\.(.*)\[0\]$') ;
        if $name set @label $name ;
    }
    save :F html :b ;
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Unfortunately, yes...I just wanted to be quick and dirty and all I've achieved so far is dirty. Thanks for the example...may have to turn this ship around.
Re: Substituting with match containing newline and space characters
by kroach (Pilgrim) on Mar 30, 2015 at 13:18 UTC

    Using a single regex to substitute the match seems to work:

    $my_file =~ s/(input type=\"checkbox\".*?(\w+)\[0\]")/$1 label="$2"/gs;

    The only modification I made to your original expression is adding " at the end of the outermost capture group.
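Because of the /g flag, a single s/// already visits every match in the string, so no surrounding while loop is needed. A minimal sketch on made-up sample data follows; the helper name `add_labels` and the second `card[0]` input are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: appends label="NAME" after every checkbox input
# whose ng-model attribute ends in NAME[0].
sub add_labels {
    my ($html) = @_;
    # /s lets .*? cross the newline inside the tag; /g repeats the
    # substitution for every match in the string.
    $html =~ s/(input type="checkbox".*?(\w+)\[0\]")/$1 label="$2"/gs;
    return $html;
}

# Made-up sample: one tag split across two lines, one on a single line.
my $sample = join "\n",
    '<input type="checkbox"',
    '    ng-model="data.mydebts.loan[0]" />',
    '<input type="checkbox" ng-model="data.mydebts.card[0]" />',
    '';

print add_labels($sample);
```

One caveat: with /s, the lazy `.*?` can run past a checkbox that has no `[0]` model and into a later element, which is part of why a parser is the safer choice.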

      Thanks, I need to go through the file and that didn't work for me in the 'while'...

      while ($my_file =~ /(input type=\"checkbox\".*?\.(\w+)\[0\]\")/sg) {
          my $sub  = $1;
          my $sub1 = $2;
          $my_file =~ s/\Q$sub\E/\Q$sub\E id=\"$sub1\"/;
      }

      This gets the substitution working, but the output is now littered with escape characters...may go down the parser route, but good to know about the escaping.

      Thanks for your help
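The stray backslashes most likely come from the `\Q...\E` on the replacement side: there it is applied to an ordinary interpolated string, so the escapes end up in the output. Quoting is only needed where the text is used as a pattern. Separately, perlop notes that modifying the target string resets the //g search position, so changing `$my_file` inside a while loop over its own matches is fragile. A sketch that collects the matches first and substitutes afterwards; the helper name `add_ids` and the `(?! id=)` guard are assumptions, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: adds id="NAME" after each matching ng-model attribute.
sub add_ids {
    my ($html) = @_;

    # Collect (whole match, name) pairs first, so the string is not
    # modified while //g is still iterating over it.
    my @pairs;
    while ($html =~ /(input type="checkbox".*?\.(\w+)\[0\]")/sg) {
        push @pairs, [$1, $2];
    }

    for my $pair (@pairs) {
        my ($found, $name) = @$pair;
        # \Q...\E escapes metacharacters on the pattern side only; the
        # replacement side is a plain interpolated string. The lookahead
        # skips occurrences that were already given an id.
        $html =~ s/\Q$found\E(?! id=)/$found id="$name"/;
    }
    return $html;
}

# Made-up sample with the attribute on its own line.
my $sample = qq{<input type="checkbox"\n  ng-model="data.mydebts.loan[0]" />\n};
print add_ids($sample);
```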

        You really should use an HTML parser to handle this job, but the following regular expression and substitution "trick" might just be robust enough to handle your data:

        use strict;
        use warnings;

        my $data = do { local $/; <DATA> };

        $data =~ s/
            (<input.*?ng-model=")
            ([^[]+)
            (\[\d+\]")
            (.*?)
        /
            $1 . $2 . $3 . qq( label="@{[ (split '\.', $2)[-1] ]}") . $4
        /exgs
        ;

        print $data;

        __DATA__
        <input type="checkbox"
            ng-model="data.mydebts.loan[0]"

        <input type="checkbox"
            ng-model="foo.bar.baz.qux[2]" />

        <input type="checkbox"
            ng-model="foo[300]" >

        Results:

        <input type="checkbox"
            ng-model="data.mydebts.loan[0]" label="loan"
        
        <input type="checkbox"
            ng-model="foo.bar.baz.qux[2]" label="qux" />
        
        <input type="checkbox"
            ng-model="foo[300]" label="foo" >
        

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)
        

Approved by Athanasius